Kevin Mader
April 13, 2017
For students wishing to use R or RStudio (recommended), this and the next set of exercises can also be done with these tools.
A starting script for reading in data, finding nearest neighbors, and plotting the results is available here (https://gist.github.com/kmader/9949532) and many of the other functions / examples are available in the course handout itself.
In addition, many of the more advanced topics like territory, Delaunay neighbors, and distribution are only available as R code, since no implementation of them existed in KNIME.
- Follow the instructions on the [wiki](https://github.com/kmader/Quantitative-Big-Imaging-2015/wiki/KNIME-Setup#configure-path-to-r-within-knime) to get R set up correctly within KNIME
- Within R (start it using the R command), install the plyr, Grammar of Graphics (ggplot2), and 'Delaunay Triangulation and Voronoi Tessellation' (deldir) packages:
install.packages(c("plyr","ggplot2","deldir"))
You might need to restart KNIME afterwards.
- CSV Reader
  - Reads in CSV files as a table
  - Select the file to load using the browse button
  - The defaults are fine for our files, except uncheck 'Has Row Header', since otherwise the first column is read as the RowId
  - For reading XLS or other files, use the other readers in the same folder
  - It can be combined with the CSV Writer to save and then load results from the 'Segment Features' or other shape analysis
    - Ideally they (Segment Features and Row Filter in the image below) are directly connected, but for testing it can be much quicker to write a CSV file out and then read it in another workflow
- R Snippet (Table)
  - The R Snippet node is used to operate on a table using R; in the case of the R Nearest Neighbor workflow, it calculates the nearest neighbor for each point
  - The node takes in one standard KNIME Data Table and outputs one standard KNIME Data Table.
  - Inside the node, the data table is converted to an R data frame called `knime.in`, which can be operated on using normal R commands
  - Configure
    - The 'Eval Script' button can be used to run the script and print the output to the window below
- R View (Table)
  - The R View is for plotting a KNIME Data Table using R
  - It takes a standard Data Table as an input and outputs a special KNIME Image Node (not the same as a KNIME Image Processing Image, but it can be converted; see the GenerateImage metanode in the VoronoiTesselation workflow)
- The workflows (or their starts) are available here.
Given our definition of nearest neighbors from the lecture, think about how you would implement it yourself (just conceptually: for each point ...)
Now try to understand what the above workflow does. Particularly important is the GroupBy Node. To fully understand this node, you need to look at both the main panel (Groups) and the Manual Aggregation panel.
- How would choosing the other values for 'Aggregation' change the result?
Each of these samples was measured from a different treatment group. Calculate general statistics about the samples and try to determine what transformation was applied to the normal sample (sample-a) to get to this point. Comparing histograms of nearest neighbor positions might be a good starting point.
There are two workflows available. The first is 'Nearest Neighbors', which is entirely in KNIME (and slow). The second is 'R Nearest Neighbors', which uses R to calculate the nearest neighbor distances (faster).
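A rough sketch of what the R step in such a workflow might contain is shown below. The column names `x` and `y` and the stand-in `knime.in` data frame are assumptions made so the code runs outside KNIME; inside an R Snippet node the input table is supplied by KNIME, and the result is normally handed back by assigning it to `knime.out`. This is not the script from the course workflow, only a brute-force illustration of the idea.

```r
# Stand-in for the table KNIME would provide (assumed columns: x, y)
set.seed(1)
knime.in <- data.frame(x = runif(50), y = runif(50))

# Pairwise Euclidean distances between all points (brute force, O(n^2))
d <- as.matrix(dist(knime.in[, c("x", "y")]))
diag(d) <- Inf  # a point must not be its own nearest neighbor

# For each point: index of, and distance to, the closest other point
knime.out <- knime.in
knime.out$nn_index <- unname(apply(d, 1, which.min))
knime.out$nn_dist  <- unname(apply(d, 1, min))

head(knime.out)
```

Because `dist` accepts any number of numeric columns, the same lines should also work for 3D positions if a `z` column is added to the selection.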
The following datasets should be loaded one at a time (you can also load them using a loop if you use the 'Table Row To Variable Loop Start' and flow variables to change the name of the file in the 'CSV Reader'; see Point-based/Delaunay Triangulation for an example of how to do this).
- [sample-a.csv download](07-files/sample-a.csv?raw=true)
- [sample-b.csv download](07-files/sample-b.csv?raw=true)
- [sample-c.csv download](07-files/sample-c.csv?raw=true)
- [sample-d.csv download](07-files/sample-d.csv?raw=true)
- Modify the workflow to exclude all neighbors which are further than 0.05 away and replace the distance with 0.05 (Row Filter node?)
- How would you find the nearest 5 neighbors instead of just the nearest? (Explanation or code)
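One possible way to approach the last two questions in R, rather than with KNIME nodes, is sketched below. The point data are simulated, the helper name `k_nearest_dist` is hypothetical, and capping with `pmin` is only one reading of "replace the distance with 0.05"; the cutoff itself is the value from the exercise.

```r
# Simulated stand-in for one of the sample files (assumed columns: x, y)
set.seed(2)
pts <- data.frame(x = runif(100), y = runif(100))

d <- as.matrix(dist(pts))
diag(d) <- Inf  # ignore self-distances

# Hypothetical helper: distances to the k nearest neighbors of every point,
# returned as an n-by-k matrix with each row sorted from nearest to farthest
k_nearest_dist <- function(dmat, k) {
  res <- apply(dmat, 1, function(row) sort(row)[seq_len(k)])
  matrix(res, ncol = k, byrow = TRUE)
}

nn5 <- k_nearest_dist(d, 5)

# Cap the nearest-neighbor distance at 0.05
nn_capped <- pmin(nn5[, 1], 0.05)

summary(nn_capped)
```

Sorting every row is simple but wasteful for large tables; a spatial index would scale better, at the cost of an extra package.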
These layers are distinguished by Volume
- [simple_layer.csv download](07-files/simple_layer.csv?raw=true)
- [tilted_layer.csv download](07-files/tilted_layer.csv?raw=true)
- Files have prefix spacing_
- Since these layers are distinguished by the spacing between points, you should perform a K-Means clustering.
- The number of groups and values used to calculate the K-means can be adjusted in the panel
- Additional math blocks can be used to rescale other values (x and y, for example) for inclusion
- From the K-means classification how can you count the layers automatically?
- What benefit might adding the position (x,y) into the results have?
- Would using N-nearest neighbors instead of nearest neighbor improve the results? If so, for which kind of samples, and why?
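A minimal sketch of the clustering step in R follows, assuming two layers that differ only in point spacing. The data are simulated (a tightly spaced grid below a loosely spaced one, with a little jitter), the helper `make_layer` is hypothetical, and calling `kmeans` with two centers is an assumption rather than the setting used in the KNIME workflow. Counting runs of equal labels along `y` is just one conceivable way to count layers, and it presumes the layers are stacked along that axis.

```r
set.seed(3)

# Hypothetical helper: a jittered regular grid with the given spacing
make_layer <- function(spacing, y_from, y_to) {
  g <- expand.grid(x = seq(0, 1, by = spacing),
                   y = seq(y_from, y_to, by = spacing))
  g$x <- g$x + runif(nrow(g), -0.002, 0.002)
  g$y <- g$y + runif(nrow(g), -0.002, 0.002)
  g
}
pts <- rbind(make_layer(0.02, 0.00, 0.40),   # dense layer
             make_layer(0.10, 0.55, 0.95))   # sparse layer

# Nearest-neighbor distance for every point
d <- as.matrix(dist(pts[, c("x", "y")]))
diag(d) <- Inf
pts$nn_dist <- apply(d, 1, min)

# K-means on the spacing alone; nstart repeats the random initialisation
km <- kmeans(pts$nn_dist, centers = 2, nstart = 10)
pts$cluster <- km$cluster

# Count layers: order points by y and count runs of equal cluster label
runs <- rle(pts$cluster[order(pts$y)])
length(runs$values)
```

For a tilted sample the ordering by `y` would interleave the layers, so the run count would no longer equal the number of layers; ordering along the layer normal would be needed instead.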
The goal of this task is to run a Voronoi tessellation on the image to fill in the surrounding areas.
The results of the workflow should look like this
- How can you calculate territory from this Voronoi tessellation?
- Create a histogram of the local density.
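One way to approach both points in R is sketched below using the deldir package installed earlier. The positions are simulated inside the unit square, and taking territory to be the area of each Voronoi tile and local density to be its reciprocal is an assumption about the definitions intended in the lecture.

```r
library(deldir)

# Simulated points in the unit square (stand-in for real object positions)
set.seed(4)
pts <- data.frame(x = runif(200), y = runif(200))

# Voronoi (Dirichlet) tessellation, clipped to the window
# rw = c(xmin, xmax, ymin, ymax)
tess <- deldir(pts$x, pts$y, rw = c(0, 1, 0, 1))

# Territory: area of the tile surrounding each point
pts$territory <- tess$summary$dir.area

# Local density: points per unit area, the reciprocal of the territory
pts$density <- 1 / pts$territory

hist(pts$density, breaks = 30,
     main = "Local density", xlab = "1 / territory")
```

Tiles touching the border are cut off by the window, so their territories are smaller than they would be in an unbounded sample; it may be worth excluding them before interpreting the histogram.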
Use the analysis saved here, located in the course directory, along with the sample script to classify cells into groups. Right now we use only nearest neighbor distance and area; how does adding additional parameters change the result?
3D data analyzed with KNIME or in ImageJ using the 3D Object Analyzer function can also be analyzed using similar workflows. Can you modify the workflow so it works for 3D positions (x,y,z) as well (in KNIME or, for ambitious students, in R)?
- Which of the two samples is more aligned? Why?
- Can you find a method for calculating the alignment that is more appropriate for the second sample?