Search API
This document was generated from 'src/documentation/print-search-wiki.ts' on 2025-02-24, 06:29:08 UTC presenting an overview of flowR's search API (v2.2.10, using R v4.4.0). Please do not edit this file/wiki page directly.
This page briefly summarizes flowR's search API, which provides a set of functions to search for nodes in the Dataflow Graph and the
Normalized AST of the given R code (the search always considers both, with respect to your search query).
Please see the Interface wiki page for more information on how to access this API.
Within code, you can execute a search using the runSearch function.
For an initial motivation, let's have a look at the following example:
Q.var("x")
Search Visualization
flowchart LR
0("<b>get</b>(filter: #123;#34;name#34;#58;#34;x#34;#125;)<br/>_generator_")
In the code:
x <- x * x
JSON Representation
{
"generator": {
"type": "generator",
"name": "get",
"args": {
"filter": {
"name": "x"
}
}
},
"search": []
}
Show Results
The query returns the following vertices (all references to x in the code):
0 ('x') at L1.1, 1 ('x') at L1.6, 2 ('x') at L1.10
The search required 20.65 ms (including parsing and normalization and the query) within the generation environment.
The returned results are highlighted thick and blue within the dataflow graph:
flowchart LR
1(["`#91;RSymbol#93; x
(1)
*1.6*`"])
style 1 stroke:teal,stroke-width:7px,stroke-opacity:.8;
2(["`#91;RSymbol#93; x
(2)
*1.10*`"])
style 2 stroke:teal,stroke-width:7px,stroke-opacity:.8;
3[["`#91;RBinaryOp#93; #42;
(3)
*1.6-10*
(1, 2)`"]]
0["`#91;RSymbol#93; x
(0)
*1.1*`"]
style 0 stroke:teal,stroke-width:7px,stroke-opacity:.8;
4[["`#91;RBinaryOp#93; #60;#45;
(4)
*1.1-10*
(0, 3)`"]]
3 -->|"reads, argument"| 1
3 -->|"reads, argument"| 2
0 -->|"defined-by"| 3
0 -->|"defined-by"| 4
4 -->|"argument"| 3
4 -->|"returns, argument"| 0
(The analysis required 8.59 ms (including parse and normalize, using the r-shell engine) within the generation environment.)
This returns all references to the variable x in the code.
However, the search API is not limited to simple variable references and can do much more.
For example, let's retrieve every definition of x in the code except the first one:
Q.var("x").filter(VertexType.VariableDefinition).skip(1)
Search Visualization
flowchart LR
0("<b>get</b>(filter: #123;#34;name#34;#58;#34;x#34;#125;)<br/>_generator_") --> 1["<b>filter</b>(filter: #34;variable#45;definition#34;)<br/>_transformer_"] --> 2["<b>skip</b>(count: 1)<br/>_transformer_"]
In the code:
x <- x * x
print(x)
x <- y <- 3
print(x)
x <- 2
JSON Representation
{
"generator": {
"type": "generator",
"name": "get",
"args": {
"filter": {
"name": "x"
}
}
},
"search": [
{
"type": "transformer",
"name": "filter",
"args": {
"filter": "variable-definition"
}
},
{
"type": "transformer",
"name": "skip",
"args": {
"count": 1
}
}
]
}
Show Results
The query returns the following vertices (every definition of x in the code except the first one):
9 ('x') at L3.1, 18 ('x') at L5.1
The search required 21.32 ms (including parsing and normalization and the query) within the generation environment.
The returned results are highlighted thick and blue within the dataflow graph:
flowchart LR
1(["`#91;RSymbol#93; x
(1)
*1.6*`"])
2(["`#91;RSymbol#93; x
(2)
*1.10*`"])
3[["`#91;RBinaryOp#93; #42;
(3)
*1.6-10*
(1, 2)`"]]
0["`#91;RSymbol#93; x
(0)
*1.1*`"]
4[["`#91;RBinaryOp#93; #60;#45;
(4)
*1.1-10*
(0, 3)`"]]
6(["`#91;RSymbol#93; x
(6)
*2.7*`"])
8[["`#91;RFunctionCall#93; print
(8)
*2.1-8*
(6)`"]]
11{{"`#91;RNumber#93; 3
(11)
*3.11*`"}}
10["`#91;RSymbol#93; y
(10)
*3.6*`"]
12[["`#91;RBinaryOp#93; #60;#45;
(12)
*3.6-11*
(10, 11)`"]]
9["`#91;RSymbol#93; x
(9)
*3.1*`"]
style 9 stroke:teal,stroke-width:7px,stroke-opacity:.8;
13[["`#91;RBinaryOp#93; #60;#45;
(13)
*3.1-11*
(9, 12)`"]]
15(["`#91;RSymbol#93; x
(15)
*4.7*`"])
17[["`#91;RFunctionCall#93; print
(17)
*4.1-8*
(15)`"]]
19{{"`#91;RNumber#93; 2
(19)
*5.6*`"}}
18["`#91;RSymbol#93; x
(18)
*5.1*`"]
style 18 stroke:teal,stroke-width:7px,stroke-opacity:.8;
20[["`#91;RBinaryOp#93; #60;#45;
(20)
*5.1-6*
(18, 19)`"]]
3 -->|"reads, argument"| 1
3 -->|"reads, argument"| 2
0 -->|"defined-by"| 3
0 -->|"defined-by"| 4
4 -->|"argument"| 3
4 -->|"returns, argument"| 0
6 -->|"reads"| 0
8 -->|"reads, returns, argument"| 6
10 -->|"defined-by"| 11
10 -->|"defined-by"| 12
12 -->|"argument"| 11
12 -->|"returns, argument"| 10
9 -->|"defined-by"| 12
9 -->|"defined-by"| 13
13 -->|"argument"| 12
13 -->|"returns, argument"| 9
15 -->|"reads"| 9
17 -->|"reads, returns, argument"| 15
18 -->|"defined-by"| 19
18 -->|"defined-by"| 20
20 -->|"argument"| 19
20 -->|"returns, argument"| 18
(The analysis required 12.31 ms (including parse and normalize, using the r-shell engine) within the generation environment.)
In summary, every search has two parts: it is initialized with a generator (such as Q.var('x'))
and can be further refined with transformers or modifiers.
Such queries can be constructed starting from the Q object (backed by FlowrSearchGenerator) and
are fully serializable, so you can use them when communicating with the Query API.
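To make the generator/transformer split and the serializability claim concrete, here is a minimal, self-contained sketch that rebuilds the JSON representation shown for the second example. The names MiniSearch and MiniQ, the Filter type, and the get helper are hypothetical stand-ins introduced only for this illustration; they are not flowR's actual classes, and the real builder may be structured quite differently.

```typescript
// Hypothetical, simplified model of a serializable search (not flowR's real classes).
type Filter = { name?: string; line?: number; column?: number; id?: number };

interface GeneratorStep {
    type: 'generator';
    name: 'get';
    args: { filter: Filter };
}

interface TransformerStep {
    type: 'transformer';
    name: 'filter' | 'skip';
    args: Record<string, unknown>;
}

class MiniSearch {
    private readonly generator: GeneratorStep;
    private readonly steps: readonly TransformerStep[];

    constructor(generator: GeneratorStep, steps: readonly TransformerStep[] = []) {
        this.generator = generator;
        this.steps = steps;
    }

    // Every transformer returns a new search, so partial queries stay reusable.
    filter(filter: string): MiniSearch {
        return this.append({ type: 'transformer', name: 'filter', args: { filter } });
    }

    skip(count: number): MiniSearch {
        return this.append({ type: 'transformer', name: 'skip', args: { count } });
    }

    // Called by JSON.stringify; yields the "generator" + "search" layout from the page.
    toJSON(): { generator: GeneratorStep; search: TransformerStep[] } {
        return { generator: this.generator, search: [...this.steps] };
    }

    private append(step: TransformerStep): MiniSearch {
        return new MiniSearch(this.generator, [...this.steps, step]);
    }
}

function get(filter: Filter): MiniSearch {
    return new MiniSearch({ type: 'generator', name: 'get', args: { filter } });
}

// `var` is modeled as the short form of `get` with only the name filter.
const MiniQ = {
    get,
    var: (name: string): MiniSearch => get({ name })
};

const query = MiniQ.var('x').filter('variable-definition').skip(1);
const serialized = JSON.stringify(query);
console.log(serialized);
```

Because the whole query is plain data (one generator plus an ordered list of transformer steps), it can be sent over the wire and rebuilt on the other side, which is what makes such queries usable with a JSON-based interface.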
We offer the following generators:
- FlowrSearchGenerator::all
  Returns all elements (nodes/dataflow vertices) from the given data.
- FlowrSearchGenerator::criterion
  Returns all elements that match the given criteria (e.g., criterion('2@x', '3@<-'), to retrieve the first use of x in the second line and the first <- assignment in the third line). This will throw an error if any criterion cannot be resolved to an id.
- FlowrSearchGenerator::from
  Initializes a search query with the given elements. This is not intended to serialize well with respect to the nodes; see FlowrSearchGenerator.criterion for a serializable alternative (passing the ids with $id).
- FlowrSearchGenerator::get
  Returns all elements that match the given filters. You may pass a negative line number to count from the back. Please note that this currently only works for single files, approximates over the nodes, and is not to be used for "production".
- FlowrSearchGenerator::id
  Short form of get with only the id filter: get({id}).
- FlowrSearchGenerator::loc
  Short form of get with only the line and column filters: get({line, column}).
- FlowrSearchGenerator::var
  Short form of get with only the name filter: get({name}).
- FlowrSearchGenerator::varInLine
  Short form of get with only the name and line filters: get({name, line}).
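The criterion generator is perhaps the least obvious entry in this list, so the following sketch illustrates how criteria such as '2@x', '3@<-', or '$18' could be resolved to ids. It is a toy model under stated assumptions: the token table is hand-written from the ids in the five-line example above, the helper names (Token, resolveCriterion) are hypothetical, and only the line@name and $id forms mentioned on this page are handled. flowR's real resolution works on the normalized AST and may differ.

```typescript
// Toy token table for the five-line example; ids follow the dataflow graph shown above.
// Tokens are listed in source order (assumption: "first" means first in source order).
interface Token { id: number; line: number; lexeme: string }

const tokens: readonly Token[] = [
    { id: 0, line: 1, lexeme: 'x' }, { id: 4, line: 1, lexeme: '<-' },
    { id: 1, line: 1, lexeme: 'x' }, { id: 3, line: 1, lexeme: '*' }, { id: 2, line: 1, lexeme: 'x' },
    { id: 8, line: 2, lexeme: 'print' }, { id: 6, line: 2, lexeme: 'x' },
    { id: 9, line: 3, lexeme: 'x' }, { id: 13, line: 3, lexeme: '<-' },
    { id: 10, line: 3, lexeme: 'y' }, { id: 12, line: 3, lexeme: '<-' }, { id: 11, line: 3, lexeme: '3' },
    { id: 17, line: 4, lexeme: 'print' }, { id: 15, line: 4, lexeme: 'x' },
    { id: 18, line: 5, lexeme: 'x' }, { id: 20, line: 5, lexeme: '<-' }, { id: 19, line: 5, lexeme: '2' }
];

// Resolve one criterion to an id, or throw if it cannot be resolved.
function resolveCriterion(raw: string, table: readonly Token[]): number {
    const byId = /^\$(\d+)$/.exec(raw);
    if (byId !== null) {
        const id = Number(byId[1]);
        if (table.some(t => t.id === id)) {
            return id;
        }
        throw new Error(`unknown id in criterion ${raw}`);
    }
    const byLine = /^(\d+)@(.+)$/.exec(raw);
    if (byLine === null) {
        throw new Error(`unsupported criterion format: ${raw}`);
    }
    const line = Number(byLine[1]);
    const lexeme = byLine[2];
    // the first matching token in the requested line wins
    const hit = table.find(t => t.line === line && t.lexeme === lexeme);
    if (hit === undefined) {
        throw new Error(`cannot resolve criterion ${raw}`);
    }
    return hit.id;
}

function criterion(...criteria: string[]): number[] {
    return criteria.map(c => resolveCriterion(c, tokens));
}

console.log(criterion('2@x', '3@<-')); // [ 6, 13 ] in this toy model
```

Failing loudly on an unresolvable criterion mirrors the behavior described for the generator: a query that silently dropped a criterion would be hard to distinguish from one that legitimately matched nothing.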
Likewise, we have a palette of transformers and modifiers:
- FlowrSearchBuilder::build
  Constructs the final search (this may happen automatically with most search handlers).
- FlowrSearchBuilder::filter
  filter returns only the elements that match the given filter.
- FlowrSearchBuilder::first
  first returns either the first element of the search or nothing if no elements are present.
- FlowrSearchBuilder::index
  index returns the element at the given index if it exists.
- FlowrSearchBuilder::last
  last returns either the last element of the search or nothing if no elements are present.
- FlowrSearchBuilder::merge
  merge combines the search results with those of another search.
- FlowrSearchBuilder::select
  select returns only the elements at the given indices.
- FlowrSearchBuilder::skip
  skip returns all elements of the search except the first count ones.
- FlowrSearchBuilder::tail
  tail returns all elements of the search except the first one.
- FlowrSearchBuilder::take
  take returns the first count elements of the search.
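The semantics of these transformers can be sketched on plain arrays. The functions below are an illustrative model only, not flowR's implementation. Several details are assumptions that this page does not state: indices are taken to be zero-based, select silently ignores indices that do not exist, and merge simply concatenates without removing duplicates.

```typescript
// Plain-array model of the transformers listed above (a sketch, not flowR's code).
function first<T>(xs: readonly T[]): T[] { return xs.slice(0, 1); }
function last<T>(xs: readonly T[]): T[] { return xs.slice(-1); }
function tail<T>(xs: readonly T[]): T[] { return xs.slice(1); }
function take<T>(xs: readonly T[], count: number): T[] { return xs.slice(0, count); }
function skip<T>(xs: readonly T[], count: number): T[] { return xs.slice(count); }

// Assumption: zero-based index; a missing index yields an empty result.
function index<T>(xs: readonly T[], i: number): T[] {
    return i >= 0 && i < xs.length ? [xs[i]] : [];
}

// Assumption: indices that do not exist are ignored.
function select<T>(xs: readonly T[], ...indices: number[]): T[] {
    return indices.filter(i => i >= 0 && i < xs.length).map(i => xs[i]);
}

// Assumption: plain concatenation, no deduplication.
function merge<T>(xs: readonly T[], ys: readonly T[]): T[] {
    return [...xs, ...ys];
}

// Ids of the three definitions of x in the five-line example above.
const definitionsOfX = [0, 9, 18];

console.log(skip(definitionsOfX, 1));      // [ 9, 18 ], matching the example result
console.log(first(definitionsOfX));        // [ 0 ]
console.log(last([]));                     // []
console.log(select(definitionsOfX, 0, 2)); // [ 0, 18 ]
```

Note that every function returns an array, including first, last, and index, so each step can feed the next one without special cases for "one element" or "nothing".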
Every search (and consequently the search pipeline) works with an array of FlowrSearchElement (neatly wrapped in FlowrSearchElements).
Hence, even operations such as .first or .last return an array of elements (albeit with a single or no element).
The search API does its best to stay type-safe with respect to the return type and the transformers in use.
In addition, it offers optimizer passes to optimize the search pipeline before execution.
They are executed with .build, which may happen automatically whenever you run a search using runSearch.
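This page does not describe which optimizer passes flowR actually applies, so the following is a purely hypothetical example of what such a pass could look like. It relies only on the semantics listed above: tail is equivalent to skip(1), and two adjacent skips can be fused into one by adding their counts. The Step type and the optimize and run functions are names invented for this sketch.

```typescript
// Hypothetical optimizer pass over a simplified pipeline (not one of flowR's real passes).
type Step =
    | { name: 'skip'; count: number }
    | { name: 'take'; count: number }
    | { name: 'tail' };

// Rewrites tail to skip(1) and fuses adjacent skip steps into a single one.
function optimize(steps: readonly Step[]): Step[] {
    const out: Step[] = [];
    for (const step of steps) {
        const current: Step = step.name === 'tail' ? { name: 'skip', count: 1 } : step;
        const previous = out[out.length - 1];
        if (current.name === 'skip' && previous !== undefined && previous.name === 'skip') {
            out[out.length - 1] = { name: 'skip', count: previous.count + current.count };
        } else {
            out.push(current);
        }
    }
    return out;
}

// Reference interpreter, used to check that the pass preserves the result.
function run<T>(steps: readonly Step[], elements: readonly T[]): T[] {
    let result = [...elements];
    for (const step of steps) {
        switch (step.name) {
            case 'skip': result = result.slice(step.count); break;
            case 'take': result = result.slice(0, step.count); break;
            case 'tail': result = result.slice(1); break;
        }
    }
    return result;
}

const pipeline: Step[] = [
    { name: 'skip', count: 1 }, { name: 'tail' }, { name: 'take', count: 5 }, { name: 'tail' }
];
const optimized = optimize(pipeline);
const input = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];

console.log(optimized.length);      // 3 steps instead of 4
console.log(run(pipeline, input));  // [ 3, 4, 5, 6 ]
console.log(run(optimized, input)); // [ 3, 4, 5, 6 ]
```

The take step acts as a barrier here: skips on either side of it are not merged, because reordering a skip across a take would change the result.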