
Search API

github-actions[bot] edited this page Feb 24, 2025 · 1 revision

This document was generated from 'src/documentation/print-search-wiki.ts' on 2025-02-24, 06:29:08 UTC presenting an overview of flowR's search API (v2.2.10, using R v4.4.0). Please do not edit this file/wiki page directly.

This page briefly summarizes flowR's search API, which provides a set of functions to search for nodes in the Dataflow Graph and the Normalized AST of a given piece of R code (the search always considers both, with respect to your search query). Please see the Interface wiki page for more information on how to access this API. Within code, you can execute a search using the runSearch function.

For an initial motivation, let's have a look at the following example:

Q.var("x")
Search Visualization
flowchart LR
0("<b>get</b>(filter: #123;#34;name#34;#58;#34;x#34;#125;)<br/>_generator_")

In the code:

x <- x * x
JSON Representation
{
  "generator": {
    "type": "generator",
    "name": "get",
    "args": {
      "filter": {
        "name": "x"
      }
    }
  },
  "search": []
}
Show Results

The query returns the following vertices (all references to x in the code): 0 ('x') at L1.1, 1 ('x') at L1.6, 2 ('x') at L1.10

The search required 20.65 ms (including parsing and normalization and the query) within the generation environment.

The returned results are highlighted with a thick teal outline within the dataflow graph:

flowchart LR
    1(["`#91;RSymbol#93; x
      (1)
      *1.6*`"])
    style 1 stroke:teal,stroke-width:7px,stroke-opacity:.8; 
    2(["`#91;RSymbol#93; x
      (2)
      *1.10*`"])
    style 2 stroke:teal,stroke-width:7px,stroke-opacity:.8; 
    3[["`#91;RBinaryOp#93; #42;
      (3)
      *1.6-10*
    (1, 2)`"]]
    0["`#91;RSymbol#93; x
      (0)
      *1.1*`"]
    style 0 stroke:teal,stroke-width:7px,stroke-opacity:.8; 
    4[["`#91;RBinaryOp#93; #60;#45;
      (4)
      *1.1-10*
    (0, 3)`"]]
    3 -->|"reads, argument"| 1
    3 -->|"reads, argument"| 2
    0 -->|"defined-by"| 3
    0 -->|"defined-by"| 4
    4 -->|"argument"| 3
    4 -->|"returns, argument"| 0

(The analysis required 8.59 ms (including parse and normalize, using the r-shell engine) within the generation environment.)

This returns all references to the variable x in the code. However, the search API is not limited to simple variable references and can do much more.

For example, let's retrieve every definition of x in the code except the first one:

Q.var("x").filter(VertexType.VariableDefinition).skip(1)
Search Visualization
flowchart LR
0("<b>get</b>(filter: #123;#34;name#34;#58;#34;x#34;#125;)<br/>_generator_") --> 1["<b>filter</b>(filter: #34;variable#45;definition#34;)<br/>_transformer_"] --> 2["<b>skip</b>(count: 1)<br/>_transformer_"]

In the code:

x <- x * x
print(x)
x <- y <- 3
print(x)
x <- 2
JSON Representation
{
  "generator": {
    "type": "generator",
    "name": "get",
    "args": {
      "filter": {
        "name": "x"
      }
    }
  },
  "search": [
    {
      "type": "transformer",
      "name": "filter",
      "args": {
        "filter": "variable-definition"
      }
    },
    {
      "type": "transformer",
      "name": "skip",
      "args": {
        "count": 1
      }
    }
  ]
}
Show Results

The query returns the following vertices (all definitions of x in the code except the first): 9 ('x') at L3.1, 18 ('x') at L5.1

The search required 21.32 ms (including parsing and normalization and the query) within the generation environment.

The returned results are highlighted with a thick teal outline within the dataflow graph:

flowchart LR
    1(["`#91;RSymbol#93; x
      (1)
      *1.6*`"])
    2(["`#91;RSymbol#93; x
      (2)
      *1.10*`"])
    3[["`#91;RBinaryOp#93; #42;
      (3)
      *1.6-10*
    (1, 2)`"]]
    0["`#91;RSymbol#93; x
      (0)
      *1.1*`"]
    4[["`#91;RBinaryOp#93; #60;#45;
      (4)
      *1.1-10*
    (0, 3)`"]]
    6(["`#91;RSymbol#93; x
      (6)
      *2.7*`"])
    8[["`#91;RFunctionCall#93; print
      (8)
      *2.1-8*
    (6)`"]]
    11{{"`#91;RNumber#93; 3
      (11)
      *3.11*`"}}
    10["`#91;RSymbol#93; y
      (10)
      *3.6*`"]
    12[["`#91;RBinaryOp#93; #60;#45;
      (12)
      *3.6-11*
    (10, 11)`"]]
    9["`#91;RSymbol#93; x
      (9)
      *3.1*`"]
    style 9 stroke:teal,stroke-width:7px,stroke-opacity:.8; 
    13[["`#91;RBinaryOp#93; #60;#45;
      (13)
      *3.1-11*
    (9, 12)`"]]
    15(["`#91;RSymbol#93; x
      (15)
      *4.7*`"])
    17[["`#91;RFunctionCall#93; print
      (17)
      *4.1-8*
    (15)`"]]
    19{{"`#91;RNumber#93; 2
      (19)
      *5.6*`"}}
    18["`#91;RSymbol#93; x
      (18)
      *5.1*`"]
    style 18 stroke:teal,stroke-width:7px,stroke-opacity:.8; 
    20[["`#91;RBinaryOp#93; #60;#45;
      (20)
      *5.1-6*
    (18, 19)`"]]
    3 -->|"reads, argument"| 1
    3 -->|"reads, argument"| 2
    0 -->|"defined-by"| 3
    0 -->|"defined-by"| 4
    4 -->|"argument"| 3
    4 -->|"returns, argument"| 0
    6 -->|"reads"| 0
    8 -->|"reads, returns, argument"| 6
    10 -->|"defined-by"| 11
    10 -->|"defined-by"| 12
    12 -->|"argument"| 11
    12 -->|"returns, argument"| 10
    9 -->|"defined-by"| 12
    9 -->|"defined-by"| 13
    13 -->|"argument"| 12
    13 -->|"returns, argument"| 9
    15 -->|"reads"| 9
    17 -->|"reads, returns, argument"| 15
    18 -->|"defined-by"| 19
    18 -->|"defined-by"| 20
    20 -->|"argument"| 19
    20 -->|"returns, argument"| 18

(The analysis required 12.31 ms (including parse and normalize, using the r-shell engine) within the generation environment.)
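
To make the three steps of this query concrete, here is a minimal sketch that replays them over a hand-written list of the symbols from the five-line example. It does not use flowR itself: SketchVertex, its fields, and the 'use' tag are hypothetical stand-ins chosen for illustration, and the ids and positions are read off the graph above.

```typescript
// Hypothetical, simplified stand-in for a dataflow vertex (not flowR's real type).
interface SketchVertex {
  id: number;
  name: string;
  line: number;
  column: number;
  // 'variable-definition' appears in the JSON above; 'use' is an assumed label.
  tag: 'variable-definition' | 'use';
}

// The x and y symbols of the five-line example, read off the rendered graph.
const sketchVertices: SketchVertex[] = [
  { id: 0, name: 'x', line: 1, column: 1, tag: 'variable-definition' },
  { id: 1, name: 'x', line: 1, column: 6, tag: 'use' },
  { id: 2, name: 'x', line: 1, column: 10, tag: 'use' },
  { id: 6, name: 'x', line: 2, column: 7, tag: 'use' },
  { id: 9, name: 'x', line: 3, column: 1, tag: 'variable-definition' },
  { id: 10, name: 'y', line: 3, column: 6, tag: 'variable-definition' },
  { id: 15, name: 'x', line: 4, column: 7, tag: 'use' },
  { id: 18, name: 'x', line: 5, column: 1, tag: 'variable-definition' },
];

// Generator step: every vertex named x.
const namedX = sketchVertices.filter(v => v.name === 'x');
// Transformer step 1: keep only variable definitions.
const definitionsOfX = namedX.filter(v => v.tag === 'variable-definition');
// Transformer step 2: skip the first remaining element.
const afterSkip = definitionsOfX.slice(1);

const resultIds = afterSkip.map(v => v.id);
console.log(resultIds); // ids 9 and 18, matching the results listed above
```

The order of the steps matters: skipping before filtering would drop vertex 0 and then keep both later definitions as well, but on other inputs the two orders can differ.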

In summary, every search has two parts. It is initialized with a generator (such as Q.var('x')) and can be further refined with transformers or modifiers. Such queries can be constructed starting from the Q object (backed by FlowrSearchGenerator) and are fully serializable so you can use them when communicating with the Query API.
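
The generator-plus-transformers structure and its serializability can be sketched with a small, self-contained builder. This is not flowR's implementation: SketchQuery, SketchQ, and SketchStep are hypothetical names, and only the JSON shape (a generator object plus a search array of transformer steps) is taken from the representations shown above.

```typescript
// One serialized step; the shape mirrors the JSON representation shown above.
interface SketchStep {
  type: 'generator' | 'transformer';
  name: string;
  args: Record<string, unknown>;
}

// Hypothetical immutable builder: each call returns a new query with one more step.
class SketchQuery {
  private readonly generator: SketchStep;
  private readonly search: readonly SketchStep[];

  constructor(generator: SketchStep, search: readonly SketchStep[] = []) {
    this.generator = generator;
    this.search = search;
  }

  private extend(name: string, args: Record<string, unknown>): SketchQuery {
    return new SketchQuery(this.generator, [...this.search, { type: 'transformer', name, args }]);
  }

  filter(filter: string): SketchQuery {
    return this.extend('filter', { filter });
  }

  skip(count: number): SketchQuery {
    return this.extend('skip', { count });
  }

  // JSON.stringify calls toJSON, so the query serializes to plain data.
  toJSON(): { generator: SketchStep; search: readonly SketchStep[] } {
    return { generator: this.generator, search: this.search };
  }
}

// Hypothetical entry point playing the role of the Q object.
const SketchQ = {
  var(name: string): SketchQuery {
    return new SketchQuery({ type: 'generator', name: 'get', args: { filter: { name } } });
  },
};

const serializedQuery = JSON.stringify(
  SketchQ.var('x').filter('variable-definition').skip(1),
  null,
  2
);
console.log(serializedQuery);
```

Because the query is plain data (no closures or node references), it can be sent over the wire and rebuilt on the other side, which is what makes a description like this usable with a query interface.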

We offer the following generators:

  • FlowrSearchGenerator::all
    Returns all elements (nodes/dataflow vertices) from the given data.
  • FlowrSearchGenerator::criterion
    Returns all elements that match the given criteria (e.g., criterion('2@x', '3@<-') retrieves the first use of x in the second line and the first <- assignment in the third line). This throws an error if any criterion cannot be resolved to an id.
  • FlowrSearchGenerator::from
    Initializes a search query with the given elements. This is not intended to serialize well with respect to the nodes; see FlowrSearchGenerator.criterion for a serializable alternative (passing the ids with $id).
  • FlowrSearchGenerator::get
    Returns all elements that match the given filters. You may pass a negative line number to count from the back. Please note that this currently works only for single files, approximates over the nodes, and is not meant to be used for "production".
  • FlowrSearchGenerator::id
    Short form of get with only the id filter: get({id}).
  • FlowrSearchGenerator::loc
    Short form of get with only the line and column filters: get({line, column}).
  • FlowrSearchGenerator::var
    Short form of get with only the name filter: get({name}).
  • FlowrSearchGenerator::varInLine
    Short form of get with only the name and line filters: get({name, line}).
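
The relationship between get and its short forms can be illustrated with a self-contained sketch. Everything here is hypothetical (SketchNode, SketchFilter, sketchGet, and the helper names are not flowR's API), and the handling of negative line numbers (-1 meaning the last line) is an assumption about what "count from the back" means.

```typescript
// Hypothetical node and filter shapes, for illustration only.
interface SketchNode { id: number; name: string; line: number; column: number }
interface SketchFilter { id?: number; name?: string; line?: number; column?: number }

// A node matches if it agrees with every filter field that is set.
function sketchGet(nodes: readonly SketchNode[], filter: SketchFilter): SketchNode[] {
  const lastLine = nodes.reduce((max, n) => Math.max(max, n.line), 0);
  // Assumption: -1 addresses the last line, -2 the line before it, and so on.
  const line = filter.line !== undefined && filter.line < 0
    ? lastLine + filter.line + 1
    : filter.line;
  return nodes.filter(n =>
    (filter.id === undefined || n.id === filter.id) &&
    (filter.name === undefined || n.name === filter.name) &&
    (line === undefined || n.line === line) &&
    (filter.column === undefined || n.column === filter.column)
  );
}

// The short forms only preselect which filter fields are passed to get.
const sketchId = (nodes: readonly SketchNode[], id: number) => sketchGet(nodes, { id });
const sketchLoc = (nodes: readonly SketchNode[], line: number, column: number) =>
  sketchGet(nodes, { line, column });
const sketchVar = (nodes: readonly SketchNode[], name: string) => sketchGet(nodes, { name });
const sketchVarInLine = (nodes: readonly SketchNode[], name: string, line: number) =>
  sketchGet(nodes, { name, line });

// The variable definitions of the five-line example shown earlier on this page.
const sketchNodes: SketchNode[] = [
  { id: 0, name: 'x', line: 1, column: 1 },
  { id: 9, name: 'x', line: 3, column: 1 },
  { id: 10, name: 'y', line: 3, column: 6 },
  { id: 18, name: 'x', line: 5, column: 1 },
];

console.log(sketchVar(sketchNodes, 'x').map(n => n.id));           // all nodes named x
console.log(sketchLoc(sketchNodes, 3, 6).map(n => n.id));          // the node at line 3, column 6
console.log(sketchVarInLine(sketchNodes, 'x', -1).map(n => n.id)); // x in the last line
```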

Likewise, we have a palette of transformers and modifiers:

Every search (and consequently the search pipeline) works with an array of FlowrSearchElement (neatly wrapped in FlowrSearchElements). Hence, even operations such as .first or .last return an array of elements (albeit with a single or no element). The search API does its best to stay type-safe with respect to the return type and the transformers in use. In addition, it offers optimizer passes to optimize the search pipeline before execution. They are executed with .build, which may happen automatically whenever you run a search using runSearch.
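
The "always an array" convention can be sketched with a small wrapper class. SketchElements is a hypothetical stand-in and not the real FlowrSearchElements; it only illustrates why first and last stay chainable and why an empty result needs no special case.

```typescript
// Hypothetical wrapper: every operation returns another wrapper around an array.
class SketchElements<T> {
  private readonly elements: readonly T[];

  constructor(elements: readonly T[]) {
    this.elements = elements;
  }

  // Returns a wrapper holding at most one element (none if the input is empty).
  first(): SketchElements<T> {
    return new SketchElements(this.elements.slice(0, 1));
  }

  // Returns a wrapper holding at most one element (none if the input is empty).
  last(): SketchElements<T> {
    return new SketchElements(this.elements.slice(-1));
  }

  getElements(): readonly T[] {
    return this.elements;
  }
}

const sketchIds = new SketchElements([0, 1, 2]);
console.log(sketchIds.first().getElements());                    // array with the single element 0
console.log(sketchIds.last().getElements());                     // array with the single element 2
console.log(new SketchElements<number>([]).first().getElements()); // empty array
```

Since every step maps a wrapper to a wrapper, callers never have to distinguish between "one result", "many results", and "no result" when composing further steps.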
