-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve falco files loading performance #2151
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave a first round of comments. This looks great, specially the evttype part! Can you remove the draft status, so that the CI and tests can run?
@@ -21,6 +21,124 @@ limitations under the License. | |||
#include <set> | |||
#include <memory> | |||
|
|||
class falco_event_types |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome!
userspace/engine/rule_loader.cpp
Outdated
@@ -347,8 +364,9 @@ static shared_ptr<ast::expr> parse_condition( | |||
} | |||
catch (const sinsp_exception& e) | |||
{ | |||
throw falco_exception("Compilation error when compiling \"" | |||
+ condition + "\": " + to_string(p.get_pos().col) + ": " + e.what()); | |||
throw falco_exception("Compilation error when compiling \n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Watch out for conflicts with #2098.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not solved, the changes of #2098 introduced a new form of error reposting. We should throw a rule_loader::rule_load_exception
with proper err code here.
Should we bother changing any of the json_evt code at all, given that it's effectively dead with the transition of k8s audit support to a plugin? |
it will exist in the code for a while, sysdig might use it for some time as well. |
userspace/engine/rule_loader.cpp
Outdated
@@ -347,8 +364,9 @@ static shared_ptr<ast::expr> parse_condition( | |||
} | |||
catch (const sinsp_exception& e) | |||
{ | |||
throw falco_exception("Compilation error when compiling \"" | |||
+ condition + "\": " + to_string(p.get_pos().col) + ": " + e.what()); | |||
throw falco_exception("Compilation error when compiling \n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not solved, the changes of #2098 introduced a new form of error reposting. We should throw a rule_loader::rule_load_exception
with proper err code here.
userspace/engine/rule_loader.cpp
Outdated
@@ -21,7 +21,12 @@ limitations under the License. | |||
#include "filter_evttype_resolver.h" | |||
#include "filter_warning_resolver.h" | |||
#include <version.h> | |||
#include <sstream> | |||
|
|||
#ifndef _WIN32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not enough and that's also the reason why the CI builds are failing. We have a very weird setup in the libraries for regex support that depends on:
- old compiler versions, that don't support
<regex>
- inclusion or not of
oniguruma
in the build, which is a dependency ofjq
To solve all the combinations, this trick worked for me:
- /~https://github.com/falcosecurity/libs/blob/d08daff1bf69c2388b7474c1fed96bd6e98b900f/userspace/libsinsp/filter/parser.cpp#L35
- /~https://github.com/falcosecurity/libs/blob/d08daff1bf69c2388b7474c1fed96bd6e98b900f/userspace/libsinsp/CMakeLists.txt#L272 (I think you should be able to use the
USE_BUNDLED_ONIGURUMA
definition from libsinsp's cmake file)
Mind that this would also cause the regex standard used in the code to differ, depending on what regex library we are building with. So you need to write two copies of each regex, and also handle them in two different ways:
- /~https://github.com/falcosecurity/libs/blob/d08daff1bf69c2388b7474c1fed96bd6e98b900f/userspace/libsinsp/filter/parser.cpp#L46
- /~https://github.com/falcosecurity/libs/blob/d08daff1bf69c2388b7474c1fed96bd6e98b900f/userspace/libsinsp/filter/parser.cpp#L532
Good news is that the CI and the tests won't pass until this is addressed: we have different build types that help us test all the cases.
Bad news is that this is truly terrible, and I really hope we'll be able to do better than this as soon as possible. The bottleneck is the presence of oniguruma
brought in by jq
, we we manage to remove that we should be able to just stick to one regex standard (posix I guess).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dropping fix for the list inserter altogether - I don't have resources to fight regex portability here.
@jasondellaluce keep in mind that the current list expansion routine is quite inefficient - it keeps on looking for all available lists (100s) in the continuously growing string and does it from the string beginning each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, that's what we had in the past with Lua as well. This part will need to be substituted with proper AST traversal+substitutions in the future, however that would introduce ruleset breaking changes so it's something that we'll tackle in the future.
LGTM label has been added. Git tree hash: ecf2d1a1e3036d7189b2907fc8bffbfa36438a8b
|
/milestone 0.33.0 |
@VadimZy now we introduced RE2. Let me know if you intend to re-introduce your |
…rule sets - replace std::set<uint16_t> with fixed size vector in event types propagation - rework lists expansion by replacing repetitive string::find in constantly growing expansion string with regex tokenization - improve json_event parsing by moving const initializations into static routines Signed-off-by: VadimZy <vadim.zyarko@sysdig.com>
Signed-off-by: VadimZy <vadim.zyarko@sysdig.com>
Signed-off-by: VadimZy <vadim.zyarko@sysdig.com>
Signed-off-by: VadimZy <vadim.zyarko@sysdig.com>
reverting to the inefficient code. Signed-off-by: VadimZy <vadim.zyarko@sysdig.com>
1dd9c5d
to
9bbc564
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rebased this on master for you. LGTM, thanks!
/approved
LGTM label has been added. Git tree hash: 074920724c1a8793867fd771a65745ee5e6bfda1
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jasondellaluce, leogr, VadimZy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
Any specific area of the project related to this PR?
/area engine
What this PR does / why we need it:
Improve rules loading/parsing performance
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: