wip: new(userspace/libsinsp): MVP CountMinSketch Powered Probabilistic Counting and Filtering #1453

Closed
wants to merge 1 commit into from

Conversation

incertum
Contributor

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind documentation

/kind failing-test

/kind feature

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area API-version

/area build

/area CI

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area libscap-engine-bpf

/area libscap-engine-gvisor

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap-engine-nodriver

/area libscap-engine-noop

/area libscap-engine-source-plugin

/area libscap-engine-savefile

/area libscap-engine-udig

/area libscap

/area libpman

/area libsinsp

/area tests

/area proposals

Does this PR require a change in the driver versions?

/version driver-API-version-major

/version driver-API-version-minor

/version driver-API-version-patch

/version driver-SCHEMA-version-major

/version driver-SCHEMA-version-minor

/version driver-SCHEMA-version-patch

What this PR does / why we need it:

MVP CountMinSketch Powered Probabilistic Counting and Filtering.

Following the principle of working in the open, this PR is intended for development and testing purposes only and aims to gather early feedback.

Bigger Vision for Threat Detection: See Falco Proposal PR falcosecurity/falco#2655.

It could also be of interest to any libs adopter, and can be seen as a generalization, with more options, of the current mechanisms for suppressing tids or comms.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@poiana
Contributor

poiana commented Oct 29, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: incertum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@incertum
Contributor Author

Stage 1:

As a first MVP, I hard-coded three 64-bit sketches with gamma = 0.001 and eps = 0.0001 -> (7 * 27183) counters per sketch -> ~4.5MB of extra static allocation in total.
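For reference, the 7 * 27183 figure is consistent with the standard Count-Min sizing rules, depth = ceil(ln(1/gamma)) and width = ceil(e/eps). The snippet below is a minimal sketch under that assumption; the function names are hypothetical and are not taken from the PR.

```python
import math


def cms_dimensions(gamma, eps):
    """Return (depth, width) using the textbook Count-Min sizing rules.

    Assumption: depth = ceil(ln(1/gamma)), width = ceil(e/eps).
    """
    depth = math.ceil(math.log(1.0 / gamma))
    width = math.ceil(math.e / eps)
    return depth, width


def cms_bytes(gamma, eps, counter_bytes=8, n_sketches=1):
    """Total counter storage for n_sketches identically sized sketches."""
    depth, width = cms_dimensions(gamma, eps)
    return depth * width * counter_bytes * n_sketches


if __name__ == "__main__":
    depth, width = cms_dimensions(gamma=0.001, eps=0.0001)
    total = cms_bytes(gamma=0.001, eps=0.0001, counter_bytes=8, n_sketches=3)
    print(depth, width, total)
```

With gamma = 0.001, eps = 0.0001, 8-byte counters and 3 sketches this comes to 4,566,744 bytes, in line with the ~4.5MB quoted above.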

Everything else is also hard-coded, such as the value that defines the particular context for one sketch. It is yet to be determined how much one sketch can or should be overloaded.

The CountMinSketch estimates frequencies in large data streams in sublinear space, unlike hash tables, while keeping the same constant time complexity for updates and lookups.
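To make the data structure concrete, here is a minimal, illustrative Count-Min sketch. It is not the code from this PR: the class and method names are hypothetical, and it uses salted BLAKE2b from the Python standard library as a stand-in for one hash function per row (the PR mentions xxh3 instead).

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch: depth rows of width counters each."""

    def __init__(self, depth, width):
        self.depth = depth
        self.width = width
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One hash per row, obtained by salting BLAKE2b with the row number.
        # Assumption: any family of independent-looking hashes would do here.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key, count=1):
        # Constant work per update: one counter touched in each row.
        for row in range(self.depth):
            self.rows[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate counters, so the minimum across rows
        # is the tightest estimate and never undercounts.
        return min(
            self.rows[row][self._index(row, key)] for row in range(self.depth)
        )


if __name__ == "__main__":
    sketch = CountMinSketch(depth=7, width=1024)
    for name in ["execve", "openat", "execve", "execve"]:
        sketch.update(name)
    print(sketch.estimate("execve"), sketch.estimate("openat"))
```

Memory is fixed at depth * width counters regardless of how many distinct keys are seen, which is the trade-off against a hash table: bounded space in exchange for estimates that may overcount.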

Still, for system calls, the hot path is very active. Performance and trade-offs are yet to be determined. I have already tried to pick the most performant hashing function (xxh3).


Super noisy on a more or less "idle" laptop:

sudo libsinsp/examples/sinsp-example -b driver/bpf/probe.o -f "(evt.type in (execve, execveat, open, openat, openat2) and evt.dir=< and proc.sketch2.count>=0 and fd.sketch0.count>=0)" -j -o "*%evt.time %evt.type %container.id %proc.cmdnargs %proc.cmdline %proc.args %proc.exepath %fd.nameraw %proc.name %proc.pname %proc.tty %proc.sname %proc.vpgid.name %fd.sketch0.count %proc.sketch1.count_avg %proc.sketch2.count" -x

Slows down quickly on the same "idle" laptop and then only occasionally shows new logs:

sudo libsinsp/examples/sinsp-example -b driver/bpf/probe.o -f "(evt.type in (execve, execveat, open, openat, openat2) and evt.dir=< and proc.sketch2.count < 10 and fd.sketch0.count < 3)" -j -o "*%evt.time %evt.type %container.id %proc.cmdnargs %proc.cmdline %proc.args %proc.exepath %fd.nameraw %proc.name %proc.pname %proc.tty %proc.sname %proc.vpgid.name %fd.sketch0.count %proc.sketch1.count_avg %proc.sketch2.count" -x
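The difference between the two runs can be illustrated with a small model of threshold filtering. This is only a sketch under stated assumptions: an exact `collections.Counter` stands in for the sketch, the function name is hypothetical, and it assumes the count is incremented before the filter is evaluated, which the PR does not state.

```python
from collections import Counter


def filter_rare(events, threshold):
    """Keep an event only while its key has been seen fewer than threshold times.

    Assumption: the counter is updated first, then compared, loosely
    mirroring a condition such as `proc.sketch2.count < 10`.
    """
    counts = Counter()
    kept = []
    for key in events:
        counts[key] += 1
        if counts[key] < threshold:
            kept.append(key)
    return kept


if __name__ == "__main__":
    stream = ["ls"] * 5 + ["cat"]
    print(filter_rare(stream, threshold=3))
```

Under these assumptions, frequently repeated keys stop producing output once they cross the threshold, and only keys that are still rare get through, which matches the observation that output slows down quickly and then shows new logs only occasionally. A `>= 0` condition, by contrast, is always true and suppresses nothing.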

Signed-off-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Andreagit97 added this to the TBD milestone Nov 6, 2023
@poiana
Contributor

poiana commented Feb 12, 2024

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@incertum
Contributor Author

/remove-lifecycle stale

@incertum
Contributor Author

The Draft/Demo PR has fulfilled its intended purpose and can now be closed, as I have just opened the new Plugin PR falcosecurity/plugins#419.

incertum closed this Feb 27, 2024
3 participants