Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new(driver): add 2 new scap stats #1303

Merged
merged 5 commits into from
Aug 23, 2023

Conversation

Andreagit97
Copy link
Member

What type of PR is this?

/kind feature

Any specific area of the project related to this PR?

/area API-version

/area driver-kmod

/area driver-bpf

/area driver-modern-bpf

/area libscap-engine-bpf

/area libscap-engine-kmod

/area libscap-engine-modern-bpf

/area libscap

/area libpman

/area libsinsp

Does this PR require a change in the driver versions?

/version driver-API-version-major

What this PR does / why we need it:

This PR adds 2 new kernel stats:

  • n_drops_buffer_close_exit: CLOSE exit event drops.
  • n_drops_buffer_proc_exit: SCHED_PROC_EXIT event drops.

They could be very useful when debugging possible mem-leak or high memory usage since the PROC_EXIT event is strictly related to the number of threads in our table while the CLOSE_X is related to the number of file descriptors.

For the same debugging purpose I added some logs when the capture is closed to print the dimension of the thread table and the total number of fd still opened.

Which issue(s) this PR fixes:

Special notes for your reviewer:

This PR requires an API major bump since the stat struct sent by the drivers is changed

Does this PR introduce a user-facing change?:

new(driver): add 2 new scap stats

- n_drops_buffer_close_exit: CLOSE exit event drops.
- n_drops_buffer_proc_exit: SCHED_PROC_EXIT event drops.

Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
Comment on lines +1964 to +1970
"\nn_evts:%" PRIu64
"\nn_drops:%" PRIu64
"\nn_drops_buffer:%" PRIu64
"\nn_drops_buffer_clone_fork_enter:%" PRIu64
"\nn_drops_buffer_clone_fork_exit:%" PRIu64
"\nn_drops_buffer_execve_enter:%" PRIu64
"\nn_drops_buffer_execve_exit:%" PRIu64
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've slightly changed the way in which we print the log. Now we don't have anymore a single line but we put a \n after each stat. This is an output example

08-21 11:32:31.914754 
n_evts:211937
n_drops:0
n_drops_buffer:0
n_drops_buffer_clone_fork_enter:0
n_drops_buffer_clone_fork_exit:0
n_drops_buffer_execve_enter:0
n_drops_buffer_execve_exit:0
n_drops_buffer_connect_enter:0
n_drops_buffer_connect_exit:0
n_drops_buffer_open_enter:0
n_drops_buffer_open_exit:0
n_drops_buffer_dir_file_enter:0
n_drops_buffer_dir_file_exit:0
n_drops_buffer_other_interest_enter:0
n_drops_buffer_other_interest_exit:0
n_drops_buffer_close_exit:0
n_drops_buffer_proc_exit:0
n_drops_scratch_map:0
n_drops_pf:0
n_drops_bug:0

08-21 11:32:31.914981 total threads in the table:1522, total fds in all threads:5319

So as you can see the output is quite small but if we should face issues in the future we could always revert it on one line, WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used for debugging aka no problems at all would say

Copy link
Contributor

@incertum incertum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Andreagit97 great idea, love it and you know I love any useful stats always 😄

On Falco side the new metrics will pick up these new stats automatically, but the old drop internal messages / stats would need an update PR as well.

Just left a small ask for some unrelated cleanup to consider while you are here. Can then approve tomorrow!

Comment on lines +1964 to +1970
"\nn_evts:%" PRIu64
"\nn_drops:%" PRIu64
"\nn_drops_buffer:%" PRIu64
"\nn_drops_buffer_clone_fork_enter:%" PRIu64
"\nn_drops_buffer_clone_fork_exit:%" PRIu64
"\nn_drops_buffer_execve_enter:%" PRIu64
"\nn_drops_buffer_execve_exit:%" PRIu64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used for debugging aka no problems at all would say

@@ -190,6 +190,12 @@ FILLER_RAW(terminate_filler)
++state->n_drops_buffer_other_interest_enter;
}
break;
case PPME_PROCEXIT_E:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Andreagit97 while we are here, I think it was during the loginuid PR where I saw we could cleanup and remove some of the old PPME_SYSCALL_CLONE_* events. Could we sneak this cleanup in?

Also do you think the categories n_drops_buffer_dir_file* and n_drops_buffer_other_interest* could benefit from some updates while here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done :)

Also do you think the categories n_drops_buffer_dir_file* and n_drops_buffer_other_interest* could benefit from some updates while here?

During my experiments I've never used other stats so at the moment I would say no 🤔

@leogr
Copy link
Member

leogr commented Aug 23, 2023

/milestone 0.13.0

@poiana poiana modified the milestones: next-driver, 0.13.0 Aug 23, 2023
Andreagit97 and others added 2 commits August 23, 2023 16:10
Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
Co-authored-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
Co-authored-by: Melissa Kilby <melissa.kilby.oss@gmail.com>
@poiana poiana added size/XL and removed size/L labels Aug 23, 2023
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleanup :D

Copy link
Contributor

@FedeDP FedeDP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link
Contributor

poiana commented Aug 23, 2023

LGTM label has been added.

Git tree hash: 5dcbda3dbbfb5e3ea5bf2471a97686564854c766

Copy link
Contributor

@incertum incertum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@poiana
Copy link
Contributor

poiana commented Aug 23, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Andreagit97, FedeDP, incertum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [Andreagit97,FedeDP,incertum]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants