Add a specific metric for rapl psys domain when available + use it as host_power when available #329

4 changes: 4 additions & 0 deletions docs_src/SUMMARY.md
@@ -20,9 +20,12 @@
- [How scaphandre computes per process power consumption](explanations/how-scaph-computes-per-process-power-consumption.md)
- [Internal structure](explanations/internal-structure.md)
- [About containers](explanations/about-containers.md)
- [About RAPL domains](explanations/rapl-domains.md)

# References

- [Metrics available](references/metrics.md)

## Exporters

- [JSON exporter](references/exporter-json.md)
@@ -35,6 +38,7 @@
## Sensors

- [PowercapRAPL sensor](references/sensor-powercap_rapl.md)
- [MSRRAPL sensor](references/sensor-msr_rapl.md)

[Why this project ?](why.md)
[Compatibility](compatibility.md)
72 changes: 72 additions & 0 deletions docs_src/explanations/rapl-domains.md
@@ -0,0 +1,72 @@
# Explanation on RAPL domains (what we know so far)

## PSYS

[Kepler documentation](https://sustainable-computing.io/design/metrics/) says PSYS "is the energy consumed by the 'System on a chip' (SOC)".
It states that "Generally, this metric is the host energy consumption (from acpi)." but also that "Generally, this metric is the **host energy consumption (from acpi) less the RAPL Package and DRAM**."

[https://zhenkai-zhang.github.io/papers/rapl.pdf](https://zhenkai-zhang.github.io/papers/rapl.pdf) says

| Microarchitecture | Package | CORE (PP0) | UNCORE (PP1) | DRAM |
|---|---|---|---|---|
| Haswell | Y/Y | Y/N | Y/N | Y/Y |
| Broadwell | Y/Y | Y/N | Y/N | Y/Y |
| Skylake | Y/Y | Y/Y | Y/N | Y/Y |
| Kaby Lake | Y/Y | Y/Y | Y/N | Y/Y |


[https://www.arcsi.fr/doc/platypus.pdf](https://www.arcsi.fr/doc/platypus.pdf) says PSYS is "covering the entire SoC".

http://www.micheledellipaoli.com/documents/EnergyConsumptionAnalysis.pdf says
"PSys: (introduced with Intel Skylake) monitors and controls the thermal and power specifications of the entire SoC and it is useful especially when the source of the power consumption is neither the CPU nor the GPU. For multi-socket server systems, each socket reports its own RAPL values."

https://hal.science/hal-03809858/document says
"PSys. Domain available on some Intel architectures, to monitor and control the thermal
and power specifications of the entire system on the chip (SoC), instead of just CPU or
GPU. It includes the power consumption of the package domain, System Agent, PCH,
eDRAM, and a few more domains on a single-socket SoC"

![RAPL domains](rapl.png)
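
To make the PSYS discussion more concrete, here is a minimal Python sketch of how the top-level RAPL zones could be listed through the Linux powercap sysfs interface, and how a host-level energy value could prefer `psys` when it exists. The function names are hypothetical, and the fallback (summing the `package-*` zones when `psys` is absent) is an assumption made for illustration, not a description of what Scaphandre does internally.

```python
from pathlib import Path

# Standard location of the powercap interface on Linux.
POWERCAP_ROOT = Path("/sys/class/powercap")


def read_rapl_zones(root=POWERCAP_ROOT):
    """Return {zone_name: energy_uj} for every top-level intel-rapl zone."""
    zones = {}
    for zone in sorted(Path(root).glob("intel-rapl:*")):
        # Skip subzones such as intel-rapl:0:0 (core, uncore, dram).
        if zone.name.count(":") != 1:
            continue
        try:
            name = (zone / "name").read_text().strip()
            energy = int((zone / "energy_uj").read_text())
        except (OSError, ValueError):
            continue  # unreadable zone, for example because of permissions
        zones[name] = energy
    return zones


def host_energy_uj(zones):
    """Prefer psys when present, else sum the package zones (assumption)."""
    if "psys" in zones:
        return zones["psys"]
    return sum(v for k, v in zones.items() if k.startswith("package"))


if __name__ == "__main__":
    found = read_rapl_zones()
    print(found)
    print("host energy (uJ):", host_energy_uj(found))
```

Reading `energy_uj` often requires root privileges on recent kernels, so an empty result does not necessarily mean that the domains are missing.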

https://github.com/hubblo-org/scaphandre/issues/116
https://github.com/hubblo-org/scaphandre/issues/241
https://github.com/hubblo-org/scaphandre/issues/140
https://github.com/hubblo-org/scaphandre/issues/289
https://github.com/hubblo-org/scaphandre/issues/117
https://github.com/hubblo-org/scaphandre/issues/25
https://github.com/hubblo-org/scaphandre/issues/316
https://github.com/hubblo-org/scaphandre/issues/318

PSYS MSR is "MSR_PLATFORM_ENERGY_STATUS"
https://copyprogramming.com/howto/perf-power-consumption-measure-how-does-it-work
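
As a rough illustration of how an energy-status MSR such as `MSR_PLATFORM_ENERGY_STATUS` could be turned into a power value, here is a Python sketch. Several things are assumptions to verify against the Intel SDM: the MSR addresses `0x606` and `0x64D`, the 32-bit counter width, and the use of the generic energy unit for PSYS (some platforms may use a different unit or layout for this domain). All function names are hypothetical.

```python
import os
import struct

MSR_RAPL_POWER_UNIT = 0x606         # assumed address, check the Intel SDM
MSR_PLATFORM_ENERGY_STATUS = 0x64D  # assumed address, check the Intel SDM


def read_msr(address, cpu=0):
    """Read a 64-bit MSR through the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, address))[0]
    finally:
        os.close(fd)


def energy_unit_joules(power_unit_raw):
    """Energy Status Units are bits 12:8; one count is 1 / 2**ESU joules."""
    esu = (power_unit_raw >> 8) & 0x1F
    return 1.0 / (1 << esu)


def counter_delta(previous, current, width_bits=32):
    """Difference between two counter samples, allowing one wraparound."""
    return (current - previous) % (1 << width_bits)


def average_power_watts(previous, current, power_unit_raw, seconds):
    """Average power between two raw samples taken `seconds` apart."""
    counts = counter_delta(previous & 0xFFFFFFFF, current & 0xFFFFFFFF)
    return counts * energy_unit_joules(power_unit_raw) / seconds
```

The modulo in `counter_delta` only handles a single wraparound between two samples, so the sampling interval has to stay short enough for that to hold.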

https://pyjoules.readthedocs.io/en/stable/devices/intel_cpu.html

Problems with RAPL on Sapphire Rapids
https://community.intel.com/t5/Software-Tuning-Performance/RAPL-quirks-on-Sapphire-Rapids/td-p/1446761

Misc info on RAPL
https://web.eece.maine.edu/~vweaver/projects/rapl/

The PSYS MSR has a different layout than the PKG and DRAM ones
https://patchwork.kernel.org/project/linux-pm/patch/20211207131734.2607104-1-rui.zhang@intel.com/

https://edc.intel.com/content/www/us/en/design/ipla/software-development-platforms/client/platforms/alder-lake-desktop/12th-generation-intel-core-processors-datasheet-volume-1-of-2/010/power-management/ ==> Intel doc about thermal and power management
https://edc.intel.com/content/www/us/en/design/ipla/software-development-platforms/client/platforms/alder-lake-desktop/12th-generation-intel-core-processors-datasheet-volume-1-of-2/002/platform-power-control/ ==> about psys

https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html ==> intel software developer manual

CVE-2020-8694/CVE-2020-8695 and mitigation by Intel
https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/running-average-power-limit-energy-reporting.html

Patch in the kernel
https://groups.google.com/g/linux.kernel/c/x_7RbqcrxAs
Patch in powercap
https://lkml.iu.edu/hypermail/linux/kernel/1603.2/02415.html
https://lkml.kernel.org/lkml/1460930581-29748-1-git-send-email-srinivas.pandruvada@linux.intel.com/T/

Random
https://stackoverflow.com/questions/55956287/perf-power-consumption-measure-how-does-it-work
Binary file added docs_src/explanations/rapl.png
46 changes: 29 additions & 17 deletions docs_src/references/exporter-json.md
@@ -26,22 +26,34 @@ To get informations about processes that are running in containers, add `--conta

scaphandre --no-header json --containers --max-top-consumers=15 | jq

As always exporter's options can be displayed with `-h`:

$ scaphandre json -h
JSON exporter allows you to output the power consumption data in a json file
Since 1.0.0 you can filter the processes, either by their process name with `--process-regex`, or by the name of the container they run in with `--container-regex` (needs the flag `--containers` to be active as well).

USAGE:
scaphandre json [FLAGS] [OPTIONS]

FLAGS:
--containers Monitor and apply labels for processes running as containers
-h, --help Prints help information
-V, --version Prints version information
As always exporter's options can be displayed with `-h`:

OPTIONS:
-f, --file <file_path> Destination file for the report. [default: ]
-m, --max-top-consumers <max_top_consumers> Maximum number of processes to watch. [default: 10]
-s, --step <step_duration> Set measurement step duration in second. [default: 2]
-n, --step_nano <step_duration_nano> Set measurement step duration in nano second. [default: 0]
-t, --timeout <timeout> Maximum time spent measuring, in seconds.
Write the metrics in the JSON format to a file or to stdout

Usage: scaphandre json [OPTIONS]

Options:
-t, --timeout <TIMEOUT>
Maximum time spent measuring, in seconds. If unspecified, runs forever
-s, --step <SECONDS>
Interval between two measurements, in seconds [default: 2]
--step-nano <NANOSECS>
Additional step duration in _nano_ seconds. This is added to `step` to get the final duration [default: 0]
--max-top-consumers <MAX_TOP_CONSUMERS>
Maximum number of processes to watch [default: 10]
-f, --file <FILE>
Destination file for the report (if absent, print the report to stdout)
--containers
Monitor and apply labels for processes running as containers
--process-regex <PROCESS_REGEX>
Filter processes based on regular expressions (example: 'scaph\\w\\w.e')
--container-regex <CONTAINER_REGEX>
Filter containers based on regular expressions
--resources
Monitor and incude CPU, RAM and Disk usage per process
-h, --help
Print help
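
According to the help text above, `--step-nano` is added to `--step` to get the final interval. A small Python sketch of that arithmetic (the function name is hypothetical, and this only illustrates the documented behavior, not the exporter's actual code):

```python
def measurement_interval_seconds(step=2, step_nano=0):
    """Combine --step (seconds) and --step-nano (nanoseconds) into seconds."""
    return step + step_nano / 1_000_000_000


# Defaults from the help text: step=2, step-nano=0.
print(measurement_interval_seconds())                  # interval with defaults
print(measurement_interval_seconds(1, 500_000_000))    # 1 s + 500 ms
```

Under this reading, `scaphandre json --step 1 --step-nano 500000000` would measure every 1.5 seconds.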

Metrics provided by Scaphandre are documented [here](references/metrics.md).
77 changes: 1 addition & 76 deletions docs_src/references/exporter-prometheus.md
@@ -34,79 +34,4 @@ With default options values, the metrics are exposed on http://localhost:8080/me
Use -q or --qemu option if you are running scaphandre on a hypervisor. In that case a label with the vm name will be added to all `qemu-system*` processes.
This will allow to easily create charts consumption for each vm and defined which one is the top contributor.

## Metrics exposed

All metrics have a HELP section provided on /metrics (or whatever suffix you choosed to expose them).

Here are some key metrics that you will most probably be interested in:

- `scaph_host_power_microwatts`: Power measurement on the whole host, in microwatts (GAUGE)
- `scaph_process_power_consumption_microwatts{exe="$PROCESS_EXE",pid="$PROCESS_PID",cmdline="path/to/exe --and-maybe-options"}`: Power consumption due to the process, measured on at the topology level, in microwatts. PROCESS_EXE being the name of the executable and PROCESS_PID being the pid of the process. (GAUGE)

For more details on that metric labels, see [this section](#scaph_process_power_consumption_microwatts).

And some more deep metrics that you may want if you need to make more complex calculations and data processing:

- `scaph_host_energy_microjoules` : Energy measurement for the whole host, as extracted from the sensor, in microjoules. (COUNTER)
- `scaph_socket_power_microwatts{socket_id="$SOCKET_ID"}`: Power measurement relative to a CPU socket, in microwatts. SOCKET_ID being the socket numerical id (GAUGE)

If you hack scaph or just want to investigate its behavior, you may be interested in some internal metrics:

- `scaph_self_memory_bytes`: Scaphandre memory usage, in bytes

- `scaph_self_memory_virtual_bytes`: Scaphandre virtual memory usage, in bytes

- `scaph_self_topo_stats_nb`: Number of CPUStat traces stored for the host

- `scaph_self_topo_records_nb`: Number of energy consumption Records stored for the host

- `scaph_self_topo_procs_nb`: Number of processes monitored by scaph

- `scaph_self_socket_stats_nb{socket_id="SOCKET_ID"}`: Number of CPUStat traces stored for each socket

- `scaph_self_socket_records_nb{socket_id="SOCKET_ID"}`: Number of energy consumption Records stored for each socket, with SOCKET_ID being the id of the socket measured

- `scaph_self_domain_records_nb{socket_id="SOCKET_ID",rapl_domain_name="RAPL_DOMAIN_NAME"}`: Number of energy consumption Records stored for a Domain, where SOCKET_ID identifies the socket and RAPL_DOMAIN_NAME identifies the rapl domain measured on that socket

### scaph_process_power_consumption_microwatts

Here are available labels for the `scaph_process_power_consumption_microwatts` metric that you may need to extract the data you need:

- `exe`: is the name of the executable that is the origin of that process. This is good to be used when your application is running one or only a few processes.
- `cmdline`: this contains the whole command line with the executable path and its parameters (concatenated). You can filter on this label by using prometheus `=~` operator to match a regular expression pattern. This is very practical in many situations.
- `instance`: this is a prometheus generated label to enable you to filter the metrics by the originating host. This is very useful when you monitor distributed services, so that you can not only sum the metrics for the same service on the different hosts but also see what instance of that service is consuming the most, or notice differences beteween hosts that may not have the same hardware, and so on...
- `pid`: is the process id, which is useful if you want to track a specific process and have your eyes on what's happening on the host, but not so practical to use in a more general use case

### Get container-specific labels on scaph_process_power_consumption_microwatts metrics

The flag --containers enables Scaphandre to collect data about the running Docker containers or Kubernetes pods on the local machine. This way, it adds specific labels to make filtering processes power consumption metrics by their encapsulation in containers easier.

Generic labels help to identify the container runtime and scheduler used (based on the content of `/proc/PID/cgroup`):

`container_scheduler`: possible values are `docker` or `kubernetes`. If this label is not attached to the metric, it means that scaphandre didn't manage to identify the container scheduler based on cgroups data.

Then the label `container_runtime` could be attached. The only possible value for now is `containerd`.

`container_id` is the ID scaphandre got from /proc/PID/cgroup for that container.

For Docker containers (if `container_scheduler` is set), available labels are :

- `container_names`: is a string containing names attached to that container, according to the docker daemon
- `container_docker_version`: version of the docker daemon
- `container_label_maintainer`: content of the maintainer field for this container

For containers coming from a docker-compose file, there are a bunch of labels related to data coming from the docker daemon:

- `container_label_com_docker_compose_project_working_dir`
- `container_label_com_docker_compose_container_number`
- `container_label_com_docker_compose_project_config_files`
- `container_label_com_docker_compose_version`
- `container_label_com_docker_compose_service`
- `container_label_com_docker_compose_oneoff`

For Kubernetes pods (if `container_scheduler` is set), available labels are :

- `kubernetes_node_name`: identifies the name of the kubernetes node scaphandre is running on
- `kubernetes_pod_name`: the name of the pod the container belongs to
- `kubernetes_pod_namespace`: the namespace of the pod the container belongs to
Metrics provided by Scaphandre are documented [here](references/metrics.md).
50 changes: 29 additions & 21 deletions docs_src/references/exporter-riemann.md
@@ -9,28 +9,35 @@ You can launch the Riemann exporter this way (running the default powercap_rapl
scaphandre riemann

As always exporter's options can be displayed with `-h`:

```
scaphandre-riemann
Riemann exporter sends power consumption metrics to a Riemann server

USAGE:
scaphandre riemann [FLAGS] [OPTIONS]

FLAGS:
-h, --help Prints help information
--mtls Connect to a Riemann server using mTLS. Parameters address, ca, cert and key must be defined.
-q, --qemu Instruct that scaphandre is running on an hypervisor
-V, --version Prints version information

OPTIONS:
-a, --address <address> Riemann ipv6 or ipv4 address. If mTLS is used then server fqdn must be
provided [default: localhost]
-d, --dispatch <dispatch_duration> Duration between metrics dispatch [default: 5]
-p, --port <port> Riemann TCP port number [default: 5555]
--ca <cafile> CA certificate file (.pem format)
--cert <certfile> Client certificate file (.pem format)
--key <keyfile> Client RSA key
Expose the metrics to a Riemann server

Usage: scaphandre riemann [OPTIONS]

Options:
-a, --address <ADDRESS>
Address of the Riemann server. If mTLS is used this must be the server's FQDN [default: localhost]
-p, --port <PORT>
TCP port number of the Riemann server [default: 5555]
-d, --dispatch-interval <DISPATCH_INTERVAL>
Duration between each metric dispatch, in seconds [default: 5]
-q, --qemu
Apply labels to metrics of processes looking like a Qemu/KVM virtual machine
--containers
Monitor and apply labels for processes running as containers
--mtls
Connect to Riemann using mTLS instead of plain TCP
--ca <CA_FILE>
CA certificate file (.pem format)
--cert <CERT_FILE>
Client certificate file (.pem format)
--key <KEY_FILE>
Client RSA key file
-h, --help
Print help
```

With default options values, the metrics are sent to http://localhost:5555 every 5 seconds

Use `--mtls` option to connect to a Riemann server using mTLS. In such case, you must provide the following parameters:
@@ -79,7 +86,8 @@ As a reference here is a Riemann configuration:
```

## Metrics exposed
Typically the Riemann exporter is working in the same way as the prometheus exporter regarding metrics. Please look at details in [Prometheus exporter](exporter-prometheus.md) documentations.

Metrics provided by Scaphandre are documented [here](references/metrics.md).

There is one exception: for `process_power_consumption_microwatts`, each process gets its own service name of the form `process_power_consumption_microwatts_pid_exe`.
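
To illustrate the naming scheme, here is a Python sketch that builds and parses such a service name. It assumes that `pid` and `exe` are simply joined with underscores as the pattern suggests. Both function names are hypothetical and are not part of Scaphandre.

```python
PREFIX = "process_power_consumption_microwatts_"


def riemann_service_name(pid, exe):
    """Build the per-process service name from a pid and an executable name."""
    return f"{PREFIX}{pid}_{exe}"


def parse_service_name(service):
    """Return (pid, exe) for a per-process service name, or None otherwise."""
    if not service.startswith(PREFIX):
        return None
    # The pid never contains an underscore, so split on the first one only.
    pid, _, exe = service[len(PREFIX):].partition("_")
    if not pid.isdigit():
        return None
    return int(pid), exe
```

Splitting on the first underscore only keeps executable names that themselves contain underscores intact.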

32 changes: 14 additions & 18 deletions docs_src/references/exporter-stdout.md
@@ -30,26 +30,22 @@ Here is how to display power data for the 'scaphandre' process:

scaphandre stdout -r 'scaphandre'

Note
Metrics provided by Scaphandre are documented [here](references/metrics.md).

As always exporter's options can be displayed with `-h`:

$ scaphandre stdout -h
scaphandre-stdout
Stdout exporter allows you to output the power consumption data in the terminal
Since 1.0.0 the flag `--raw-metrics` displays all metrics available for the host, as a parseable list. This might be useful to list metrics that you would like to fetch afterwards in your monitoring dashboard. Without this flag enabled, Stdout exporter has its own format and might not show you all available metrics.

USAGE:
scaphandre stdout [OPTIONS]
As always exporter's options can be displayed with `-h`:

FLAGS:
-h, --help Prints help information
-V, --version Prints version information
Write the metrics to the terminal

OPTIONS:
-p, --process <process_number> Number of processes to display. [default: 5]
-r, --regex <regex_filter> Filter processes based on regular expressions (e.g: 'scaph\w\wd.e'). This option
disable '-p' or '--process' one.
-s, --step <step_duration> Set measurement step duration in seconds. [default: 2]
-t, --timeout <timeout> Maximum time spent measuring, in seconds. 0 means continuous measurement.
[default: 10]
Usage: scaphandre stdout [OPTIONS]

Options:
-t, --timeout <TIMEOUT> Maximum time spent measuring, in seconds. If negative, runs forever [default: 10]
-s, --step <SECONDS> Interval between two measurements, in seconds [default: 2]
-p, --processes <PROCESSES> Maximum number of processes to display [default: 5]
-r, --regex-filter <REGEX_FILTER> Filter processes based on regular expressions (example: 'scaph\\w\\w.e')
--containers Monitor and apply labels for processes running as containers
-q, --qemu Apply labels to metrics of processes looking like a Qemu/KVM virtual machine
--raw-metrics Display metrics with their names
-h, --help Print help
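
To give an idea of what a regular-expression filter on process names does, here is a Python sketch using the standard `re` module as a stand-in. It assumes an unanchored search, meaning the pattern may match anywhere in the name. Scaphandre itself is written in Rust, so the exact regex syntax and matching rules may differ, and the function name is hypothetical.

```python
import re


def filter_processes(names, pattern):
    """Keep the process names in which `pattern` matches somewhere."""
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]


processes = ["scaphandre", "firefox", "bash", "scaphandre-agent"]
print(filter_processes(processes, r"scaphandre"))
print(filter_processes(processes, r"^fire"))
```

A pattern should be tried against real process names before relying on it, since a single misplaced character class is enough to filter everything out.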