
Perfspect updates April, 07, 2023 (#25)
ashrafMahgoub authored Apr 7, 2023
1 parent 3d5ab2d commit 1dfc3f9
Showing 13 changed files with 976 additions and 1,722 deletions.
113 changes: 21 additions & 92 deletions README.md
@@ -1,14 +1,27 @@
# PerfSpect · [![Build](/~https://github.com/intel/PerfSpect/actions/workflows/build.yml/badge.svg)](/~https://github.com/intel/PerfSpect/actions/workflows/build.yml)[![License](https://img.shields.io/badge/License-BSD--3-blue)](/~https://github.com/intel/PerfSpect/blob/master/LICENSE)

[Quick Start](#quick-start-requires-perf-installed) | [Requirements](#requirements) | [Build from source](#build-from-source) | [Collection](#collection) | [Post-processing](#post-processing) | [Caveats](#caveats) | [How to contribute](#how-to-contribute)
[Quick Start](#quick-start-requires-perf-installed) | [Requirements](#requirements) | [Build from source](#build-from-source) | [Caveats](#caveats) | [How to contribute](#how-to-contribute)

PerfSpect is a system performance characterization tool based on linux perf targeting Intel microarchitectures.
The tool has two parts
PerfSpect is a system performance characterization tool built on top of Linux perf. It has two parts:

1. perf collection to collect underlying PMU (Performance Monitoring Unit) counters
2. post processing that generates csv output of performance metrics.
perf-collect: Collects hardware events

### Quick start (requires perf installed)
- Collection mode:
- `sudo ./perf-collect` _default system wide_
- `sudo ./perf-collect --socket`
- `sudo ./perf-collect --thread`
- `sudo ./perf-collect --pid <process-id>`
- `sudo ./perf-collect --cid "<container-id1>;<container-id2>"`
- Duration:
- `sudo ./perf-collect` _default run until terminated_
- `sudo ./perf-collect --timeout 10` _run for 10 seconds_
- `sudo ./perf-collect --app "myapp.sh myparameter"` _runs for duration of another process_

perf-postprocess: Calculates high level metrics from hardware events

- `perf-postprocess -r results/perfstat.csv`

## Quick start (requires perf installed)

```
wget -qO- /~https://github.com/intel/PerfSpect/releases/latest/download/perfspect.tgz | tar xvz
@@ -17,7 +30,7 @@
sudo ./perf-collect --timeout 10
sudo ./perf-postprocess -r results/perfstat.csv --html perfstat.html
```

### Deploy in Kubernetes
## Deploy in Kubernetes

Modify the template [deamonset.yml](docs/daemonset.yml) to deploy in kubernetes

@@ -52,97 +65,13 @@ _Note: PerfSpect may work on other Linux distributions, but has not been thoroug

## Build from source

Requires recent python
Requires recent Python. On a successful build, binaries will be created in the `dist` folder

```
pip3 install -r requirements.txt
make
```

On successful build, binaries will be created in `dist` folder

## Collection:

```
(sudo) ./perf-collect (options) -- Some options can be used only with root privileges
usage: perf-collect [-h] [-t TIMEOUT | -a APP]
[-p PID | -c CID | --thread | --socket] [-V] [-i INTERVAL]
[-m MUXINTERVAL] [-o OUTCSV] [-v]
optional arguments:
-h, --help show this help message and exit
-t TIMEOUT, --timeout TIMEOUT
perf event collection time
-a APP, --app APP Application to run with perf-collect, perf collection
ends after workload completion
-p PID, --pid PID perf-collect on selected PID(s)
-c CID, --cid CID perf-collect on selected container ids
--thread Collect for thread metrics
--socket Collect for socket metrics
-V, --version display version info
-i INTERVAL, --interval INTERVAL
interval in seconds for time series dump, default=1
-m MUXINTERVAL, --muxinterval MUXINTERVAL
event mux interval in milliseconds, default=0 i.e.
will use the system default
-o OUTCSV, --outcsv OUTCSV
perf stat output in csv format,
default=results/perfstat.csv
-v, --verbose Display debugging information
```

### Examples

1. sudo ./perf-collect (collect PMU counters using predefined architecture specific event file until collection is terminated)
2. sudo ./perf-collect -a "myapp.sh myparameter" (collect perf for myapp.sh)
3. sudo ./perf-collect --cid "one or more container IDs from docker or kubernetes separated by semicolons"

## Post-processing:

```
./perf-postprocess (options)
usage: perf-postprocess [-h] [--version] [-m METRICFILE] [-o OUTFILE]
[--persocket] [--percore] [-v] [--epoch] [-html HTML]
[-r RAWFILE]
perf-postprocess: perf post process
optional arguments:
-h, --help show this help message and exit
--version, -V display version information
-m METRICFILE, --metricfile METRICFILE
formula file, default metric file for the architecture
-o OUTFILE, --outfile OUTFILE
perf stat outputs in csv format,
default=results/metric_out.csv
--persocket generate per socket metrics
--percore generate per core metrics
-v, --verbose include debugging information, keeps all intermediate
csv files
--epoch time series in epoch format, default is sample count
-html HTML, --html HTML
Static HTML report
required arguments:
-r RAWFILE, --rawfile RAWFILE
Raw CSV output from perf-collect
```

### Examples

./perf-postprocess -r results/perfstat.csv (post processes perfstat.csv and creates metric_out.csv, metric_out.average.csv, metric_out.raw.csv)

./perf-postprocess -r results/perfstat.csv --html perfstat.html (creates a report for TMA analysis and system-level metric charts)

### Notes

1. metric_out.csv: Time series dump of the metrics. The metrics are defined in events/metric.json
2. metric_out.average.csv: Average of metrics over the collection period
3. metric_out.raw.csv: csv file with raw events normalized per second
4. Socket/core level metrics: Additional csv files outputfile.socket.csv/outputfile.core.csv will be generated.
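The output files above are plain CSV, so they are straightforward to consume programmatically. Below is a minimal sketch (my own illustration, not part of PerfSpect) that recomputes per-metric averages from the time-series file, roughly what metric_out.average.csv contains; any non-numeric columns such as timestamps are simply skipped:

```python
import csv
from collections import defaultdict

def average_metrics(path):
    """Average each numeric column of a metric_out.csv-style time series
    over the collection period."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            for name, value in row.items():
                try:
                    sums[name] += float(value)
                    counts[name] += 1
                except (TypeError, ValueError):
                    continue  # skip timestamps / non-numeric cells
    return {name: sums[name] / counts[name] for name in sums}
```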

## Caveats

1. The tool can collect only the counters supported by the underlying Linux perf version.
2 changes: 1 addition & 1 deletion _version.txt
@@ -1 +1 @@
1.2.5
1.2.6
7 changes: 7 additions & 0 deletions events/icx.txt
@@ -151,6 +151,13 @@ upi/event=0x2,umask=0xf,name='UNC_UPI_TxL_FLITS.ALL_DATA'/,
upi/event=0x2,umask=0x97,name='UNC_UPI_TxL_FLITS.NON_DATA'/,
upi/event=0x1,umask=0x0,name='UNC_UPI_CLOCKTICKS'/;

cha/event=0x35,umask=0xc88ffe01,name='UNC_CHA_TOR_INSERTS.IA_MISS_CRD_PREF'/,
cha/event=0x35,umask=0xc80ffe01,name='UNC_CHA_TOR_INSERTS.IA_MISS_CRD'/;

cha/event=0x35,umask=0xc897fe01,name='UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PREF'/,
cha/event=0x35,umask=0xc817fe01,name='UNC_CHA_TOR_INSERTS.IA_MISS_DRD'/,
cha/event=0x35,umask=0xccd7fe01,name='UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFDATA'/;

cha/event=0x35,umask=0xC816FE01,name='UNC_CHA_TOR_INSERTS.IA_MISS_DRD_LOCAL'/,
cha/event=0x35,umask=0xC8177E01,name='UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE'/,
cha/event=0x35,umask=0xC896FE01,name='UNC_CHA_TOR_INSERTS.IA_MISS_DRD_PREF_LOCAL'/,
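The event-file entries above use a simple `pmu/field=value,.../` syntax, with `,` separating events within a group and `;` terminating a group. A small parsing sketch (illustrative only, not PerfSpect's collector code):

```python
import re

def parse_event(line):
    """Parse one perf event definition such as
    "cha/event=0x35,umask=0xc80ffe01,name='UNC_CHA_TOR_INSERTS.IA_MISS_CRD'/,"
    into (pmu, {field: value}). A trailing ',' or ';' separator is ignored."""
    m = re.match(r"(\w+)/(.+?)/[,;]?$", line.strip())
    if not m:
        raise ValueError(f"not an event definition: {line!r}")
    pmu, body = m.groups()
    fields = {}
    for part in body.split(","):
        key, _, value = part.partition("=")
        fields[key] = value.strip("'")  # names are single-quoted
    return pmu, fields
```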
78 changes: 39 additions & 39 deletions events/metric_bdx.json
@@ -173,159 +173,159 @@
"expression": "([UNC_C_TOR_INSERTS.OPCODE.0x1c8] + [UNC_C_TOR_INSERTS.OPCODE.0x180]) * 64 / 1000000"
},
{
"name": "metric_TMAM_Info_cycles_both_threads_active(%)",
"name": "metric_TMA_Info_cycles_both_threads_active(%)",
"expression": "100 * ( (1 - ([CPU_CLK_THREAD_UNHALTED.ONE_THREAD_ACTIVE] / ([CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY] / 2)) ) if [const_thread_count] > 1 else 0)"
},
{
"name": "metric_TMAM_Info_CoreIPC",
"name": "metric_TMA_Info_CoreIPC",
"expression": "[instructions] / ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_Frontend_Bound(%)",
"name": "metric_TMA_Frontend_Bound(%)",
"expression": "100 * [IDQ_UOPS_NOT_DELIVERED.CORE] / (4 * ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count]))"
},
{
"name": "metric_TMAM_..Frontend_Latency(%)",
"name": "metric_TMA_..Frontend_Latency(%)",
"expression": "100 * [IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE] / ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_....ICache_Misses(%)",
"name": "metric_TMA_....ICache_Misses(%)",
"expression": "100 * [ICACHE.IFDATA_STALL] / [cpu-cycles]"
},
{
"name": "metric_TMAM_....ITLB_Misses(%)",
"name": "metric_TMA_....ITLB_Misses(%)",
"expression": "100 * ((14 * [ITLB_MISSES.STLB_HIT]) + [ITLB_MISSES.WALK_DURATION_c1] + (7 * [ITLB_MISSES.WALK_COMPLETED] )) / [cpu-cycles]"
},
{
"name": "metric_TMAM_....Branch_Resteers(%)",
"name": "metric_TMA_....Branch_Resteers(%)",
"expression": "100 * (([RS_EVENTS.EMPTY_CYCLES] - [ICACHE.IFDATA_STALL] - (14 * [ITLB_MISSES.STLB_HIT] + [ITLB_MISSES.WALK_DURATION_c1] + 7 * [ITLB_MISSES.WALK_COMPLETED])) / [RS_EVENTS.EMPTY_END]) * ([BR_MISP_RETIRED.ALL_BRANCHES] + [MACHINE_CLEARS.COUNT] + [BACLEARS.ANY]) / [cpu-cycles]"
},
{
"name": "metric_TMAM_....DSB_Switches(%)",
"name": "metric_TMA_....DSB_Switches(%)",
"expression": "100 * 2 * [DSB2MITE_SWITCHES.PENALTY_CYCLES] / [cpu-cycles]"
},
{
"name": "metric_TMAM_....MS_Switches(%)",
"name": "metric_TMA_....MS_Switches(%)",
"expression": "100 * 2 * [IDQ.MS_SWITCHES] / [cpu-cycles]"
},
{
"name": "metric_TMAM_..Frontend_Bandwidth(%)",
"name": "metric_TMA_..Frontend_Bandwidth(%)",
"expression": "100 * ([IDQ_UOPS_NOT_DELIVERED.CORE] - (4 * [IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE])) / (4 * [CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_Bad_Speculation(%)",
"name": "metric_TMA_Bad_Speculation(%)",
"expression": "100 * ([UOPS_ISSUED.ANY] - [UOPS_RETIRED.RETIRE_SLOTS] + ((4 * [INT_MISC.RECOVERY_CYCLES_ANY]) / [const_thread_count])) / (4 * ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])) "
},
{
"name": "metric_TMAM_..Branch_Mispredicts(%)",
"name": "metric_TMA_..Branch_Mispredicts(%)",
"expression": "([BR_MISP_RETIRED.ALL_BRANCHES] / ([BR_MISP_RETIRED.ALL_BRANCHES] + [MACHINE_CLEARS.COUNT])) * 100 * ([UOPS_ISSUED.ANY] - [UOPS_RETIRED.RETIRE_SLOTS] + (4 * [INT_MISC.RECOVERY_CYCLES_ANY] / [const_thread_count])) / (4 * [CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_..Machine_Clears(%)",
"name": "metric_TMA_..Machine_Clears(%)",
"expression": "([MACHINE_CLEARS.COUNT] / ([BR_MISP_RETIRED.ALL_BRANCHES] + [MACHINE_CLEARS.COUNT])) * 100 * ([UOPS_ISSUED.ANY] - [UOPS_RETIRED.RETIRE_SLOTS] + (4 * [INT_MISC.RECOVERY_CYCLES_ANY] / [const_thread_count])) / (4 * [CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_Backend_bound(%)",
"name": "metric_TMA_Backend_Bound(%)",
"expression": "100 - (100 * ([UOPS_ISSUED.ANY] - [UOPS_RETIRED.RETIRE_SLOTS] + 4 * ([INT_MISC.RECOVERY_CYCLES_ANY] / [const_thread_count]) + [IDQ_UOPS_NOT_DELIVERED.CORE] + [UOPS_RETIRED.RETIRE_SLOTS]) / (4 * [CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])) "
},
{
"name": "metric_TMAM_..Memory_Bound(%)",
"name": "metric_TMA_..Memory_Bound(%)",
"expression": "100 * (1 - (([UOPS_ISSUED.ANY] - [UOPS_RETIRED.RETIRE_SLOTS] + 4 * ([INT_MISC.RECOVERY_CYCLES_ANY] / [const_thread_count]) + [IDQ_UOPS_NOT_DELIVERED.CORE] + [UOPS_RETIRED.RETIRE_SLOTS]) / (4 * [CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count]))) * ([CYCLE_ACTIVITY.STALLS_MEM_ANY] + [RESOURCE_STALLS.SB]) / ([CYCLE_ACTIVITY.STALLS_TOTAL] + [UOPS_EXECUTED.CYCLES_GE_1_UOPS_EXEC] - ( [UOPS_EXECUTED.CYCLES_GE_3_UOPS_EXEC] if ([instructions] / [cpu-cycles]) > 1.8 else [UOPS_EXECUTED.CYCLES_GE_2_UOPS_EXEC]) - ( [RS_EVENTS.EMPTY_CYCLES] if ([IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE] / [CPU_CLK_UNHALTED.THREAD_ANY]) > 0.1 else 0) + [RESOURCE_STALLS.SB])"
},
{
"name": "metric_TMAM_....L1_Bound(%)",
"name": "metric_TMA_....L1_Bound(%)",
"expression": "100 * ([CYCLE_ACTIVITY.STALLS_MEM_ANY] - [CYCLE_ACTIVITY.STALLS_L1D_MISS]) / [cpu-cycles]"
},
{
"name": "metric_TMAM_......DTLB_Load(%)",
"name": "metric_TMA_......DTLB_Load(%)",
"expression": "100 * ([DTLB_LOAD_MISSES.STLB_HIT] * 8 + [DTLB_LOAD_MISSES.WALK_DURATION_c1] + 7 * [DTLB_LOAD_MISSES.WALK_COMPLETED]) / [cpu-cycles]"
},
{
"name": "metric_TMAM_......Store_Fwd_Blk(%)",
"name": "metric_TMA_......Store_Fwd_Blk(%)",
"expression": "100 * (13 * [LD_BLOCKS.STORE_FORWARD]) / [cpu-cycles]"
},
{
"name": "metric_TMAM_....L2_Bound(%)",
"name": "metric_TMA_....L2_Bound(%)",
"expression": "100 * ([CYCLE_ACTIVITY.STALLS_L1D_MISS] - [CYCLE_ACTIVITY.STALLS_L2_MISS]) / [cpu-cycles]"
},
{
"name": "metric_TMAM_....L3_Bound(%)",
"name": "metric_TMA_....L3_Bound(%)",
"expression": "100 * [MEM_LOAD_UOPS_RETIRED.L3_HIT] / ([MEM_LOAD_UOPS_RETIRED.L3_HIT] + 7 * [MEM_LOAD_UOPS_RETIRED.L3_MISS]) * ([CYCLE_ACTIVITY.STALLS_L2_MISS] / [cpu-cycles])"
},
{
"name": "metric_TMAM_......L3_Latency(%)",
"name": "metric_TMA_......L3_Latency(%)",
"expression": "100 * 41 * [MEM_LOAD_UOPS_RETIRED.L3_HIT] * ( 1 + [MEM_LOAD_UOPS_RETIRED.HIT_LFB] / ( [MEM_LOAD_UOPS_RETIRED.L2_HIT] + [MEM_LOAD_UOPS_RETIRED.L3_HIT] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_HITM] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_FWD] ) ) / [cpu-cycles] "
},
{
"name": "metric_TMAM_......Contested_Accesses(%)",
"name": "metric_TMA_......Contested_Accesses(%)",
"expression": "100 * 60 * ([MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS]) * ( 1 + [MEM_LOAD_UOPS_RETIRED.HIT_LFB] / ( [MEM_LOAD_UOPS_RETIRED.L2_HIT] + [MEM_LOAD_UOPS_RETIRED.L3_HIT] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_HITM] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_FWD] ) ) / [cpu-cycles] "
},
{
"name": "metric_TMAM_......Data_Sharing(%)",
"name": "metric_TMA_......Data_Sharing(%)",
"expression": "100 * 43 * [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT] * ( 1 + [MEM_LOAD_UOPS_RETIRED.HIT_LFB] / ( [MEM_LOAD_UOPS_RETIRED.L2_HIT] + [MEM_LOAD_UOPS_RETIRED.L3_HIT] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM] + [MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_HITM] + [MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_FWD] ) ) / [cpu-cycles] "
},
{
"name": "metric_TMAM_......SQ_Full(%)",
"name": "metric_TMA_......SQ_Full(%)",
"expression": "100 * ([OFFCORE_REQUESTS_BUFFER.SQ_FULL] / [const_thread_count]) / ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_....MEM_Bound(%)",
"name": "metric_TMA_....MEM_Bound(%)",
"expression": "100 * (1 - ( [MEM_LOAD_UOPS_RETIRED.L3_HIT] / ([MEM_LOAD_UOPS_RETIRED.L3_HIT] + 7 * [MEM_LOAD_UOPS_RETIRED.L3_MISS])) ) * ([CYCLE_ACTIVITY.STALLS_L2_MISS] / [cpu-cycles])"
},
{
"name": "metric_TMAM_......MEM_Bandwidth(%)",
"name": "metric_TMA_......MEM_Bandwidth(%)",
"expression": "100 * (min([OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD_c4], [cpu-cycles])) / [cpu-cycles]"
},
{
"name": "metric_TMAM_......MEM_Latency(%)",
"name": "metric_TMA_......MEM_Latency(%)",
"expression": "100 * (min([OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD], [cpu-cycles]) - min([OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD_c4], [cpu-cycles])) / [cpu-cycles]"
},
{
"name": "metric_TMAM_....Stores_Bound(%)",
"name": "metric_TMA_....Store_Bound(%)",
"expression": "100 * [RESOURCE_STALLS.SB] / [cpu-cycles]"
},
{
"name": "metric_TMAM_......DTLB_Store(%)",
"name": "metric_TMA_......DTLB_Store(%)",
"expression": "100 * (7 * [DTLB_STORE_MISSES.STLB_HIT] + [DTLB_STORE_MISSES.WALK_DURATION_c1]) / [cpu-cycles]"
},
{
"name": "metric_TMAM_..Core_Bound(%)",
"name": "metric_TMA_..Core_Bound(%)",
"expression": "100 * ( 1 - (( [UOPS_ISSUED.ANY] - [UOPS_RETIRED.RETIRE_SLOTS] + 4 * ([INT_MISC.RECOVERY_CYCLES_ANY] / [const_thread_count]) + [IDQ_UOPS_NOT_DELIVERED.CORE] + [UOPS_RETIRED.RETIRE_SLOTS] ) / ( 4 * [CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count]))) * (1 - (([CYCLE_ACTIVITY.STALLS_MEM_ANY] + [RESOURCE_STALLS.SB]) / ([CYCLE_ACTIVITY.STALLS_TOTAL] + [UOPS_EXECUTED.CYCLES_GE_1_UOPS_EXEC] - ( [UOPS_EXECUTED.CYCLES_GE_3_UOPS_EXEC] if ([instructions] / [cpu-cycles]) > 1.8 else [UOPS_EXECUTED.CYCLES_GE_2_UOPS_EXEC]) - ([RS_EVENTS.EMPTY_CYCLES] if ([IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE] / [CPU_CLK_UNHALTED.THREAD_ANY]) > 0.1 else 0) + [RESOURCE_STALLS.SB])))"
},
{
"name": "metric_TMAM_....Divider(%)",
"name": "metric_TMA_....Divider(%)",
"expression": "100 * [ARITH.FPU_DIV_ACTIVE] / ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_....Ports_Utilization(%)",
"name": "metric_TMA_....Ports_Utilization(%)",
"expression": "100 * (( [CYCLE_ACTIVITY.STALLS_TOTAL] + [UOPS_EXECUTED.CYCLES_GE_1_UOPS_EXEC] - ([UOPS_EXECUTED.CYCLES_GE_3_UOPS_EXEC] if ([instructions] / [cpu-cycles]) > 1.8 else [UOPS_EXECUTED.CYCLES_GE_2_UOPS_EXEC]) - ([RS_EVENTS.EMPTY_CYCLES] if ([IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE] / [CPU_CLK_UNHALTED.THREAD_ANY]) > 0.1 else 0) + [RESOURCE_STALLS.SB]) - [RESOURCE_STALLS.SB] - [CYCLE_ACTIVITY.STALLS_MEM_ANY] ) /[cpu-cycles]"
},
{
"name": "metric_TMAM_......0_Port_Utilized(%)",
"name": "metric_TMA_......0_Port_Utilized(%)",
"expression": "100 * (([UOPS_EXECUTED.CORE_i1_c1] / [const_thread_count]) if ([const_thread_count] > 1) else ([RS_EVENTS.EMPTY_CYCLES] if ([CYCLE_ACTIVITY.STALLS_TOTAL] - ([IDQ_UOPS_NOT_DELIVERED.CYCLES_0_UOPS_DELIV.CORE] / ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])) ) > 0.1 else 0)) / ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count]) "
},
{
"name": "metric_TMAM_......1_Port_Utilized(%)",
"name": "metric_TMA_......1_Port_Utilized(%)",
"expression": "100 * (([UOPS_EXECUTED.CORE_c1] - [UOPS_EXECUTED.CORE_c2]) / [const_thread_count]) / ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_......2_Port_Utilized(%)",
"name": "metric_TMA_......2_Port_Utilized(%)",
"expression": "100 * (([UOPS_EXECUTED.CORE_c2] - [UOPS_EXECUTED.CORE_c3]) / [const_thread_count]) / ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_......3m_Ports_Utilized(%)",
"name": "metric_TMA_......3m_Ports_Utilized(%)",
"expression": "100 * ([UOPS_EXECUTED.CORE_c3] / [const_thread_count]) / ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count])"
},
{
"name": "metric_TMAM_Retiring(%)",
"name": "metric_TMA_Retiring(%)",
"expression": "100 * [UOPS_RETIRED.RETIRE_SLOTS] / (4 * ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count]))"
},
{
"name": "metric_TMAM_..Base(%)",
"name": "metric_TMA_..Base(%)",
"expression": "100 *(([UOPS_RETIRED.RETIRE_SLOTS] / (4 * ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count]))) - (([UOPS_RETIRED.RETIRE_SLOTS] / [UOPS_ISSUED.ANY]) * [IDQ.MS_UOPS] / (4 * ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count]))))"
},
{
"name": "metric_TMAM_..Microcode_Sequencer(%)",
"name": "metric_TMA_..Microcode_Sequencer(%)",
"expression": "100 * (([UOPS_RETIRED.RETIRE_SLOTS] / [UOPS_ISSUED.ANY]) * [IDQ.MS_UOPS] )/ (4 * ([CPU_CLK_UNHALTED.THREAD_ANY] / [const_thread_count]))"
}
]
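Each metric above pairs a name with an arithmetic expression over bracketed event counts. As a hedged illustration of how such expressions could be evaluated (this is a toy, not the shipped post-processor), the sketch below substitutes counts into the expression and walks the AST; it handles only `+ - * /` and unary minus, whereas the real metric files also use `min()` and conditional expressions:

```python
import ast
import operator
import re

# Arithmetic operators permitted during safe evaluation
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate_metric(expression, counts):
    """Replace [EVENT] placeholders with numeric counts, then safely
    evaluate the resulting arithmetic expression."""
    text = re.sub(r"\[([^\]]+)\]",
                  lambda m: repr(float(counts[m.group(1)])),
                  expression)
    return _eval(ast.parse(text, mode="eval").body)

def _eval(node):
    if isinstance(node, ast.Constant):
        return float(node.value)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError(f"unsupported syntax: {ast.dump(node)}")
```

For example, the Retiring formula above reduces to 25% when 200 slots retire out of 800 available.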
2 changes: 1 addition & 1 deletion events/metric_icx.json
@@ -130,7 +130,7 @@
},
{
"name": "metric_LLC code read MPI (demand+prefetch)",
"expression": "([UNC_CHA_TOR_INSERTS.IA_MISS_LLCPREFCODE] + [UNC_CHA_TOR_INSERTS.IA_MISS_CRD] + [UNC_CHA_TOR_INSERTS.IA_MISS_CRD_PREF]) / [instructions]"
"expression": "([UNC_CHA_TOR_INSERTS.IA_MISS_CRD] + [UNC_CHA_TOR_INSERTS.IA_MISS_CRD_PREF]) / [instructions]"
},
{
"name": "metric_LLC data read MPI (demand+prefetch)",