Notice: We plan to share a new implementation of PerfSpect in the coming months. The new implementation of PerfSpect will enhance its functionality by offering "live" metric generation. This means that the current post-processing step will be optional. Additionally, PerfSpect will incorporate features from other Intel tools, streamlining their acquisition and deployment for users. Stay tuned!
PerfSpect is a system performance characterization tool built on top of Linux perf. It contains two parts:
perf-collect: Collects hardware events at a 5 second output interval with practically zero overhead, since PMUs run in counting mode.
- Collection mode:
  - sudo ./perf-collect (default: system-wide)
  - sudo ./perf-collect --socket
  - sudo ./perf-collect --cpu
  - sudo ./perf-collect --pid <process-id>
  - sudo ./perf-collect --cid (by default, selects the 5 containers using the most CPU at the start of perf-collect; to monitor specific containers, provide up to 5 comma-separated cids, e.g. <cid_1>,<cid_2>)
- Duration:
  - sudo ./perf-collect (default: run until terminated)
  - sudo ./perf-collect --timeout 10 (run for 10 seconds)
  - sudo ./perf-collect --app "myapp.sh myparameter" (run for the duration of another process)
perf-postprocess: Calculates high level metrics from hardware events
./perf-postprocess
Quick start (requires perf installed)
wget -qO- https://github.com/intel/PerfSpect/releases/latest/download/perfspect.tgz | tar xvz
cd perfspect
sudo ./perf-collect --timeout 10
./perf-postprocess
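Since the quick start assumes perf is already installed, a small pre-flight check can save a failed run. The following is a minimal sketch; have_cmd is a hypothetical helper name and is not part of PerfSpect.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper (not part of PerfSpect):
# it succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd perf; then
    echo "perf found at $(command -v perf)"
else
    echo "perf not found on PATH; install it before running perf-collect"
fi
```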
Running perf-collect as a non-root user
As seen in the examples above, sudo is the standard approach to running perf-collect with elevated privileges. If sudo is not possible and running as the root user is not possible, then a user may request the following changes be made to the system by an administrator:
- sysctl -w kernel.perf_event_paranoid=0
- sysctl -w kernel.nmi_watchdog=0
- write '125' to all perf_event_mux_interval_ms files found under /sys/devices/*.
for i in $(find /sys/devices -name perf_event_mux_interval_ms); do echo 125 > $i; done
We recommend restoring these settings to their prior values once analysis with PerfSpect is complete.
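One way to make the prior values easy to put back is to record them before changing anything. The sketch below is an illustration, not something PerfSpect ships: snapshot_settings and restore_settings are hypothetical helper names, and the state file name in the usage comments is an assumption. The two sysctl keys above map to /proc/sys/kernel/perf_event_paranoid and /proc/sys/kernel/nmi_watchdog.

```shell
#!/bin/sh
# Hypothetical helpers (not part of PerfSpect) to record and restore settings.

# Print one "path value" line for each readable file given as an argument.
snapshot_settings() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s %s\n' "$f" "$(cat "$f")"
        fi
    done
}

# Read "path value" lines from stdin and write each value back to its path.
# Writing to the real kernel files requires root.
restore_settings() {
    while read -r path value; do
        printf '%s\n' "$value" > "$path"
    done
}

# Example usage (to be run by an administrator; file name is an assumption):
#   snapshot_settings /proc/sys/kernel/perf_event_paranoid \
#       /proc/sys/kernel/nmi_watchdog \
#       $(find /sys/devices -name perf_event_mux_interval_ms) > prior_settings.txt
#   ... apply the changes listed above and run perf-collect ...
#   restore_settings < prior_settings.txt
```

The helpers take file paths as arguments rather than hard-coding them, so they can be tried on ordinary files before being pointed at /proc and /sys.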
Output
perf-collect outputs:
perfstat.csv: raw event counts with system metadata
perf-postprocess outputs:
metric_out.sys.average.csv: average metrics
metric_out.sys.csv: metric values at every 5 second interval
metric_out.html: html view of a few select metrics
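When scripting a collect and postprocess run, it can help to confirm that the post-processing outputs listed above were actually produced. This is a minimal sketch: check_outputs is a hypothetical helper, and the directory to look in is passed as an argument because where the files land is an assumption here (it defaults to the current directory).

```shell
#!/bin/sh
# check_outputs is a hypothetical helper (not part of PerfSpect).
# Usage: check_outputs [directory]   (defaults to the current directory)
# Returns 0 only if every expected perf-postprocess output exists and is non-empty.
check_outputs() {
    dir=${1:-.}
    missing=0
    for f in metric_out.sys.average.csv metric_out.sys.csv metric_out.html; do
        if [ -s "$dir/$f" ]; then
            echo "found: $dir/$f"
        else
            echo "missing or empty: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, ./perf-postprocess && check_outputs . would flag a run that finished without writing all three files.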
Deploy in Kubernetes
Modify the template deamonset.yml to deploy in Kubernetes.
Requirements
perf - PerfSpect uses the Linux perf tool to collect PMU counters
Different events require different minimum kernel versions (PerfSpect will automatically collect only supported events):
- Base (CPU util, CPI, cache misses, etc.): 3.10
- Uncore (NUMA traffic, DRAM traffic, etc.): 4.9
- TMA (micro-architecture boundness breakdown):
  - ICX, SPR: 5.10
  - BDX, SKX, CLX: 3.10
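To see ahead of time which event groups the running kernel is new enough for, the minimum versions above can be compared against uname -r. The sketch below only illustrates that comparison; kernel_at_least and report_event_groups are hypothetical helpers, and PerfSpect's own detection of supported events may work differently.

```shell
#!/bin/sh
# Hypothetical helpers (not part of PerfSpect).

# kernel_at_least VERSION MAJOR MINOR
# Succeeds if VERSION (e.g. "5.15.0-91-generic") is at least MAJOR.MINOR.
kernel_at_least() {
    ver=${1%%[!0-9.]*}   # drop any suffix such as "-91-generic"
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Print, for each event group listed above, whether the given kernel
# version meets its minimum.
report_event_groups() {
    kver=$1
    for spec in "Base:3:10" "Uncore:4:9" "TMA (ICX, SPR):5:10" "TMA (BDX, SKX, CLX):3:10"; do
        name=${spec%%:*}
        req=${spec#*:}
        maj=${req%%:*}
        min=${req#*:}
        if kernel_at_least "$kver" "$maj" "$min"; then
            status="ok"
        else
            status="needs kernel >= $maj.$min"
        fi
        printf '%s: %s\n' "$name" "$status"
    done
}

report_event_groups "$(uname -r)"
```

The check looks only at the major and minor numbers, which matches the granularity of the minimums listed above.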
Build from source
Requires a recent version of Python. On a successful build, binaries are created in the dist folder.
pip3 install -r requirements.txt
make
Note: Most metrics and events come from perfmon and TMA v4.5