Releases: intel/PerfSpect

v3.10.0

03 Sep 21:50

v3.10.0 is a feature and maintenance release

What's New

  • the 'All Metrics' tab in the metrics command's HTML report now includes definitions for every metric, highlights metrics that exceed a threshold, and provides context and/or a tip when a metric is highlighted.
  • the report command's Gaudi table now includes the Gaudi microarchitecture (Gaudi 1/2/3)
  • The metrics command now produces a system-level HTML summary report when data is collected with granularity set to socket or cpu and when scope is set to cgroup. This is in addition to the HTML summary report already produced at system granularity.

What's Fixed

  • report command's JSON output format now presents an empty data set as an empty list '[]' instead of a record with empty values
  • metrics command fixed on RHEL-9
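
For scripts that consume the report command's JSON output, the empty-list change above can be handled defensively. The following is a minimal sketch, not PerfSpect's own code: it assumes the JSON is an object mapping table names to lists of records, and the table name "Gaudi" and the helper table_rows are hypothetical.

```python
import json


def table_rows(report_json: str, table_name: str) -> list:
    """Return the records for one table, or [] when the table has no data.

    Assumption: the report is a JSON object of {table name: [records]}.
    """
    report = json.loads(report_json)
    rows = report.get(table_name, [])
    # Older output could contain a single record whose values were all empty.
    # Treat that as "no data" so callers see one consistent shape.
    if (
        len(rows) == 1
        and isinstance(rows[0], dict)
        and all(value in ("", None) for value in rows[0].values())
    ):
        return []
    return rows
```

With this helper, a caller can loop over table_rows(...) without special-casing either the old or the new representation of an empty table.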

Full Changelog: v3.9.1...v3.10.0

v3.9.1

20 Aug 19:38
18fa4c9

v3.9.1 is a maintenance release, bug fixes only.

Issues Addressed:
#460 - some telemetry categories not reported if system is configured for 12-hour time format
#463 - perf: Argument list too long
#466 - metrics with --cpus option sometimes errors
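
Issue #460 is an example of a general pitfall: tool output that carries clock times can appear in either 12-hour or 24-hour form depending on system settings. The sketch below illustrates one way a parser might accept both. It is not the fix that was applied in PerfSpect; the function name parse_clock and the exact timestamp layouts are assumptions.

```python
from datetime import datetime, time


def parse_clock(stamp: str) -> time:
    """Parse a clock time written in 12-hour (with AM/PM) or 24-hour form."""
    # Try the 12-hour layout first; a 24-hour string fails it cleanly
    # because the AM/PM marker is missing.
    for fmt in ("%I:%M:%S %p", "%H:%M:%S"):
        try:
            return datetime.strptime(stamp.strip(), fmt).time()
        except ValueError:
            continue
    raise ValueError(f"unrecognized time format: {stamp!r}")
```

Normalizing to a single time type at the parsing boundary keeps the rest of the pipeline independent of the system's time format.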

Full Changelog: v3.9.0...v3.9.1

v3.9.0

15 Aug 00:22
e3811ca

What's Changed

New Features, Changes, and Enhancements

  • added support for collecting metrics on a specific set of cpus with the new --cpus flag
  • the metrics command's summary CSV for multi-unit collections (e.g., cgroup) has been reformatted for easier parsing
  • add flag for instruction mix frequency (--instrmix-frequency) in telemetry command and lower default setting to decrease default overhead
  • support for cri-containerd in metrics command cgroup scope
  • update memory benchmarks for more accuracy on systems with larger L3 cache
  • simplify time format, add system type, and use AMD specific labels for HT (SMT) and Turbo (Boost) in brief report system summary field
  • add version to system, base board, and chassis in report command's host and system-summary tables
  • PerfSpect now includes additional tools used for data collection on remote ARM targets. This extends the data collected by report and enables the telemetry and flame commands. Note: the metrics command is not currently supported on ARM targets.
  • PerfSpect can now also be built to run directly on an ARM target (remote collection no longer required).

Fixes

  • updated the event groups in the metrics command for GNR when the topdown fixed-purpose counter is not available
  • core temperature and frequency are now shown in telemetry when uncore access is not available
  • report L3 size for AMD Turin correctly and report cache sizes in MB and per socket, consistently
  • fix hyperthreading enabled/disabled reporting error in report when more than half of cores are off-lined
  • CXL devices now correctly listed in report
  • accelerator table insights in report corrected
  • address race condition setting uncore frequencies in config command
  • fix check for enough available storage space in report command storage benchmark

Full Changelog: v3.8.0...v3.9.0

v3.8.0

05 Jul 23:13
77769a6

What's Changed

Version 3.8.0 is a feature and maintenance release

New Features, Changes, and Enhancements

  • metrics command now supports Intel Granite Rapids processors on Google Cloud (C4 instances)
  • metrics command's TMA metrics for Granite Rapids updated
  • Network IRQs table format improved to avoid one long line of data by adding separators that will allow wrapping
  • metrics command no longer errors and exits when the PMU is determined to be in use; a warning is generated instead
  • Intel Clearwater Forest now recognized and identified by report and config commands
  • Intel Granite Rapids D now recognized and identified by report and config commands
  • Intel Arrow Lake CPUs now recognized and identified by report command
  • AWS Graviton 4, ARM Neoverse-V2 CPUs now recognized and identified by report command

Fixes

  • power and temperature benchmarks in report command now work on additional architectures by fixing turbostat output parsing
  • NIC table fixed in report command
  • race condition in config command fixed when setting multiple configuration options at the same time
  • memory benchmark in report command fixed to handle the changed output format in newer MLC releases
  • frequency benchmark in report command fixed when number of cores per die differs per die

Full Changelog: v3.7.0...v3.8.0

v3.7.0

06 Jun 22:42
e0dec47

What's Changed

Version 3.7.0 is a feature and maintenance release.

To install, download and extract the pre-built package (perfspect.tgz) from the Assets listed below.

New Features and Enhancements

  • the metrics HTML report now supports comparing two sets of metrics
  • metrics command can optionally expose a Prometheus compatible metrics endpoint using --prometheus-server and --prometheus-server-addr
  • flame command can now target multiple PIDs using --pids
  • flame command can now control the depth of the call stack using --max-depth
  • eliminated the requirement to have Perl installed on the target for the flame command
  • config command can now enable/disable c6 and c1-demotion
  • config command can now configure LLC size on SRF and GNR
  • config command can now enable/disable LLC prefetcher on SRF
  • telemetry command now reports CPU temperature, IPC and C6 residency
  • report command now includes vendor and model ID in the NIC table
  • logs can now be directed to stdout using --log-stdout; useful when combined with the metrics prometheus server feature
  • metrics command "PMU in use" error and exit changed to a warning
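
An endpoint enabled with --prometheus-server serves samples in the Prometheus text exposition format, which a client can read without a full Prometheus installation. The following is a minimal sketch of parsing that text; the metric names in the usage note are hypothetical, and the parser deliberately ignores HELP/TYPE comments and optional timestamps.

```python
import re

# name, optional {labels}, value, optional integer timestamp
_SAMPLE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$'
)


def parse_exposition(text: str) -> dict[str, float]:
    """Map 'name{labels}' to its float value for each sample line."""
    samples: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, labels, value = match.groups()
        samples[name + (labels or "")] = float(value)
    return samples
```

For example, given the (hypothetical) line `perfspect_ipc 1.75`, the parser returns an entry keyed by `perfspect_ipc`. Fetching the text itself, for instance with urllib.request.urlopen against the address passed to --prometheus-server-addr, is left out because the endpoint path is not stated in these notes.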

Fixes

  • address problems found with collecting metrics for cgroups
  • fix memory benchmark chart X-axis label from MB/s to GB/s
  • fix index out of range error in renderXlsxTableMultiTarget
  • fix determination of availability of fixed counters

Full Changelog: v3.6.1...v3.7.0

v3.6.1

29 Apr 00:48
4282e57

Version 3.6.1 fixes a bug found in 3.6.0 when parsing non-padded HEX values for CPU frequencies.
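
Non-padded hex strings are a common source of parsing errors because byte-oriented decoders expect an even number of digits. The sketch below shows one way to tolerate a missing leading zero. It is illustrative only and is not the fix that shipped in 3.6.1; the function name parse_hex_fields and the one-value-per-byte layout are assumptions.

```python
def parse_hex_fields(raw: str) -> list[int]:
    """Decode a hex string into one integer per byte.

    Accepts an optional '0x' prefix and an odd number of digits.
    """
    digits = raw.strip().lower().removeprefix("0x")
    # bytes.fromhex rejects odd-length input, so restore the leading zero
    # that a non-padded value is missing.
    if len(digits) % 2:
        digits = "0" + digits
    return list(bytes.fromhex(digits))
```

Padding on the left preserves the numeric value of the string, so the remaining bytes stay aligned with the fields they represent.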

To install, download and extract the pre-built package (perfspect.tgz) from the Assets listed below.

Full Changelog: v3.6.0...v3.6.1

v3.6.0

28 Apr 17:20
69aadee

Version 3.6.0 is a feature and maintenance release.

To install, download and extract the pre-built package (perfspect.tgz) from the Assets listed below.

New Features & Enhancements

  • The CPU frequency table from the report command now includes frequencies for SSE, AVX2, AVX512, and AMX, when supported by architecture
  • Flamegraphs can now be limited to a specific process (PID)
  • Prefetchers can be enabled/disabled with the config command
  • A brief system configuration summary table has been added to the metrics, flame, lock, and telemetry reports
  • Added preliminary support for the Intel Clearwater Forest CPU architecture
  • The lock command can now retrieve a binary perf package that can be used for analysis off the target
  • Added support for metrics, including per-transaction metrics, on EC2 m7a (AMD Genoa) and AMD Turin

Fixes

  • The config command can now set the max core frequency on SRF and GNR
  • The targets.yaml file no longer requires a value for the target name field

Breaking Changes

  • Some flags for the config command have been renamed for consistency and readability. See perfspect config -h.

Full Changelog: v3.5.0...v3.6.0

v3.5.2

15 Apr 23:22

v3.5.2 is a bug-fix release (Note: v3.5.1 was a bad build/release and has since been deleted)

Two issues were found in 3.5.0 and are now fixed in 3.5.2:

  • perfspect exited with a panic when given an incorrect command line argument
  • perfspect exited with an error after falsely identifying the temp directory as being located on a file system mounted with 'noexec'

Full Changelog: v3.5.0...v3.5.2

v3.5.0

03 Apr 17:00
0494bec

Version 3.5.0 is a feature and maintenance release with the following additions/fixes.

Breaking Change

  • The --targettemp flag has been removed. Use the --tempdir flag to override the directory where collection scripts are executed.

New Features & Enhancements

  • The --txnrate flag used with the metrics command now augments the metrics list with transaction-oriented metrics rather than replacing existing metrics.
  • The --syslog flag redirects log output to the local syslog daemon. This is useful when running PerfSpect for long durations and/or running as a CRON job.
  • Improved shutdown when PerfSpect receives SIGINT (ctrl-c).
  • Added GNR prefetcher settings to report.
  • Added clustering mode (SNC, UMA) for GNR and SRF to report.
  • Added CPU frequency chart to telemetry report.
  • Added table of network-related kernel parameters to report.
  • Added TME (total memory encryption) on/off to report.
  • Added TMA level 1 over time chart to the metrics HTML report.
  • Added configured DIMM speed and DIMM rank to DIMM table.

Fixes

  • Addressed incorrect measured CPU frequency chart on GNR when SNC is disabled.
  • Addressed missing NIC information in report.
  • Addressed error when /tmp is on a file system mounted with 'noexec' (use --tempdir to override).
  • Addressed incorrect memory channels listed for SRF-AP.

Full Changelog: v3.4.0...v3.5.0

v3.4.0

04 Mar 21:48

Version 3.4.0 is a feature and maintenance release with the following additions/fixes.

New Features & Enhancements

  • Gaudi device stats now included in the telemetry command report.
  • Metrics command event data can now be re-processed so that a previously unknown transaction rate (--txnrate) can be applied.
  • The telemetry command now accepts a duration value of zero (--duration 0) to run until interrupted by SIGINT (ctrl-c).
  • The telemetry command HTML report now includes time stamps on the x-axis of charts.
  • The config command now allows setting the compute and I/O die frequencies independently (SRF and GNR)
  • The branch misprediction metric was added to the metrics report.
  • The report command now includes the Speed Select Technology frequency table when it is enabled.
  • Added insight entry to report command to warn when ELC is configured in latency-optimized mode and EPB is non-zero.
  • The report and config commands now determine which EPB configuration value (OS or BIOS) is active and report and/or change the appropriate entry.
  • Report command tables that are not relevant to a given CPU architecture are no longer included in the output.

Fixes

  • L3 per core reported by the report command was inaccurate on some CPU architectures
  • On multi-socket systems where a socket has been disabled via BIOS, the microarchitecture may be reported incorrectly.

Full Changelog: v3.3.1...v3.4.0