Commit

GPA 3.9 updates
PLohrmannAMD committed Jul 28, 2021
1 parent 3642849 commit 6faa062
Showing 43 changed files with 354 additions and 247 deletions.
195 changes: 165 additions & 30 deletions NOTICES.txt

Large diffs are not rendered by default.

29 changes: 13 additions & 16 deletions README.md
@@ -31,19 +31,13 @@ Prebuilt binaries can be downloaded from the Releases page: https://github.com/G
 * Provides access to some raw hardware counters. See [Raw Hardware Counters](#raw-hardware-counters) for more information.

 ## What's New
-* Version 3.8 (04/01/21)
-  * Add support for additional GPUs and APUs, including AMD Radeon™ RX 6700 series GPUs.
-  * Code has been updated to adhere to Google C++ Style Guide.
-  * New public headers have been added.
-  * Old headers are deprecated and will emit compile-time message.
-  * Projects loading GPA will need to be recompiled, but no code changes are required unless moving to the new headers.
-  * Improvements made to sample applications.
-  * Updated documentation for new codestyle (and https://github.com/GPUOpen-Tools/gpu_performance_api/issues/56)
-  * Support for the --internal flag to has been removed from the build script.
+* Version 3.9 (07/27/21)
+  * Add support for additional GPUs and APUs.
+  * Improvements made to the sample applications.

 ## System Requirements
 * An AMD Radeon GPU or APU based on Graphics IP version 8 and newer.
-* Windows: Radeon Software Adrenaline 2020 Edition 20.11.2 or later (Driver Packaging Version 20.45 or later).
+* Windows: Radeon Software Adrenalin 2020 Edition 20.11.2 or later (Driver Packaging Version 20.45 or later).
 * Linux: Radeon Software for Linux Revision 20.45 or later.
 * Radeon GPUs or APUs based on Graphics IP version 6 and 7 are no longer supported by GPUPerfAPI. Please use an older version ([3.3](https://github.com/GPUOpen-Tools/gpu_performance_api/releases/tag/v3.3)) with older hardware.
 * Windows 7, 8.1, and 10.
@@ -90,8 +84,8 @@ This version allows you to access the raw hardware counters by simply specifying
 ### Ubuntu 20.04 LTS Vulkan ICD Issue
 On Ubuntu 20.04 LTS, Vulkan ICD may not be set to use AMD Vulkan ICD. In this case, it needs to be explicitly set to use AMD Vulkan ICD before using the GPA. It can be done by setting the ```VK_ICD_FILENAMES``` environment variable to ```/etc/vulkan/icd.d/amd_icd64.json```.

-### OpenGL Fetchsize Counter on Radeon RX 6000
-FetchSize counter will show an error when enabled on Radeon RX 6000 Series GPU using OpenGL.
+### OpenGL FetchSize Counter on Radeon RX 6000 Series GPUs
+FetchSize counter will show an error when enabled on Radeon RX 6000 Series GPUs using OpenGL.

 ### Adjusting Linux Clock Mode
 Adjusting the GPU clock mode on Linux is accomplished by writing to: ```/sys/class/drm/card\<N\>/device/power_dpm_force_performance_level```, where \<N\> is the index of the card in question.
@@ -103,12 +97,16 @@ By default this file is only modifiable by root, so the application being profil
 * Setting the GPU clock mode is not working correctly for <b>Radeon 5700 Series GPUs</b>, potentially leading to some inconsistencies in counter values from one run to the next.

 ### DirectX11 Performance Counter Accuracy For Select Counters and GPUs
-The following performance counter values may not be accurate for DirectX 11 applications running on a Radeon 5700, and 6000 Series GPU.
+The following performance counter values may not be accurate for DirectX 11 applications running on a Radeon 5700, and 6000 Series GPUs:
 * VALUInstCount, SALUInstCount, VALUBusy, SALUBusy for all shader stages: These values should be representative of performance, but may not be 100% accurate.
 * Most of the ComputeShader counters (all except the MemUnit and WriteUnit counters): These values should be representative of performance, but may not be 100% accurate.

-### OpenGL Performance Counter Accuracy For Radeon 7500
-The following performance counter values may not be accurate for OpenGL applications running on a Radeon 5700 Series GPU.
+### OpenCL Performance Counter Accuracy For Radeon 6000 Series GPUs
+The following performance counter values may not be accurate for OpenCL applications running on Radeon 6000 Series GPUs:
+* Wavefronts, VALUInsts, SALUInsts, SALUBusy, VALUUtilization: These values should be representative of performance, but may not be 100% accurate.
+
+### OpenGL Performance Counter Accuracy For Radeon 5700 Series GPUs
+The following performance counter values may not be accurate for OpenGL applications running on a Radeon 5700 Series GPUs:
 * Most of the ComputeShader counters (all except the MemUnit and WriteUnit counters): These values should be representative of performance, but may not be 100% accurate.

 ### Variability in Deterministic Counters For Select GPUs
@@ -119,7 +117,6 @@ Performance counters which should be deterministic are showing variability on Ra
 Profiling bundles in DirectX12 and Vulkan is not working properly. It is recommended to remove those GPA Samples from your application, or move the calls out of the bundle for profiling.

 ## Style and Format Change
-
 The source code of this product is being reformatted to follow the Google C++ Style Guide https://google.github.io/styleguide/cppguide.html.
 In the interim you may encounter a mix of both an older C++ coding style, as well as the newer Google C++ Style.
 Please refer to the _clang-format file in the root directory of the product for additional style information.
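
Reviewer note: the two Linux notes in the README (the Vulkan ICD override and the clock mode file) could be exercised from a launcher script roughly as follows. This is only a sketch under stated assumptions: the helper names `amd_vulkan_env` and `clock_mode_path` are hypothetical and not part of GPA; the ICD JSON path and the sysfs path are the ones quoted in the README text.

```python
import os

# Path quoted in the README's "Ubuntu 20.04 LTS Vulkan ICD Issue" note.
AMD_ICD_JSON = "/etc/vulkan/icd.d/amd_icd64.json"


def amd_vulkan_env(base_env=None):
    """Return a copy of the environment with VK_ICD_FILENAMES set to the AMD ICD.

    Hypothetical helper: the caller's environment is copied, never modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["VK_ICD_FILENAMES"] = AMD_ICD_JSON
    return env


def clock_mode_path(card_index):
    """Build the sysfs path for card <N> as described under 'Adjusting Linux Clock Mode'."""
    if card_index < 0:
        raise ValueError("card index must be non-negative")
    return "/sys/class/drm/card{}/device/power_dpm_force_performance_level".format(card_index)
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the application to be profiled. Writing to the path returned by `clock_mode_path` still requires the permissions described in the README.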
6 changes: 5 additions & 1 deletion ReleaseNotes.md
@@ -1,6 +1,10 @@
 # GPU Performance API Release Notes
 ---

+## Version 3.9 (07/27/21)
+* Add support for additional GPUs and APUs.
+* Improvements made to the sample applications.
+
 ## Version 3.8 (04/01/21)
 * Add support for additional GPUs and APUs, including AMD Radeon™ RX 6700 series GPUs.
 * Code has been updated to adhere to Google C++ Style Guide.
@@ -234,4 +238,4 @@
 * Supports OpenCL™ on ATI Radeon 4000 and 5000 series.
 * Provides derived counters based on raw HW performance counters.
 * Manages memory automatically – no allocations required.
-* Requires ATI Catalyst driver 10.1 or later.
+* Requires ATI Catalyst driver 10.1 or later.
4 changes: 2 additions & 2 deletions build/cmake_modules/defs.cmake
@@ -1,9 +1,9 @@
-## Copyright (c) 2018-2020 Advanced Micro Devices, Inc. All rights reserved.
+## Copyright (c) 2018-2021 Advanced Micro Devices, Inc. All rights reserved.
 cmake_minimum_required(VERSION 3.5.1)

 ## Define the GPA version
 set(GPA_MAJOR_VERSION 3)
-set(GPA_MINOR_VERSION 8)
+set(GPA_MINOR_VERSION 9)
 set(GPA_UPDATE_VERSION 0)

 if(NOT DEFINED GPA_BUILD_NUMBER)
2 changes: 1 addition & 1 deletion docs/doxygen/DoxyfilePublic
@@ -31,7 +31,7 @@ PROJECT_NAME = "GPU Perf API"
 # This could be handy for archiving the generated documentation or
 # if some version control system is used.

-PROJECT_NUMBER = 2.11
+PROJECT_NUMBER = 3.9

 # The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute)
 # base path where the generated documentation will be put.
4 changes: 2 additions & 2 deletions docs/sphinx/source/conf.py
@@ -61,9 +61,9 @@
 # built documents.
 #
 # The short X.Y version.
-version = u'3.8'
+version = u'3.9'
 # The full version, including alpha/beta/rc tags.
-release = u'3.8'
+release = u'3.9'

 # The language for content autogenerated by Sphinx. Refer to documentation
 # for a list of supported languages.
64 changes: 64 additions & 0 deletions scripts/enable_set_device_clock_android.py
@@ -0,0 +1,64 @@
#!/usr/bin/python3
##=============================================================================
## Copyright (c) 2019-2021 Advanced Micro Devices, Inc. All rights reserved.
## \author AMD Developer Tools Team
## \file
## \brief Script to change the file permission for device clock setting on Android
##=============================================================================
#
import argparse
import os.path
import sys
import subprocess
import platform

parser = argparse.ArgumentParser(description="Changes the file permission for device clock setting on Android")
parser.add_argument("-t", "--target", required=True, help="IP address of Android device")
parser.add_argument("-d", "--device", help="adb device ID, as seen in 'adb devices'. This or -i must be specified when multiple devices are available.")
parser.add_argument("-i", "--infer", action="store", metavar="PORT", nargs="?", const="5555", help="Specify this when using ADB over TCP/IP, to infer the -d argument from the -t one. This option saves you from having to specify the device IP address twice. Port 5555 is used if no port is specified. This or -d must be specified when multiple devices are available.")
args = parser.parse_args()

# Check that we have adb in PATH, that it sees a device and the device is
# in a usable state
proc = subprocess.Popen(["adb", "--help"], stdout=subprocess.DEVNULL)
proc.communicate()
if proc.returncode != 0:
    sys.stderr.write("Error: adb not in PATH\n")
    exit(1)

proc = subprocess.Popen(["adb", "devices"], stdout=subprocess.PIPE)
output = proc.communicate()[0].decode()
if proc.returncode != 0:
    sys.stderr.write("Error: unexpected error querying adb devices\n")
    exit(1)

lines = output.strip().splitlines()
if lines[0] != "List of devices attached":
    sys.stderr.write("Error: unexpected output from 'adb devices'\n")
    exit(1)

if len(lines) > 2 and not (args.device or args.infer):
    sys.stderr.write("Error: Multiple Android devices available; must specify one with -d or -i\n")
    for line in lines[1:]:
        sys.stderr.write(" {}\n".format(line))
    exit(1)

device = args.device
if device is None and args.infer:
    device = args.target + ":" + args.infer

if device is None:
    proc = subprocess.Popen(["adb", "get-serialno"], stdout=subprocess.PIPE)
    device = proc.communicate()[0].decode().strip()

proc = subprocess.Popen(["adb", "-s", device, "get-state"], stdout=subprocess.PIPE)
output = proc.communicate()[0].decode().strip()
if proc.returncode != 0:
    # No need for explicit message; adb already reported device not found
    exit(1)
if output != "device":
    sys.stderr.write("Error: adb device is not in a usable state\n")
    exit(1)

# sysfs permission needs to be changed:
subprocess.call(["adb", "-s", device, "shell", "su", "root", "chmod", "766", "/sys/class/drm/card0/device/power_dpm_force_performance_level"])
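
Reviewer note: the device-selection rules in the new script (explicit `-d`, inferred `-i`, or fall back to `adb get-serialno`) can be checked without a connected device if they are pulled into a pure function. The sketch below is an illustration only; `select_device` is a hypothetical name that does not exist in the commit, and it raises `ValueError` where the script writes to stderr and exits.

```python
def select_device(adb_devices_output, target, device=None, infer_port=None):
    """Mirror the script's device-selection rules on captured 'adb devices' output.

    Returns the adb device ID to use, or None when the caller should fall
    back to 'adb get-serialno' (the single-device case in the script).
    """
    lines = adb_devices_output.strip().splitlines()
    if not lines or lines[0] != "List of devices attached":
        raise ValueError("unexpected output from 'adb devices'")
    # One header line plus one device line is unambiguous; more needs -d or -i.
    if len(lines) > 2 and not (device or infer_port):
        raise ValueError("multiple Android devices available; specify one")
    if device is not None:
        return device
    if infer_port:
        # ADB over TCP/IP: the device ID is "<ip>:<port>".
        return target + ":" + infer_port
    return None
```

For example, with the script's default port, `select_device(output, "192.168.0.5", infer_port="5555")` yields `"192.168.0.5:5555"`, which matches how the script builds the ID from `-t` and `-i`.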
97 changes: 0 additions & 97 deletions scripts/enable_set_device_clock_android.sh

This file was deleted.

3 changes: 2 additions & 1 deletion scripts/gpa_packaging.py
@@ -195,6 +195,7 @@ def CreatePackage(self, archive_output_dir, build_artifacts_dir, sphinx_docs_dir
 if android == True:
     _android_device_connect_script_file = os.path.normpath(os.path.join(self._gpa_root_dir, "scripts", self._android_device_connect_script))
     _android_device_connect_script_file_in_archive = os.path.normpath(os.path.join(gpa_archive_root_name, self._android_device_connect_script))
+
     GpaUtils.WriteFileToArchive(gpa_archive_handle, _android_device_connect_script_file,
                                 _android_device_connect_script_file_in_archive)

@@ -257,7 +258,7 @@ def __init__(self):
 "LICENSE"]

 _version_file="source/gpu_perf_api_common/gpa_version.h"
-_android_device_connect_script = "enable_set_device_clock_android.sh"
+_android_device_connect_script = "enable_set_device_clock_android.py"
 _major_version=0
 _minor_version=0
 _update_version=0
@@ -5,7 +5,7 @@ set(CMAKE_INCLUDE_CURRENT_DIR ON)
 set(HW_COUNTER_HEADERS
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx10.h
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx103.h
-${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx103_gfx1031.h
+${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx103_gfx1031_gfx1032.h
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx8.h
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx8_baffin.h
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx8_carrizo.h
@@ -21,7 +21,7 @@ ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx9_placeholder4.h)
 set(HW_COUNTER_SRC
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx10.cc
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx103.cc
-${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx103_gfx1031.cc
+${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx103_gfx1031_gfx1032.cc
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx8.cc
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx8_baffin.cc
 ${CMAKE_CURRENT_LIST_DIR}/gpa_hw_counter_gfx8_carrizo.cc
@@ -2,7 +2,7 @@
 // Copyright (c) 2010-2021 Advanced Micro Devices, Inc. All rights reserved.
 /// @author AMD Developer Tools Team
 /// @file
-/// @brief Hardware counter info for GFX103_GFX1031.
+/// @brief Hardware counter info for GFX103_GFX1031_GFX1032.
 //==============================================================================

 // This file is autogenerated by the ConvertHWEnums project.
@@ -12,11 +12,11 @@
 #include <set>

 #include "gpu_perf_api_counter_generator/gpa_counter.h"
-#include "auto_generated/gpu_perf_api_counter_generator/gpa_hw_counter_gfx103_gfx1031.h"
+#include "auto_generated/gpu_perf_api_counter_generator/gpa_hw_counter_gfx103_gfx1031_gfx1032.h"
 #include "gpu_performance_api/gpu_perf_api_types.h"
-namespace counter_gfx103_gfx1031
+namespace counter_gfx103_gfx1031_gfx1032
 {
-GpaHardwareCounterDesc kGcrCountersGfx103_gfx1031[] = {
+GpaHardwareCounterDesc kGcrCountersGfx103_gfx1031_gfx1032[] = {
 {0, GPA_HIDE_NAME("GCR_000"), GPA_HIDE_NAME("GCR"), GPA_HIDE_NAME("Counter 0 from group GCR"), kGpaDataTypeUint64, 0, GPA_UINT64_MAX},
 {1, GPA_HIDE_NAME("GCR_001"), GPA_HIDE_NAME("GCR"), GPA_HIDE_NAME("Counter 1 from group GCR"), kGpaDataTypeUint64, 0, GPA_UINT64_MAX},
 {2, GPA_HIDE_NAME("GCR_002"), GPA_HIDE_NAME("GCR"), GPA_HIDE_NAME("Counter 2 from group GCR"), kGpaDataTypeUint64, 0, GPA_UINT64_MAX},
@@ -128,6 +128,6 @@ namespace counter_gfx103_gfx1031
 {108, GPA_HIDE_NAME("GCR_108"), GPA_HIDE_NAME("GCR"), GPA_HIDE_NAME("Counter 108 from group GCR"), kGpaDataTypeUint64, 0, GPA_UINT64_MAX},
 {109, GPA_HIDE_NAME("GCR_109"), GPA_HIDE_NAME("GCR"), GPA_HIDE_NAME("Counter 109 from group GCR"), kGpaDataTypeUint64, 0, GPA_UINT64_MAX},
 };
-} // counter_gfx103_gfx1031
+} // counter_gfx103_gfx1031_gfx1032

 // clang-format on