Wip/rx 7800 xt results #33

lamikr · 2025-01-31T08:30:38Z

RX 7800 XT results from Bazza-63

- AMD GPU benchmarks work on the ROCm platform (tested on ROCm 6.1.1), which offers support for using the CUDA API with AMD GPUs
- the only code change needed is in test.sh, to check whether the rocm-smi tool is available and whether it reports AMD GPUs
- rocm-smi can be used to query the AMD GPU count
- I had pytorch 2.3.0 and pytorch vision 0.18.0 installed with AMD ROCm support enabled, and all tests executed without errors on AMD RX 6800

Signed-off-by: Mika Laitio <[email protected]>

Signed-off-by: Mika Laitio <[email protected]>

(cherry picked from commit 1d0d0aa)

- rename and rearrange results to be more consistently named and organized
- separate multi-GPU results from single-GPU results
- use underscores instead of spaces in names
- each folder will contain only one set of benchmark results
- separate AMD and NVIDIA results into their own subfolders

Signed-off-by: Mika Laitio <[email protected]>

- allow specifying a gpu-index parameter in addition to the gpu-count parameter
- the gpu-index parameter can be used to request that the benchmarks run only on a certain GPU index in the multi-GPU case
- if there is more than one GPU, run the benchmarks separately for each one and then, at the end, run the tests with all GPUs used at the same time
- fixes lamikr/rocm_sdk_builder#63

Signed-off-by: Mika Laitio <[email protected]>
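The run order described above could be sketched roughly as follows. This is only an illustration and not the code in this PR: the function name `plan_benchmark_runs` and its arguments are hypothetical.

```python
def plan_benchmark_runs(gpu_count, gpu_index=None):
    """Return a list of runs; each run is the list of GPU indexes it uses.

    Hypothetical helper: the name and signature are assumptions made for
    illustration, not taken from the benchmark scripts.
    """
    if gpu_count < 1:
        return []
    if gpu_index is not None:
        # A specific GPU was requested: benchmark only that one.
        if not 0 <= gpu_index < gpu_count:
            raise ValueError(f"gpu_index {gpu_index} is out of range 0..{gpu_count - 1}")
        return [[gpu_index]]
    # Otherwise benchmark each GPU separately ...
    runs = [[i] for i in range(gpu_count)]
    # ... and, with more than one GPU, finish with a run that uses all of them.
    if gpu_count > 1:
        runs.append(list(range(gpu_count)))
    return runs
```

With two GPUs and no index given, this plans three runs: GPU 0 alone, GPU 1 alone, then both together.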

Use a pytorch script to get the device count instead of using AMD- or NVIDIA-specific commands, and then execute the benchmark for each GPU found.

Signed-off-by: Mika Laitio <[email protected]>
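A vendor-neutral device query along these lines might look like the sketch below. It relies on `torch.cuda.device_count()` and `torch.cuda.get_device_name()`, which ROCm builds of pytorch also expose under the `torch.cuda` namespace. The helper names `detect_gpu_count` and `list_gpus` are hypothetical, and returning 0 when pytorch is not installed is an assumption of this sketch.

```python
def detect_gpu_count():
    """Return the number of GPUs visible to pytorch, or 0 if none are usable."""
    try:
        import torch
    except ImportError:
        # Assumption for this sketch: treat a missing pytorch as "no GPUs".
        return 0
    if not torch.cuda.is_available():
        return 0
    return torch.cuda.device_count()


def list_gpus():
    """Return (index, name) pairs for every GPU found."""
    count = detect_gpu_count()
    if count == 0:
        return []
    import torch  # safe here: detect_gpu_count() already imported it successfully
    return [(i, torch.cuda.get_device_name(i)) for i in range(count)]


if __name__ == "__main__":
    for index, name in list_gpus():
        print(f"GPU {index}: {name}")
```

Because the same call works for both vendors, the shell script does not need to branch on rocm-smi or an NVIDIA tool to learn how many devices to benchmark.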

Signed-off-by: Mika Laitio <[email protected]>

- define minimal, medium and full sets of models
- use the amount of GPU memory as the criterion for whether to run the minimal, medium or full set of models (an 8GB device can run the medium set without an out-of-memory error)

Signed-off-by: Mika Laitio <[email protected]>
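The memory-based choice could be sketched as below. The only threshold the commit message gives is that an 8GB device can run the medium set; the 16 GiB cut-off for the full set, the half-GiB margin, and the function name `select_model_set` are all assumptions made for illustration. The byte count would typically come from `torch.cuda.get_device_properties(i).total_memory`.

```python
GIB = 1024 ** 3


def select_model_set(total_memory_bytes, medium_min_gib=7.5, full_min_gib=15.5):
    """Pick which benchmark set to run for a GPU with the given memory.

    Hypothetical helper. The thresholds sit half a GiB below the nominal
    8 and 16 GiB sizes (an assumption) because the memory a device reports
    can be slightly less than its nominal size.
    """
    memory_gib = total_memory_bytes / GIB
    if memory_gib >= full_min_gib:
        return "full"
    if memory_gib >= medium_min_gib:
        return "medium"
    return "minimal"
```

For example, `select_model_set(8 * GIB)` returns `"medium"`, matching the 8GB case mentioned above.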

Signed-off-by: Mika Laitio <[email protected]>

- data files to be plotted must still be hardcoded in the code as a list
- can show graphs for comparing results from multiple AMD and NVIDIA GPUs
- a problem when showing the results is that the set of benchmarks run on different GPUs has varied over time. The list of benchmarks in the full set is now much bigger than it was at the NVIDIA 1080/2060 release time.

Signed-off-by: Mika Laitio <[email protected]>

- select in more detail which types of benchmarks are run depending on the amount of memory available (for example, the benchmarks executed for the double type can differ from those for the float type)

Signed-off-by: Mika Laitio <[email protected]>

- add first benchmark results from the Framework 16 laptop, which has both the discrete 7700S GPU and the internal M780M iGPU
- tests were executed with rocm sdk builder v6.1.2

Signed-off-by: Mika Laitio <[email protected]>

Signed-off-by: Mika Laitio <[email protected]>

- fixes the problem where the resnet101 column was placed before the resnet34 column by default

Signed-off-by: Mika Laitio <[email protected]>
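A plain string sort puts resnet101 before resnet34 because "1" sorts before "3". One way to sort by the number that follows the name is sketched below. It illustrates the idea and is not necessarily how the plotting code in this PR does it; `numeric_suffix_key` is a hypothetical name.

```python
import re


def numeric_suffix_key(name):
    """Sort key: text prefix first, then the trailing number as an integer."""
    match = re.match(r"^(.*?)(\d+)$", name)
    if match:
        return (match.group(1), int(match.group(2)))
    # Names without a trailing number sort by name alone.
    return (name, -1)


columns = ["resnet101", "resnet18", "resnet34", "vgg16", "alexnet"]
sorted_columns = sorted(columns, key=numeric_suffix_key)
print(sorted_columns)
# ['alexnet', 'resnet18', 'resnet34', 'resnet101', 'vgg16']
```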

Signed-off-by: Mika Laitio <[email protected]>

- the latest rocm sdk builder has a pytorch optimization that uses the flash attention algorithm provided by aotriton for gfx1100/gfx1101/gfx1102 and gfx1103

Signed-off-by: Mika Laitio <[email protected]>

Signed-off-by: Mika Laitio <[email protected]>

- improvements to the logic that selects which models and precisions are benchmarked based on the amount of memory available on the GPU

Signed-off-by: Mika Laitio <[email protected]>

Signed-off-by: Mika Laitio <[email protected]>

- training --> train
- single --> float for a couple of NVIDIA benchmarks that used different naming than the other results

Signed-off-by: Mika Laitio <[email protected]>

- execute at least 2 models in the small-model case as well, to make it easier to show line graphs

Signed-off-by: Mika Laitio <[email protected]>

Signed-off-by: Mika Laitio <[email protected]>

lamikr and others added 28 commits June 7, 2024 12:39

AMD Radeon RX 6800 gpu benchmark results

bff98d0

Signed-off-by: Mika Laitio <[email protected]>

Add 7900 XTX results

721915a

(cherry picked from commit 1d0d0aa)

benchmarks on devices having both amd and nvidia gpus

4d1813f

Use a pytorch script to get the device count instead of using AMD- or NVIDIA-specific commands, and then execute the benchmark for each GPU found.

Signed-off-by: Mika Laitio <[email protected]>

updated readme.sh

eb53c39

Signed-off-by: Mika Laitio <[email protected]>

do not run double precision benchmarks on small devices

80f382b

Signed-off-by: Mika Laitio <[email protected]>

benchmark selection improvements

fed5f82

- select in more detail which types of benchmarks are run depending on the amount of memory available (for example, the benchmarks executed for the double type can differ from those for the float type)

Signed-off-by: Mika Laitio <[email protected]>

add amd framework 16 gpu benchmarks

fdaf038

- add first benchmark results from the Framework 16 laptop, which has both the discrete 7700S GPU and the internal M780M iGPU
- tests were executed with rocm sdk builder v6.1.2

Signed-off-by: Mika Laitio <[email protected]>

small readme update

75b52f6

Signed-off-by: Mika Laitio <[email protected]>

readme updates

e85f6b6

Signed-off-by: Mika Laitio <[email protected]>

add AMD 680M iGPU gfx1035 results with 6.10.2 kernel

9eec081

Signed-off-by: Mika Laitio <[email protected]>

sort columns by the number that follows the string

cce0e76

- fixes the problem where the resnet101 column was placed before the resnet34 column by default

Signed-off-by: Mika Laitio <[email protected]>

add RX7700S/gfx1102 results with flash attention optimization

87112fe

Signed-off-by: Mika Laitio <[email protected]>

show benchmark results with new pytorch optimization for 7700S

4bab725

- the latest rocm sdk builder has a pytorch optimization that uses the flash attention algorithm provided by aotriton for gfx1100/gfx1101/gfx1102 and gfx1103

Signed-off-by: Mika Laitio <[email protected]>

update the readme

1401301

Signed-off-by: Mika Laitio <[email protected]>

reduce the tests run on minimal setup

ca5c1a4

Signed-off-by: Mika Laitio <[email protected]>

benchmark code cleanups

931900b

- improvements to the logic that selects which models and precisions are benchmarked based on the amount of memory available on the GPU

Signed-off-by: Mika Laitio <[email protected]>

update model selection logic

39dd8c2

Signed-off-by: Mika Laitio <[email protected]>

add AMD Radeon RX 5700 benchmarks

21387fe

Signed-off-by: Mika Laitio <[email protected]>

more benchmark model selection code cleanups

eeb9d2c

Signed-off-by: Mika Laitio <[email protected]>

sync result names

89c9524

- training --> train
- single --> float for a couple of NVIDIA benchmarks that used different naming than the other results

Signed-off-by: Mika Laitio <[email protected]>

small benchmark model selection updates

d877616

- execute at least 2 models in the small-model case as well, to make it easier to show line graphs

Signed-off-by: Mika Laitio <[email protected]>

benchmark printout improvements

bef7e5e

Signed-off-by: Mika Laitio <[email protected]>

add AMD RX7800_XT results from Bazza-63

6a65354

Signed-off-by: Mika Laitio <[email protected]>
