Task07 Ilya Bolkisev ITMO #339

Closed
wants to merge 1 commit into from

IlyaBolkisev commented Jan 14, 2025

Local output

/home/realist/CLionProjects/GPGPUTasks2024/cmake-build-debug/prefix_sum 1
______________________________________________
n=4096 values in range: [0; 1023]
CPU: 2.4e-05+-0 s
CPU: 170.667 millions/s
OpenCL devices:
  Device #0: CPU. Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz. Intel(R) Corporation. Total memory: 32028 Mb
  Device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
Using device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
GPU [work-efficient]: 0.000163333+-2.86744e-06 s
GPU [work-efficient]: 25.0776 millions/s
______________________________________________
n=16384 values in range: [0; 1023]
CPU: 9.8e-05+-0 s
CPU: 167.184 millions/s
OpenCL devices:
  Device #0: CPU. Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz. Intel(R) Corporation. Total memory: 32028 Mb
  Device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
Using device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
GPU [work-efficient]: 0.000183167+-1.77169e-06 s
GPU [work-efficient]: 89.4486 millions/s
______________________________________________
n=65536 values in range: [0; 1023]
CPU: 0.000393833+-8.97527e-07 s
CPU: 166.405 millions/s
OpenCL devices:
  Device #0: CPU. Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz. Intel(R) Corporation. Total memory: 32028 Mb
  Device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
Using device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
GPU [work-efficient]: 0.000216833+-6.87184e-07 s
GPU [work-efficient]: 302.241 millions/s
______________________________________________
n=262144 values in range: [0; 1023]
CPU: 0.00157533+-9.42809e-07 s
CPU: 166.405 millions/s
OpenCL devices:
  Device #0: CPU. Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz. Intel(R) Corporation. Total memory: 32028 Mb
  Device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
Using device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
GPU [work-efficient]: 0.0002565+-5e-07 s
GPU [work-efficient]: 1022 millions/s
______________________________________________
n=1048576 values in range: [0; 1023]
CPU: 0.006378+-2.76887e-06 s
CPU: 164.405 millions/s
OpenCL devices:
  Device #0: CPU. Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz. Intel(R) Corporation. Total memory: 32028 Mb
  Device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
Using device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
GPU [work-efficient]: 0.000359667+-7.45356e-07 s
GPU [work-efficient]: 2915.41 millions/s
______________________________________________
n=4194304 values in range: [0; 511]
CPU: 0.0254692+-5.01387e-06 s
CPU: 164.682 millions/s
OpenCL devices:
  Device #0: CPU. Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz. Intel(R) Corporation. Total memory: 32028 Mb
  Device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
Using device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
GPU [work-efficient]: 0.00123733+-3.72678e-06 s
GPU [work-efficient]: 3389.79 millions/s
______________________________________________
n=16777216 values in range: [0; 127]
CPU: 0.101914+-2.89175e-05 s
CPU: 164.622 millions/s
OpenCL devices:
  Device #0: CPU. Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz. Intel(R) Corporation. Total memory: 32028 Mb
  Device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
Using device #1: GPU. NVIDIA GeForce RTX 2070 SUPER. Total memory: 7973 Mb
GPU [work-efficient]: 0.005687+-0.00046877 s
GPU [work-efficient]: 2950.1 millions/s

GitHub CI output

Run ./prefix_sum
______________________________________________
n=4096 values in range: [0; 1023]
CPU: 6.83333e-06+-3.72678e-07 s
CPU: 599.415 millions/s
OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
GPU [work-efficient]: 0.000220167+-6.30916e-06 s
GPU [work-efficient]: 18.6041 millions/s
______________________________________________
n=16384 values in range: [0; 1023]
CPU: 2e-05+-2.23607e-06 s
CPU: 819.2 millions/s
OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
GPU [work-efficient]: 0.000299833+-8.13258e-06 s
GPU [work-efficient]: 54.6437 millions/s
______________________________________________
n=65536 values in range: [0; 1023]
CPU: 7.8e-05+-9.09495e-13 s
CPU: 840.205 millions/s
OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
GPU [work-efficient]: 0.000465333+-1.37194e-05 s
GPU [work-efficient]: 140.837 millions/s
______________________________________________
n=262144 values in range: [0; 1023]
CPU: 0.000209167+-1.58368e-05 s
CPU: 1253.28 millions/s
OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
GPU [work-efficient]: 0.000868+-1.64114e-05 s
GPU [work-efficient]: 302.009 millions/s
______________________________________________
n=1048576 values in range: [0; 1023]
CPU: 0.000705+-2.34947e-05 s
CPU: 1487.34 millions/s
OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
GPU [work-efficient]: 0.0020765+-4.11613e-05 s
GPU [work-efficient]: 504.973 millions/s
______________________________________________
n=4194304 values in range: [0; 511]
CPU: 0.00294367+-0.00011394 s
CPU: 1424.86 millions/s
OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
GPU [work-efficient]: 0.007966+-0.00230354 s
GPU [work-efficient]: 526.526 millions/s
______________________________________________
n=16777216 values in range: [0; 127]
CPU: 0.0400212+-0.000254716 s
CPU: 419.209 millions/s
OpenCL devices:
  Device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
Using device #0: CPU. AMD EPYC 7763 64-Core Processor                . Intel(R) Corporation. Total memory: 15991 Mb
GPU [work-efficient]: 0.0571475+-0.00369988 s
GPU [work-efficient]: 293.577 millions/s
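
The `millions/s` lines in both logs appear to be the element count divided by the mean time, expressed in millions (for example, 4096 / 2.4e-05 s gives about 170.667). A minimal C++ sketch of that arithmetic follows; the helper names `millions_per_second` and `speedup` are hypothetical and are not taken from the benchmark's source.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: throughput in millions of elements per second,
// assuming the log's "millions/s" is n / mean_time / 1e6.
double millions_per_second(std::size_t n, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(n) / seconds / 1e6;
}

// Hypothetical helper: ratio of a baseline time to a candidate time.
// Values above 1 mean the candidate is faster than the baseline.
double speedup(double baseline_seconds, double candidate_seconds) {
    assert(candidate_seconds > 0.0);
    return baseline_seconds / candidate_seconds;
}
```

With the n=16777216 timings, `speedup(0.101914, 0.005687)` is roughly 17.9 on the local RTX 2070 SUPER, while `speedup(0.0400212, 0.0571475)` is about 0.7 on CI, where the selected OpenCL device is a CPU.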

sum[i] += sum[i - (rate >> 1)];
}
}
Collaborator review comment: [image]

for (unsigned int rate = n / 2; rate >= 2; rate /= 2)
    prefix_sum_second_part.exec(gpu::WorkSize(64, (n + rate - 1) / rate),
                                gpu, n, rate);

Collaborator review comment: [image]
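
The two quoted hunks (the kernel line `sum[i] += sum[i - (rate >> 1)];` and the host loop over `rate` from `n / 2` down to 2) look like the second pass of an in-place, work-efficient inclusive scan. As a rough CPU illustration only, the sketch below shows one way such a two-pass scan can be written in C++. The function name `work_efficient_scan`, the first (up-sweep) loop, and the index formula `rate + rate / 2 - 1` are assumptions that are not visible in the PR, and the sketch assumes the length is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU sketch of an in-place inclusive prefix sum in two passes.
// Assumption: sum.size() is zero or a power of two.
void work_efficient_scan(std::vector<unsigned int>& sum) {
    const std::size_t n = sum.size();
    assert((n & (n - 1)) == 0);

    // Up-sweep: after the pass with stride `rate`, every element at
    // index k * rate - 1 holds the sum of the `rate` inputs ending there.
    for (std::size_t rate = 2; rate <= n; rate *= 2)
        for (std::size_t i = rate - 1; i < n; i += rate)
            sum[i] += sum[i - rate / 2];

    // Down-sweep: same bounds as the quoted host loop. Each midpoint of a
    // block of size `rate` (except the first block) adds the finished
    // prefix that ends just before its half-block.
    for (std::size_t rate = n / 2; rate >= 2; rate /= 2)
        for (std::size_t i = rate + rate / 2 - 1; i < n; i += rate)
            sum[i] += sum[i - rate / 2];
}
```

On a GPU each inner loop would typically become one kernel launch with one work-item per updated index, which matches the `(n + rate - 1) / rate` work size in the quoted host code.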

@simiyutin (Collaborator) commented:

The task is considered plagiarized and is scored at zero points.

@simiyutin simiyutin closed this Jan 14, 2025

2 participants