Get started

How to set up the environment

Here is the list of tools you absolutely have to install to build labs in this video course:

Other tools are optional, depending on your platform of choice. So far we support native builds on Windows and Linux. Check out the instructions specific to each platform: Windows, Linux (TODO: add instructions).

How to build lab assignments

Watch the warmup video:

Every lab assignment has the following:

  • Video that introduces a particular transformation.
  • Baseline version of a workload that has a particular performance bottleneck in it. You need to find it and fix the source code accordingly.
  • Summary video that explains the solution for the lab.

We encourage you to work on the lab assignment first, without watching the summary video.

Every lab can be built and run using the following commands:

cmake -E make_directory build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --config Release --parallel 8
cmake --build . --target validateLab
cmake --build . --target benchmarkLab

When you push changes to your private branch, it automatically triggers a CI benchmarking job. More details about it are at the bottom of the page.

Profiling

Lab assignments are built on top of the Google Benchmark library, which by default performs a variable number of benchmark iterations. That makes it hard to compare performance profiles of two runs, since they do not do the same amount of work: you can see the same wall time even though the number of iterations is different. To fix the number of iterations, make the following change:

  BENCHMARK(bench1)->Iterations(10);

This instructs the Google Benchmark framework to execute exactly 10 iterations of the benchmark. Both runs now do the same amount of work, so when you improve your code the wall time changes and you can compare the performance profiles directly.
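To see why the fixed iteration count matters, here is a small arithmetic sketch. It does not use Google Benchmark itself; it only models the behavior described above, assuming the framework keeps iterating until a minimum amount of wall time has passed. The function names, the per-iteration timings, and the 0.5 s minimum are illustrative assumptions, not values taken from the labs.

```cpp
#include <cstdint>

// Simplified model (an assumption, not Google Benchmark's real algorithm):
// without an explicit iteration count, the benchmark is repeated until
// about `min_time_ns` of wall time has been spent.
constexpr std::int64_t AdaptiveIterations(std::int64_t ns_per_iter,
                                          std::int64_t min_time_ns) {
  // Round up so that the total run time reaches the minimum.
  return (min_time_ns + ns_per_iter - 1) / ns_per_iter;
}

// Total wall time of a run with the given number of iterations.
constexpr std::int64_t WallTimeNs(std::int64_t ns_per_iter,
                                  std::int64_t iterations) {
  return ns_per_iter * iterations;
}

// Hypothetical timings: the optimized code is 2x faster per iteration.
constexpr std::int64_t kBaselineNsPerIter = 2000000;   // 2 ms
constexpr std::int64_t kOptimizedNsPerIter = 1000000;  // 1 ms
constexpr std::int64_t kMinTimeNs = 500000000;         // 0.5 s (assumed)

// Variable iteration count: 250 vs. 500 iterations, but identical wall time,
// so the two profiles cover different amounts of work.
static_assert(AdaptiveIterations(kBaselineNsPerIter, kMinTimeNs) == 250,
              "baseline runs 250 iterations");
static_assert(AdaptiveIterations(kOptimizedNsPerIter, kMinTimeNs) == 500,
              "optimized runs 500 iterations");
static_assert(WallTimeNs(kBaselineNsPerIter, 250) ==
                  WallTimeNs(kOptimizedNsPerIter, 500),
              "both runs take the same wall time");

// Fixed iteration count, as with ->Iterations(10): same work, and the
// speedup shows up directly in the wall time (20 ms vs. 10 ms).
static_assert(WallTimeNs(kBaselineNsPerIter, 10) == 20000000,
              "baseline takes 20 ms");
static_assert(WallTimeNs(kOptimizedNsPerIter, 10) == 10000000,
              "optimized takes 10 ms");
```

All checks are evaluated at compile time, so compiling the file is enough to confirm the numbers. The real framework's iteration selection is more involved, but the effect on profile comparison is the same.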

Target platforms

You are free to work on whatever platform you have at your disposal. However, we use the following CI machines to run your submissions:

Machine 1

  • Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz, 6MB L3-cache
  • 8 GB RAM
  • Ubuntu 20.04
  • Clang C++ compiler, version 12.0

Machine 2

  • AMD Ryzen 7 3700X 8-Core Processor @ 3.6GHz, 32MB L3-cache
  • 64 GB RAM
  • Windows 11 Version 21H2, build 22000.282
  • Clang C++ compiler, version 12.0

Keep in mind that sometimes you may see different speedups on different platforms.

Submission guidelines:

IMPORTANT: Send a request to be added as a collaborator to this GitHub repo. Otherwise, you won't be able to push your private branch[es]. Send your GitHub handle to [email protected] with the subject "[PerfNinjaAccessRequest]". Do not fork the repo and submit a pull request with your solution: the CI job won't be triggered.

Push your submissions into your own branch[es]. A CI job is triggered every time you push changes to your remote GitHub branch. For now, we use a self-hosted runner, which is configured specifically for benchmarking purposes.

By default, CI detects which lab was modified in the last commit and benchmarks only the affected assignment. If you make changes to more than one lab, the CI job benchmarks all the labs. You can also force benchmarking of all the labs by adding [CheckAll] to the commit message.
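The selection rule above can be restated as a small sketch. This is not the CI's actual implementation: the names `BenchmarkScope`, `HasCheckAll`, and `ScopeFor` are hypothetical, and matching `[CheckAll]` as a plain substring is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative model only; these names do not come from the CI scripts.
enum class BenchmarkScope { ModifiedLabOnly, AllLabs };

// True when the commit message carries the [CheckAll] marker.
// Assumption: the marker is matched as a plain substring.
inline bool HasCheckAll(const std::string& commit_message) {
  return commit_message.find("[CheckAll]") != std::string::npos;
}

// modified_labs: number of distinct labs touched by the last commit.
// Assumed to be at least 1; other cases are not described here.
inline BenchmarkScope ScopeFor(std::size_t modified_labs,
                               const std::string& commit_message) {
  if (HasCheckAll(commit_message) || modified_labs > 1) {
    return BenchmarkScope::AllLabs;
  }
  return BenchmarkScope::ModifiedLabOnly;
}

// Walks through the three cases described in the text.
inline void CheckScopeRules() {
  // One lab changed, no marker: only that lab is benchmarked.
  assert(ScopeFor(1, "Speed up data_packing") ==
         BenchmarkScope::ModifiedLabOnly);
  // More than one lab changed: all labs are benchmarked.
  assert(ScopeFor(2, "Tune two labs") == BenchmarkScope::AllLabs);
  // Marker present: all labs are benchmarked, even for a single change.
  assert(ScopeFor(1, "Speed up data_packing [CheckAll]") ==
         BenchmarkScope::AllLabs);
}
```

Calling `CheckScopeRules()` from a `main` function exercises all three cases.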

If all the labs were benchmarked, a summary is provided at the end, e.g.:

Lab Assignments Summary:
  memory_bound:
    data_packing: Passed
    sequential_accesses: Failed: not fast enough
  core_bound:
    function_inlining: Failed: build error
  misc:
    warmup: Skipped