20 changes: 11 additions & 9 deletions Intro_Tutorial/README.md
Original file line number Diff line number Diff line change
@@ -1,15 +1,17 @@
# RAJA Portability Suite Intro Tutorial

Welcome to the RAJA Portability Suite Intro tutorial. In this tutorial you will learn
how to write an simple application that can target different hardware
architectures using the RAJA and Umpire libraries.
Welcome to the RAJA Portability Suite Intro tutorial. In this tutorial, you
will learn how to use RAJA and Umpire to write simple platform portable code
that can be compiled to target different hardware architectures.

## Lessons

You can find lessons in the lessons subdirectory. Each lesson has a README file
which will introduce new concepts and provide instructions to move forward.

Each lesson builds upon the previous one, so if you get stuck, you can look at
the next lesson to see the complete code. Additionally, some tutorials have
solutions folder with a provided solution.
Lessons are in the `lessons` subdirectory. Each lesson has a README file
that introduces new concepts and provides instructions to complete the lesson.
Each lesson builds on the previous ones to allow you to practice using RAJA
and Umpire capabilities and to reinforce the content.

Lessons contain source files with missing code and instructions for you to fill
in the missing parts along with solution files that contain the completed
lesson code. If you get stuck, you can diff the lesson and solution files to see
the code that the lesson is asking you to fill in.
56 changes: 29 additions & 27 deletions Intro_Tutorial/lessons/01_blt_cmake/README.md
@@ -1,45 +1,48 @@
# Lesson 1
# Lesson 1: BLT and CMake

In this lesson you will learn how to use BLT and CMake to build an executable.
In this lesson, you will learn how to use BLT and CMake to configure a
software project to build. RAJA and Umpire use BLT and CMake as their build
systems, and so does this tutorial.

RAJA and Umpire use BLT and CMake as their build systems, and we recommend them
for other applications, like this tutorial! CMake uses information in a set of
CMakeLists.txt files to generate files to build your project. In this case,
we will be using the `make` program to actually compile everything.
[CMake](https://cmake.org/) is a common tool for building C++ projects and is
used throughout the world. It uses information in `CMakeLists.txt` files located in a software project to generate configuration files to build a code project
on a particular system. Then, a utility like `make` can be used to compile
the code.

BLT provides a set of CMake macros that make it easy to write CMake code for HPC
applications targetting multiple hardware architectures.
[BLT](https://github.com/LLNL/blt) provides a foundation of CMake macros and
other tools that simplify the process of Building, Linking, and Testing high
performance computing (HPC) applications. In particular, BLT establishes best
practices for using CMake.

We won't give you a full CMake/BLT tutorial here, just enough to get things moving.
The goal with this lesson is not to give you a full CMake/BLT tutorial. We
want to give you enough information to help get you started configuring and
building the code in this tutorial.

Our top-level CMakeLists.txt file describes the project, sets up some options,
and then calls `add_subdirectory` so that CMake looks for more CMakeLists.txt
files.
Our top-level [CMakeLists.txt file](https://github.com/LLNL/raja-suite-tutorial/blob/main/CMakeLists.txt) describes this project, sets some options,
and then calls `add_subdirectory`, telling CMake to look in sub-directories for
more CMakeLists.txt files.

https://github.com/LLNL/raja-suite-tutorial/blob/main/CMakeLists.txt

In this lesson directory, we have a CMakeLists.txt file that will describe our
In this lesson directory, we have a CMakeLists.txt file that describes our
application. We use the `blt_add_executable` macro to do this.

The macro takes two (or more) arguments, and the two we care about at the moment
are `NAME` where you provide the executable name, and `SOURCES` where you list
all the source code files that make up your application:
The macro takes two (or more) arguments, and the two most important
are `NAME` where you provide the name of the executable to be generated, and
`SOURCES` where you list all the source code files to compile to generate the
executable:

```
blt_add_executable(
NAME 01_blt_cmake
SOURCES 01_blt_cmake.cpp)
```

For now, we have filled these out for you, but in later lessons you will need to
make some edits yourself.

For a full tutorial on BLT, please see: https://llnl-blt.readthedocs.io/en/develop/tutorial/index.html
For more information on BLT, please refer to the [BLT User Guide and Tutorial](https://llnl-blt.readthedocs.io/en/develop/tutorial/index.html).

## Building the Lessons

We have already run CMake for you in this container to generate the make-based
build system. So now you can compile and run the first lesson.
We have already run CMake for you in the container used for this tutorial
to generate the make-based build system. So you are ready to compile and run
the first lesson.

First, open the VSCode terminal (Shift + ^ + `), and then move to the
build directory:
@@ -49,7 +52,7 @@ $ cd build
```

Compiling your project in a different directory than the source code is a best
practice when using CMake. Once you are in the build directory, you can use the
practice when using CMake. Once you are in the build directory, you can use the
`make` command to compile the executable:

```
@@ -65,5 +68,4 @@ Hello, world!
```

In the next lesson, we will show you how to add RAJA and Umpire as dependencies
to the application.

to an application.
24 changes: 11 additions & 13 deletions Intro_Tutorial/lessons/02_raja_umpire/README.md
@@ -1,18 +1,14 @@
# Lesson 2
# Lesson 2: RAJA and Umpire as Build Dependencies

In this lesson, you will learn how to add RAJA and Umpire as dependencies
to your application.

Like the previous lesson, we have a CMakeLists.txt file that will describe our
application using the `blt_add_executable` macro.
RAJA and Umpire are included in this project as **targets** that we tell CMake
our application depends on: [RAJA and Umpire Depend](https://github.com/LLNL/raja-suite-tutorial/blob/main/tpl/CMakeLists.txt).

RAJA and Umpire are included in this project (look at tpl/CMakeLists.txt) and so
they exist as "targets" that we can tell CMake our application depends on.
Additionally, since we have configured this project to use CUDA, BLT provides a
`cuda` target to ensure that executables will be built with CUDA support.

The `blt_add_executable` macro has another argument, `DEPENDS_ON`, that you can
use to list dependencies.
Additionally, we can specify other dependency targets, such as CUDA, in the
`blt_add_executable` macro for our application executable. The macro has
an argument for this, `DEPENDS_ON`, that you can use to list dependencies.

```
blt_add_executable(
@@ -21,9 +17,11 @@ blt_add_executable(
DEPENDS_ON )
```

Once you have added the dependencies, uncomment out the RAJA and Umpire header
includes in the source code. Then, you can build and run the lesson as
before. As a reminder, open the VSCode terminal (Shift + ^ + `), and then
In the `CMakeLists.txt` file in this lesson, you will find a `TODO:` comment
asking you to add the RAJA, umpire, and cuda dependencies to build the lesson
code. After you have added the dependencies, uncomment the RAJA and Umpire
header file includes in the source code. Then, you can build and run the lesson.
As a reminder, open the VSCode terminal (Shift + ^ + `), and then
move to the build directory:

```
@@ -3,8 +3,12 @@
#include "RAJA/RAJA.hpp"
#include "umpire/Umpire.hpp"

// TODO: Uncomment this in order to build!
//#define COMPILE

int main()
{
#if defined(COMPILE)
double* data{nullptr};

// TODO: allocate an array of 100 doubles using the HOST allocator
@@ -18,5 +22,6 @@ int main()

// TODO: deallocate the array

#endif
return 0;
}
39 changes: 19 additions & 20 deletions Intro_Tutorial/lessons/03_umpire_allocator/README.md
@@ -1,47 +1,46 @@
# Lesson 3
# Lesson 3: Umpire Allocators

In this lesson, you will learn how to use Umpire to allocate memory. The file
`03_umpire_allocator.cpp` contains some `TODO:` comments where you can add code to allocate and
deallocate memory.
`03_umpire_allocator.cpp` contains `TODO:` comments where you will add code to
allocate and deallocate memory.

The fundamental concept for accessing memory through Umpire is the
`umpire::Allocator`. An `umpire::Allocator` is a C++ object that can be used to
allocate and deallocate memory, as well as query a pointer to get
information about it. (Note: in this lesson, we will see how to query the name of the Allocator!)
information about it. In this lesson, we will see how to query the name of an Allocator.

All `umpire::Allocator` objects are created and managed by Umpire’s
`umpire::ResourceManager`. To create an allocator, first obtain a handle to the
ResourceManager, and then request the Allocator corresponding to the desired
memory resource using the `getAllocator` function:
All `umpire::Allocator` objects are created and managed by the
`umpire::ResourceManager` *Singleton* object. To create an allocator,
first obtain a handle to the ResourceManager, and then request the Allocator
corresponding to the desired memory resource using the `getAllocator` function:

```
auto& rm = umpire::ResourceManager::getInstance();
auto allocator = rm.getAllocator("HOST");
```

The Allocator class provides methods for allocating and deallocating memory. You
can view these methods in the Umpire source code documentation here:
https://umpire.readthedocs.io/en/develop/doxygen/html/classumpire_1_1Allocator.html
can view these methods in the [Umpire AllocatorInterface](https://umpire.readthedocs.io/en/develop/doxygen/html/classumpire_1_1Allocator.html).

To use an Umpire allocator, use the following code, replacing "size in bytes" with
the desired size for your allocation:
To use an Umpire allocator, use the following code, replacing "size in bytes"
with the desired size for your allocation:

```
void* memory = allocator.allocate(size in bytes);
```
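
Note that `allocate` takes a size in bytes, not an element count. A minimal sketch of that arithmetic for a typed array, using only the standard library, might look like the following. The `bytesFor` helper is a hypothetical name used for illustration and is not part of Umpire.

```cpp
#include <cstddef>

// Hypothetical helper (not an Umpire function): number of bytes needed to
// store `count` elements of type T.
template <typename T>
constexpr std::size_t bytesFor(std::size_t count)
{
  return count * sizeof(T);
}

// Example: an array of 100 doubles needs 100 * sizeof(double) bytes.
static_assert(bytesFor<double>(100) == 100 * sizeof(double),
              "byte count is element count times element size");
```

The `void*` returned by `allocate` would then typically be converted to the element type, for example with `static_cast<double*>`, before it is used as an array.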

Moving and modifying data in a heterogenous memory system can be annoying since you
have to keep track of the source and destination, and often use vendor-specific APIs
to perform the modifications. In Umpire, all data modification and movement, regardless
of memory resource or platform, is done using Operations.
Moving and modifying data in a heterogeneous memory system can be subtle
because you have to keep track of the source and destination memory spaces,
and often use vendor-specific APIs to perform the modifications. In Umpire,
all data modification and movement, regardless of memory resource or platform,
is done using **Umpire Operations**.

Next, we will use the `memset` Operator provided by Umpire's Resource Manager to
set the memory we just allocated to zero.
Next, we will use the `memset` Operation provided by Umpire's Resource Manager
to set the memory we just allocated to zero.

Don't forget to deallocate your memory afterwards!

For more details, you can check out the Umpire documentation:
https://umpire.readthedocs.io/en/develop/sphinx/tutorial/allocators.html
For more details, you can check out the [Umpire Allocator Documentation](https://umpire.readthedocs.io/en/develop/sphinx/tutorial/allocators.html).

Once you have made your changes, you can compile and run the lesson:

@@ -3,8 +3,12 @@
#include "RAJA/RAJA.hpp"
#include "umpire/Umpire.hpp"

// TODO: Uncomment this in order to build!
#define COMPILE

int main()
{
#if defined(COMPILE)
double* data{nullptr};

// TODO: allocate an array of 100 doubles using the HOST allocator
@@ -23,5 +27,6 @@ int main()
// TODO: deallocate the array
allocator.deallocate(data);

#endif
return 0;
}
2 changes: 1 addition & 1 deletion Intro_Tutorial/lessons/04_raja_forall/README.md
@@ -1,4 +1,4 @@
# Lesson Four
# Lesson 4: RAJA Simple Loops

Data parallel kernels are common in many parallel HPC applications. In a data
parallel loop kernel, the processing of data that occurs at each iterate **is
2 changes: 1 addition & 1 deletion Intro_Tutorial/lessons/05_raja_reduce/README.md
@@ -1,4 +1,4 @@
# Lesson 5
# Lesson 5: RAJA Reductions

In lesson 4, we looked at a data parallel loop kernel in which each loop
iterate was independent of the others. In this lesson, we consider a kernel
@@ -1,4 +1,4 @@
# Lesson 6
# Lesson 6: Host-Device Memory and Device Kernels

Now, let's learn about Umpire's different memory resources and, in
particular, those used to allocate memory on a GPU.
8 changes: 7 additions & 1 deletion Intro_Tutorial/lessons/07_raja_algs/README.md
@@ -1,4 +1,4 @@
# Lesson 07
# Lesson 7: RAJA Algorithms

So far, we've looked at RAJA kernel launch methods, where a user passes a kernel
body that defines what an algorithm does at each iterate. RAJA provides
@@ -87,6 +87,9 @@ $ make 07_raja_atomic
$ ./bin/07_raja_atomic
```

Additional information about RAJA atomic operation support can be found in
[RAJA Atomic Operations](https://raja.readthedocs.io/en/develop/sphinx/user_guide/tutorial/atomic_histogram.html).

## Parallel Scan

A **scan operation** is an important building block for parallel algorithms. It
@@ -204,3 +207,6 @@ $ .bin/07_raja_scan

Is the result what you expected it to be? Can you explain why the first value
in the output is what it is?

Additional information about RAJA scan operations can be found in
[RAJA Parallel Scan Operations](https://raja.readthedocs.io/en/develop/sphinx/user_guide/tutorial/scan.html).
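
For comparison outside RAJA, the difference between an inclusive and an exclusive prefix sum can be sketched sequentially with the C++ standard library. The helper names `inclusiveSum` and `exclusiveSum` are hypothetical, and this sketch makes no claim about which RAJA scan variant the lesson code calls.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Inclusive prefix sum: out[i] = in[0] + ... + in[i].
std::vector<int> inclusiveSum(const std::vector<int>& in)
{
  std::vector<int> out(in.size());
  std::partial_sum(in.begin(), in.end(), out.begin());
  return out;
}

// Exclusive prefix sum: out[i] = in[0] + ... + in[i-1], with out[0] = 0.
std::vector<int> exclusiveSum(const std::vector<int>& in)
{
  std::vector<int> out(in.size());
  int running = 0;
  for (std::size_t i = 0; i < in.size(); ++i) {
    out[i] = running;  // sum of all elements strictly before position i
    running += in[i];
  }
  return out;
}
```

For the input `{1, 2, 3, 4}`, `inclusiveSum` yields `{1, 3, 6, 10}` and `exclusiveSum` yields `{0, 1, 3, 6}`.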
@@ -2,6 +2,8 @@

#include "RAJA/RAJA.hpp"
#include "umpire/Umpire.hpp"
// TODO: include the header file for the Umpire QuickPool strategy so you can
// use it in the code below

//Uncomment to compile
//#define COMPILE
45 changes: 29 additions & 16 deletions Intro_Tutorial/lessons/08_raja_umpire_quick_pool/README.md
@@ -1,33 +1,46 @@
# Lesson 8
# Lesson 8: Umpire Memory Pools

In this lesson, you will learn to create a memory pool using Umpire.
In this lesson, you will learn to create and use an Umpire memory pool.

Frequently allocating and deallocating memory can be quite costly, especially when you are making large allocations or allocating on different memory resources.
Memory pools are a more efficient way to allocate large amounts of memory, especially when dealing with HPC environments.
Frequently allocating and deallocating memory can be quite costly, especially
when you are making large allocations or allocating on different memory
resources. Memory pools are a more efficient way to allocate large amounts of
memory, especially in HPC environments.

Additionally, Umpire provides allocation strategies that can be used to customize how data is obtained from the system.
In this lesson, we will learn about one such strategy called `QuickPool`.
Umpire provides **allocation strategies** that can be used to customize how
data is obtained from the system. In this lesson, we will learn about one such
strategy called `QuickPool`.

The `QuickPool` strategy describes a certain type of pooling algorithm provided in the Umpire API.
As its name suggests, `QuickPool` has been shown to be performant for many use cases.
The `QuickPool` strategy describes a certain type of pooling algorithm provided
by Umpire. As its name suggests, `QuickPool` is performant for many use cases.
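
To make the motivation for pooling concrete, here is a toy sketch that uses only the standard library. It is not Umpire's `QuickPool` algorithm, and `ToyPool` is a hypothetical name. It only shows that a pool can satisfy a repeated request by reusing a cached block instead of going back to the system allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Toy fixed-size-block pool for illustration only (NOT Umpire's QuickPool).
// Callers must return every block with deallocate() before the pool is
// destroyed; blocks still outstanding at that point are not freed.
class ToyPool {
public:
  explicit ToyPool(std::size_t block_bytes) : m_block_bytes(block_bytes)
  {
    assert(block_bytes > 0);
  }

  ToyPool(const ToyPool&) = delete;
  ToyPool& operator=(const ToyPool&) = delete;

  ~ToyPool()
  {
    // Release every cached block back to the system.
    for (void* block : m_free_blocks) {
      std::free(block);
    }
  }

  // Hand out a cached block if one is available; otherwise ask the system.
  void* allocate()
  {
    if (!m_free_blocks.empty()) {
      void* block = m_free_blocks.back();
      m_free_blocks.pop_back();
      return block;
    }
    ++m_system_allocations;
    return std::malloc(m_block_bytes);
  }

  // Cache the block for reuse instead of releasing it to the system.
  void deallocate(void* block) { m_free_blocks.push_back(block); }

  // Number of times the pool had to call the system allocator.
  std::size_t systemAllocations() const { return m_system_allocations; }

private:
  std::size_t m_block_bytes;
  std::size_t m_system_allocations{0};
  std::vector<void*> m_free_blocks;
};
```

Allocating, deallocating, and allocating again from a `ToyPool` returns the same block the second time, and `systemAllocations()` stays at 1. A production pool such as `QuickPool` also has to handle variable request sizes, which this sketch deliberately leaves out.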

Umpire also provides other types of pooling strategies such as `DynamicPoolList` and `FixedPool`.
You can visit the documentation to learn more: https://umpire.readthedocs.io/en/develop/index.html
Umpire also provides other types of pooling strategies such as `DynamicPoolList`
and `FixedPool`. More information about Umpire memory pools and other features
is available in the [Umpire User Guide](https://umpire.readthedocs.io/en/develop/index.html).

To create a new memory pool allocator using the `QuickPool` strategy, we can use the `ResourceManager`:
To create a new memory pool allocator using the `QuickPool` strategy, we use
the `ResourceManager`:
```
umpire::Allocator pool = rm.makeAllocator<umpire::strategy::QuickPool>("pool_name", my_allocator);
```

This newly created `pool` is an `umpire::Allocator` using the `QuickPool` strategy. As you can see above, we can use the `ResourceManager::makeAllocator` function to create the pool allocator. We just need to pass
in: (1) the name we would like the pool to have, and (2) the allocator we previously created with the `ResourceManager` (see line 17 in the
file `08_raja_umpire_quick_pool.cpp`). Remember that you will also need to include the `umpire/strategy/QuickPool.hpp` header file.

There are other arguments that could be passed to the pool constructor if needed. These additional option arguments are a bit advanced and are beyond the scope of this tutorial. However, you can visit the documentation page for more: https://umpire.readthedocs.io/en/develop/doxygen/html/index.html
This newly created `pool` is an `umpire::Allocator` that uses the `QuickPool`
allocation strategy. In the code example above, we call the
`ResourceManager::makeAllocator` function to create the pool allocator. We
pass in: (1) the name we choose for the pool, and (2) an allocator we
previously created with the `ResourceManager`. Note that you will need to
include the Umpire header file for the pool type you wish to use, in this case
```
#include "umpire/strategy/QuickPool.hpp"
```

When you have created your QuickPool allocator, uncomment the COMPILE define on line 7;
then compile and run the code:
```
$ make 08_raja_umpire_quick_pool
$ ./bin/08_raja_umpire_quick_pool
```

Other arguments can be passed to the pool constructor if needed. However, they
are beyond the scope of this tutorial. Please visit the [Umpire User Guide](https://umpire.readthedocs.io/en/develop/index.html) to learn more.

@@ -2,6 +2,8 @@

#include "RAJA/RAJA.hpp"
#include "umpire/Umpire.hpp"
// TODO: include the header file for the Umpire QuickPool strategy so you can
// use it in the code below
#include "umpire/strategy/QuickPool.hpp"

int main()
2 changes: 1 addition & 1 deletion Intro_Tutorial/lessons/09_raja_view/09_raja_view.cpp
@@ -7,7 +7,7 @@
// TODO: Uncomment this in order to build!
//#define COMPILE

// Method to print arrays associated with the Views constructed above
// Method to print arrays associated with the Views in the lesson
void printArrayAsMatrix( double * array, int row, int col )
{
for ( int ii = 0; ii < row * col; ++ii )