Intel AMX (Advanced Matrix Extensions) is a built-in accelerator in recent Intel CPU architectures, first available in Sapphire Rapids processors in 2023. It enables efficient dense matrix multiplication in mixed precision using low-precision data types. Mixed-precision algorithms have grown in popularity, primarily because GPUs use them to speed up machine learning workloads, particularly neural network training. Mixed precision on CPUs offers a cost-effective alternative for applications where peak speed is not critical.

The code samples in this repository show how to use the Intel AMX accelerator through examples in C++ and Python. The examples focus on mixed-precision floating-point operations that use bfloat16 (BF16) to accelerate single-precision code. We take a bottom-up approach, starting from the low-level register instructions (the TMUL operation) and moving up to higher-level libraries such as Intel MKL, PyTorch, and TensorFlow, to give a comprehensive picture of the accelerator's potential. We also report the performance gains to expect when using the accelerator on the Kestrel HPC machine at the National Renewable Energy Laboratory.
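To make the idea of BF16 mixed precision concrete, here is a minimal, standard-library-only Python sketch. It is not code from this repository and does not use AMX; the helper names `to_bf16`, `to_fp32`, and `mixed_dot` are hypothetical. It only emulates the numerical pattern: inputs rounded to bfloat16 (the upper 16 bits of a float32), products accumulated in single precision. Special values (NaN, infinity, overflow) are not handled.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Round a value to bfloat16 and return it as a Python float.

    bfloat16 keeps the sign, the 8 exponent bits, and the top 7 mantissa
    bits of a float32. Rounding is round-to-nearest-even on the dropped
    16 bits. Assumption: x is finite and within float32 range.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFFFFFF
    bits &= 0xFFFF0000  # clear the 16 low-order bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def mixed_dot(a, b):
    """Dot product with BF16 inputs and float32 accumulation."""
    acc = 0.0
    for x, y in zip(a, b):
        # The product of two BF16 values is exactly representable in float32.
        acc = to_fp32(acc + to_bf16(x) * to_bf16(y))
    return acc


if __name__ == "__main__":
    print(to_bf16(3.14159265))                 # 3.140625
    print(mixed_dot([1.0, 2.0], [3.0, 4.0]))   # 11.0
```

The first printed value shows the precision loss from keeping only 8 significant bits, which is why the accumulation is kept in single precision rather than in BF16.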
.
├── LICENSE # License for this project
├── README.md # This file
├── amx-tmul-performance # Example using the low-level C/C++ API provided by Intel
├── mkl-using-amx # Example using Intel MKL with Intel AMX
├── notebooks # Notebooks with comprehensive examples for PyTorch and TensorFlow
│ ├── infer_bf16_pytorch.ipynb # Mixed precision on inference of neural networks
│ ├── matmul_pytorch.ipynb # Mixed precision on matrix multiplications in PyTorch
│ ├── matmul_tensorflow.ipynb # Mixed precision on matrix multiplications in TensorFlow
│ └── train_bf16_pytorch.ipynb # Mixed precision on training of neural networks
└── trmm-in-tlapack # Triangular matrix-matrix multiplication in mixed precision

This project is licensed under the BSD 3-Clause license. See the LICENSE file for details.
NREL Software Record number: SWR-25-81