
Production update #2005


Merged: 50 commits, Jun 2, 2025
Commits (50)
ff41b77
automatic update of stats files
May 19, 2025
c332db8
Merge branch 'ArmDeveloperEcosystem:main' into main
NinaARM May 20, 2025
1312e6d
Merge branch 'ArmDeveloperEcosystem:main' into main
NinaARM May 22, 2025
f21ed16
add macOS build instructions for example application
NinaARM May 22, 2025
738f61c
update for macOS output file
NinaARM May 22, 2025
d2fb543
automatic update of stats files
May 26, 2025
51249f6
fix typo
NinaARM May 26, 2025
3fc7809
updates for prerequisites and converting model pages
NinaARM May 26, 2025
9ecf92f
add spiece model to models directory instead of workspace
NinaARM May 26, 2025
b9ba671
updates for liteRT build
NinaARM May 27, 2025
863deb9
add contributors
NinaARM May 27, 2025
e8e16bf
more updates
NinaARM May 27, 2025
96a35f2
Merge branch 'ArmDeveloperEcosystem:main' into main
madeline-underwood May 27, 2025
12a963f
Merge branch 'main' into documentation-updates
pareenaverma May 27, 2025
ce131c7
Merge pull request #1987 from NinaARM/documentation-updates
pareenaverma May 27, 2025
1e33f3d
Merge branch 'ArmDeveloperEcosystem:main' into main
madeline-underwood May 28, 2025
5bb7b87
updates
madeline-underwood May 28, 2025
3eb8449
updates
madeline-underwood May 28, 2025
516a33d
Updates
madeline-underwood May 28, 2025
54dc9f8
Updates
madeline-underwood May 28, 2025
6c97b4c
Add learning path for vertex efficiency
andrewkilroyarm May 3, 2025
53b2745
Updates
madeline-underwood May 29, 2025
3007b1b
Merge pull request #1993 from andrewkilroyarm/main
pareenaverma May 29, 2025
4696bb2
updates
madeline-underwood May 29, 2025
cc78c8e
Update _index.md
pareenaverma May 29, 2025
432e265
Merge pull request #1995 from pareenaverma/content_review
pareenaverma May 29, 2025
9b82829
Merge branch 'ArmDeveloperEcosystem:main' into page_size
madeline-underwood May 29, 2025
d7c7d8a
Updates
madeline-underwood May 29, 2025
12c0001
Update index.md
madeline-underwood May 30, 2025
a0a9b5a
updates
madeline-underwood May 30, 2025
107d3a7
updates
madeline-underwood May 30, 2025
b51bac0
Merge pull request #1994 from madeline-underwood/cache
jasonrandrews May 30, 2025
2e125f6
VME LP review
pareenaverma May 30, 2025
d7e1957
Merge branch 'content_review' of https://github.com/pareenaverma/arm-…
pareenaverma May 30, 2025
5df4249
Merge pull request #1997 from pareenaverma/content_review
pareenaverma May 30, 2025
33b766f
Docker Model Runner Learning Path
jasonrandrews May 30, 2025
8757036
Merge pull request #1998 from jasonrandrews/review2
jasonrandrews May 30, 2025
5eed18e
Merge branch 'ArmDeveloperEcosystem:main' into page_size
madeline-underwood Jun 1, 2025
187e9dc
automatic update of stats files
Jun 2, 2025
6bbfe34
Merge branch 'ArmDeveloperEcosystem:main' into page_size
madeline-underwood Jun 2, 2025
d08cb81
Fixed index and overview
madeline-underwood Jun 2, 2025
c038564
Add Hugging Face tag to Learning Paths at https://huggingface.co/Arm
jasonrandrews Jun 2, 2025
35f642c
Merge pull request #2000 from jasonrandrews/review
jasonrandrews Jun 2, 2025
53548ab
Content review
madeline-underwood Jun 2, 2025
23f1b88
Merge pull request #2001 from madeline-underwood/page_size
jasonrandrews Jun 2, 2025
37416f5
spelling updates
jasonrandrews Jun 2, 2025
926a163
spelling updates
jasonrandrews Jun 2, 2025
9c03b8c
Merge pull request #2002 from jasonrandrews/spelling
jasonrandrews Jun 2, 2025
3646ade
terminology fixes
madeline-underwood Jun 2, 2025
1f5bb56
Merge pull request #2004 from madeline-underwood/perf-naming-changes
jasonrandrews Jun 2, 2025
18 changes: 17 additions & 1 deletion .wordlist.txt
@@ -4184,4 +4184,20 @@ subgenre
submodule
subword
techcrunch
transformative
transformative
Aude
Gian
Iodice
SmolLM
VME
Vuilliomenet
cpus
fLO
invalidations
libtensorflowlite
macos
multithreaded
Wix's
ngrok's
qs
qu
6 changes: 5 additions & 1 deletion assets/contributors.csv
@@ -85,5 +85,9 @@ Yiyang Fan,Arm,,,,
Julien Jayat,Arm,,,,
Geremy Cohen,Arm,geremyCohen,geremyinanutshell,,
Barbara Corriero,Arm,,,,
Nina Drozd,Arm,,ninadrozd,,
Nina Drozd,Arm,NinaARM,ninadrozd,,
Jun He,Arm,JunHe77,jun-he-91969822,,
Gian Marco Iodice,Arm,,,,
Aude Vuilliomenet,Arm,,,,
Andrew Kilroy,Arm,,,,
Peter Harris,Arm,,,,
@@ -17,7 +17,7 @@ The Model Context Protocol (MCP) is an open specification designed to connect La

- **Security by design:** MCP encourages running servers inside your own infrastructure, so sensitive data stays within your infrastructure unless explicitly shared.

- **Cross-ecosystem momentum:** recent roll-outs from an official C# SDK to Wixs production MCP server and Microsoft’s Azure support show the MCP spec is gathering real-world traction.
- **Cross-ecosystem momentum:** recent roll-outs from an official C# SDK to Wix's production MCP server and Microsoft’s Azure support show the MCP spec is gathering real-world traction.

## What is uv?

@@ -115,7 +115,7 @@ INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

You will now use ngrok to expose your locally running MCP server to the public internet over HTTPS.

1. Add ngroks repo to the apt package manager and install:
1. Add ngrok's repo to the apt package manager and install:
```bash
curl -sSL https://ngrok-agent.s3.amazonaws.com/ngrok.asc \
| sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null \
```
@@ -61,6 +61,7 @@ tools_software_languages_filter:
- GitHub: 3
- GitLab: 1
- Himax SDK: 1
- Hugging Face: 3
- IP Explorer: 4
- Jupyter Notebook: 1
- K3s: 1
@@ -28,7 +28,7 @@ tools_software_languages:
- GenAI
- Raspberry Pi
- Python

- Hugging Face

further_reading:
- resource:
@@ -31,6 +31,7 @@ tools_software_languages:
- LLM
- GenAI
- Raspberry Pi
- Hugging Face



@@ -32,6 +32,8 @@ armips:
tools_software_languages:
- Himax SDK
- Python
- Hugging Face

operatingsystems:
- Linux
- macOS
11 changes: 6 additions & 5 deletions content/learning-paths/laptops-and-desktops/_index.md
@@ -10,11 +10,11 @@ operatingsystems_filter:
- Android: 2
- ChromeOS: 1
- Linux: 31
- macOS: 7
- Windows: 43
- macOS: 8
- Windows: 44
subjects_filter:
- CI-CD: 5
- Containers and Virtualization: 5
- Containers and Virtualization: 6
- Migration to Arm: 28
- ML: 2
- Performance and Architecture: 25
@@ -39,7 +39,7 @@
- Coding: 16
- CSS: 1
- Daytona: 1
- Docker: 4
- Docker: 5
- GCC: 10
- Git: 1
- GitHub: 3
@@ -52,6 +52,7 @@
- JavaScript: 2
- Kubernetes: 1
- Linux: 1
- LLM: 1
- LLVM: 1
- llvm-mca: 1
- MSBuild: 1
@@ -62,7 +63,7 @@
- ONNX Runtime: 1
- OpenCV: 1
- perf: 4
- Python: 5
- Python: 6
- Qt: 2
- Remote.It: 1
- RME: 1
@@ -0,0 +1,53 @@
---
title: Learn how to use Docker Model Runner in AI applications

draft: true
cascade:
draft: true

minutes_to_complete: 45

who_is_this_for: This is for software developers and AI enthusiasts who want to run AI models using Docker Model Runner.

learning_objectives:
- Run AI models locally using Docker Model Runner.
- Easily build containerized applications with LLMs.

prerequisites:
- A computer with at least 16GB of RAM (recommended) and Docker Desktop installed (version 4.40 or later).
- Basic understanding of Docker.
- Familiarity with Large Language Model (LLM) concepts.

author: Jason Andrews

### Tags
skilllevels: Introductory
subjects: Containers and Virtualization
armips:
- Neoverse
- Cortex-A
operatingsystems:
- Windows
- macOS
tools_software_languages:
- Docker
- Python
- LLM

further_reading:
- resource:
title: Docker Model Runner Documentation
link: https://docs.docker.com/model-runner/
type: documentation
- resource:
title: Introducing Docker Model Runner
link: https://www.docker.com/blog/introducing-docker-model-runner/
type: blog

### FIXED, DO NOT MODIFY
# ================================================================================
weight: 1 # _index.md always has weight of 1 to order correctly
layout: "learningpathall" # All files under learning paths have this same wrapper
learning_path_main_page: "yes" # This should be surfaced when looking for related content. Only set for _index.md of learning path content.
---

@@ -0,0 +1,8 @@
---
# ================================================================================
# FIXED, DO NOT MODIFY THIS FILE
# ================================================================================
weight: 21 # Set to always be larger than the content in this path to be at the end of the navigation.
title: "Next Steps" # Always the same, html page title.
layout: "learningpathall" # All files under learning paths have this same wrapper for Hugo processing.
---
123 changes: 123 additions & 0 deletions content/learning-paths/laptops-and-desktops/docker-models/compose.md
@@ -0,0 +1,123 @@
---
title: "Run a containerized AI chat app with Docker Compose"
weight: 3
layout: "learningpathall"
---

Docker Compose makes it easy to run multi-container applications, and it can also include AI models in your project.

In this section, you'll learn how to use Docker Compose to deploy a web-based AI chat application that uses Docker Model Runner as the backend for AI inference.

## Clone the example project

The example project, named [docker-model-runner-chat](https://github.com/jasonrandrews/docker-model-runner-chat), is available on GitHub. It provides a simple web interface to interact with local AI models such as Llama 3.2 or Gemma 3.

First, clone the example repository:

```console
git clone https://github.com/jasonrandrews/docker-model-runner-chat.git
cd docker-model-runner-chat
```

## Review the Docker Compose file

The `compose.yaml` file defines how the application is deployed using Docker Compose.

It sets up two services:

- **ai-chat**: A Flask-based web application that provides the chat user interface. It is built from the local directory, exposes port 5000 for browser access, mounts the project directory as a volume for live code updates, loads environment variables from `vars.env`, and starts after the `ai-runner` service (declared with `depends_on`).
- **ai-runner**: This service uses the Docker Model Runner provider to run the selected AI model (for example, `ai/gemma3`). The configuration under `provider` tells Docker to use the model runner extension and specifies which model to load.

The setup allows the web app to communicate with the model runner service as if it were an OpenAI-compatible API, making it easy to swap models or update endpoints by changing environment variables or compose options.

Review the `compose.yaml` file to see the two services.

```yaml
services:
ai-chat:
build:
context: .
ports:
- "5000:5000"
volumes:
- ./:/app
env_file:
- vars.env
depends_on:
- ai-runner
ai-runner:
provider:
type: model
options:
model: ai/gemma3
```
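
Because the `ai-runner` service exposes an OpenAI-compatible API, you can query it directly to see the kind of request the web app makes. The sketch below is a hypothetical snippet, not a file from the example repository; it assumes the standard `chat/completions` route and must run inside a container on the same Compose network, since `model-runner.docker.internal` only resolves from within containers.

```python
# Minimal sketch of a direct request to Docker Model Runner's
# OpenAI-compatible endpoint (hypothetical, not part of the example repo).
import requests

BASE_URL = "http://model-runner.docker.internal/engines/v1/"  # from vars.env

response = requests.post(
    BASE_URL + "chat/completions",  # OpenAI-compatible chat route
    json={
        "model": "ai/gemma3",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])
```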

## Start the application

From the project directory, start the app with:

```console
docker compose up --build
```

Docker Compose will build the web app image and start both services.

## Access the chat interface

Open your browser and go to the URL below:

```console
http://localhost:5000
```

You can now chat with the AI model using the web interface. Enter your prompt and view the response in real time.

![Compose #center](compose-app.png)

## Configuration

You can change the AI model or endpoint by editing the `vars.env` file before starting the containers. The file contains environment variables used by the web application:

- `BASE_URL`: The base URL for the AI model API. By default, it is set to `http://model-runner.docker.internal/engines/v1/`, which allows the web app to communicate with the Docker Model Runner service. This is the default endpoint set up by Docker for accessing the model.
- `MODEL`: The AI model to use (for example, `ai/gemma3` or `ai/llama3.2`).

The `vars.env` file is shown below.

```console
BASE_URL=http://model-runner.docker.internal/engines/v1/
MODEL=ai/gemma3
```
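
Docker Compose injects these values into the `ai-chat` container through the `env_file` entry. A minimal sketch of how the app can pick them up, assuming it reads them with `os.environ` (the fallback defaults here are illustrative, not necessarily what `app.py` uses):

```python
import os

# Injected by Docker Compose from vars.env; fallbacks are illustrative only.
BASE_URL = os.environ.get("BASE_URL", "http://model-runner.docker.internal/engines/v1/")
MODEL = os.environ.get("MODEL", "ai/gemma3")
```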

To use a different model, change the `MODEL` value. For example:

```console
MODEL=ai/llama3.2
```

Make sure to change the model in the `compose.yaml` file as well.

You can also change the `temperature` and `max_tokens` values in `app.py` to further customize the application.
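
As an illustration, those parameters travel in the request body alongside the model name. The payload below is a hypothetical example, not the actual code in `app.py`:

```python
# Hypothetical payload showing where temperature and max_tokens fit
# (same request shape as the earlier sketch; the values are examples).
payload = {
    "model": "ai/gemma3",
    "messages": [{"role": "user", "content": "Summarize Docker Compose in one sentence."}],
    "temperature": 0.2,  # lower values give more deterministic replies
    "max_tokens": 256,   # caps the length of the generated response
}
```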

## Stop the application

To stop the services, press `Ctrl+C` in the terminal.

You can also run the command below in another terminal to stop the services.

```console
docker compose down
```

## Troubleshooting

Use the steps below if you have any issues running the application:

- Ensure Docker and Docker Compose are installed and running
- Make sure port 5000 is not in use by another application
- Check logs with:

```console
docker compose logs
```

In this section, you learned how to use Docker Compose to run a containerized AI chat application with a web interface and local model inference from Docker Model Runner.