Commit 56476fa

Merge branch 'main' into allow-dynamic-models-ollama
2 parents c67bae2 + 968fc13 commit 56476fa

247 files changed: 9258 additions, 7259 deletions

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+name: Setup VLLM
+description: Start VLLM
+runs:
+  using: "composite"
+  steps:
+    - name: Start VLLM
+      shell: bash
+      run: |
+        # Start vllm container
+        docker run -d \
+            --name vllm \
+            -p 8000:8000 \
+            --privileged=true \
+            quay.io/higginsd/vllm-cpu:65393ee064 \
+            --host 0.0.0.0 \
+            --port 8000 \
+            --enable-auto-tool-choice \
+            --tool-call-parser llama3_json \
+            --model /root/.cache/Llama-3.2-1B-Instruct \
+            --served-model-name meta-llama/Llama-3.2-1B-Instruct
+
+        # Wait for vllm to be ready
+        echo "Waiting for vllm to be ready..."
+        timeout 900 bash -c 'until curl -f http://localhost:8000/health; do
+          echo "Waiting for vllm..."
+          sleep 5
+        done'
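
The readiness wait in the action above can be generalized into a small retry helper. The following is a minimal sketch and not part of the commit: `wait_until` is a hypothetical function name, and the probe is passed as arguments so the same loop could wrap `curl -f http://localhost:8000/health` or any other check. Unlike `timeout`, it only counts time spent sleeping, so a slow probe can push the real wait beyond the stated budget.

```shell
# Hypothetical helper (not in the commit): retry a probe command until it
# succeeds, giving up once the accumulated sleep time reaches timeout_s.
# Usage: wait_until <timeout_s> <interval_s> <probe command...>
wait_until() {
  timeout_s=$1
  interval_s=$2
  shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "probe did not succeed within ${timeout_s}s" >&2
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
  return 0
}

# Example call mirroring the values in the action (900 s budget, 5 s between probes):
#   wait_until 900 5 curl -fsS http://localhost:8000/health
```

Returning a non-zero status on timeout lets the calling step fail explicitly instead of continuing against a server that never became healthy.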

.github/dependabot.yml

Lines changed: 0 additions & 2 deletions
@@ -14,8 +14,6 @@ updates:
     schedule:
       interval: "weekly"
       day: "saturday"
-    # ignore all non-security updates: https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file#open-pull-requests-limit
-    open-pull-requests-limit: 0
     labels:
       - type/dependencies
       - python

.github/workflows/README.md

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+# Llama Stack CI
+
+Llama Stack uses GitHub Actions for Continuous Integration (CI). The table below lists each CI workflow, its file, and its purpose.
+
+| Name | File | Purpose |
+| ---- | ---- | ------- |
+| Update Changelog | [changelog.yml](changelog.yml) | Creates PR for updating the CHANGELOG.md |
+| Coverage Badge | [coverage-badge.yml](coverage-badge.yml) | Creates PR for updating the code coverage badge |
+| Installer CI | [install-script-ci.yml](install-script-ci.yml) | Test the installation script |
+| Integration Auth Tests | [integration-auth-tests.yml](integration-auth-tests.yml) | Run the integration test suite with Kubernetes authentication |
+| SqlStore Integration Tests | [integration-sql-store-tests.yml](integration-sql-store-tests.yml) | Run the integration test suite with SqlStore |
+| Integration Tests | [integration-tests.yml](integration-tests.yml) | Run the integration test suite with Ollama |
+| Vector IO Integration Tests | [integration-vector-io-tests.yml](integration-vector-io-tests.yml) | Run the integration test suite with various VectorIO providers |
+| Pre-commit | [pre-commit.yml](pre-commit.yml) | Run pre-commit checks |
+| Test Llama Stack Build | [providers-build.yml](providers-build.yml) | Test llama stack build |
+| Python Package Build Test | [python-build-test.yml](python-build-test.yml) | Test building the llama-stack PyPI project |
+| Check semantic PR titles | [semantic-pr.yml](semantic-pr.yml) | Ensure that PR titles follow the conventional commit spec |
+| Close stale issues and PRs | [stale_bot.yml](stale_bot.yml) | Run the Stale Bot action |
+| Test External Providers Installed via Module | [test-external-provider-module.yml](test-external-provider-module.yml) | Test External Provider installation via Python module |
+| Test External API and Providers | [test-external.yml](test-external.yml) | Test the External API and Provider mechanisms |
+| Unit Tests | [unit-tests.yml](unit-tests.yml) | Run the unit test suite |
+| Update ReadTheDocs | [update-readthedocs.yml](update-readthedocs.yml) | Update the Llama Stack ReadTheDocs site |

.github/workflows/changelog.yml

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 name: Update Changelog
 
+run-name: Creates PR for updating the CHANGELOG.md
+
 on:
   release:
     types: [published, unpublished, created, edited, deleted, released]

.github/workflows/coverage-badge.yml

Lines changed: 5 additions & 0 deletions
@@ -1,5 +1,7 @@
 name: Coverage Badge
 
+run-name: Creates PR for updating the code coverage badge
+
 on:
   push:
     branches: [ main ]
@@ -15,6 +17,9 @@ on:
 
 jobs:
   unit-tests:
+    permissions:
+      contents: write # for peter-evans/create-pull-request to create branch
+      pull-requests: write # for peter-evans/create-pull-request to create a PR
     runs-on: ubuntu-latest
     steps:
       - name: Checkout repository

.github/workflows/install-script-ci.yml

Lines changed: 16 additions & 4 deletions
@@ -1,5 +1,7 @@
 name: Installer CI
 
+run-name: Test the installation script
+
 on:
   pull_request:
     paths:
@@ -17,10 +19,20 @@ jobs:
       - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # 4.2.2
       - name: Run ShellCheck on install.sh
         run: shellcheck scripts/install.sh
-  smoke-test:
-    needs: lint
+  smoke-test-on-dev:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # 4.2.2
+      - name: Checkout repository
+        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+
+      - name: Install dependencies
+        uses: ./.github/actions/setup-runner
+
+      - name: Build a single provider
+        run: |
+          USE_COPY_NOT_MOUNT=true LLAMA_STACK_DIR=. uv run llama stack build --template starter --image-type container --image-name test
+
       - name: Run installer end-to-end
-        run: ./scripts/install.sh
+        run: |
+          IMAGE_ID=$(docker images --format "{{.Repository}}:{{.Tag}}" | head -n 1)
+          ./scripts/install.sh --image $IMAGE_ID

.github/workflows/integration-auth-tests.yml

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 name: Integration Auth Tests
 
+run-name: Run the integration test suite with Kubernetes authentication
+
 on:
   push:
     branches: [ main ]

.github/workflows/integration-sql-store-tests.yml

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 name: SqlStore Integration Tests
 
+run-name: Run the integration test suite with SqlStore
+
 on:
   push:
     branches: [ main ]

.github/workflows/integration-tests.yml

Lines changed: 48 additions & 14 deletions
@@ -1,5 +1,7 @@
 name: Integration Tests
 
+run-name: Run the integration test suite with Ollama
+
 on:
   push:
     branches: [ main ]
@@ -14,13 +16,19 @@ on:
       - '.github/workflows/integration-tests.yml' # This workflow
       - '.github/actions/setup-ollama/action.yml'
   schedule:
-    - cron: '0 0 * * *' # Daily at 12 AM UTC
+    # If changing the cron schedule, update the provider in the test-matrix job
+    - cron: '0 0 * * *' # (test latest client) Daily at 12 AM UTC
+    - cron: '1 0 * * 0' # (test vllm) Weekly on Sunday at 12:01 AM UTC
   workflow_dispatch:
     inputs:
       test-all-client-versions:
         description: 'Test against both the latest and published versions'
         type: boolean
         default: false
+      test-provider:
+        description: 'Test against a specific provider'
+        type: string
+        default: 'ollama'
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}
@@ -53,8 +61,17 @@ jobs:
       matrix:
         test-type: ${{ fromJson(needs.discover-tests.outputs.test-type) }}
         client-type: [library, server]
+        # Use vllm on weekly schedule, otherwise use test-provider input (defaults to ollama)
+        provider: ${{ (github.event.schedule == '1 0 * * 0') && fromJSON('["vllm"]') || fromJSON(format('["{0}"]', github.event.inputs.test-provider || 'ollama')) }}
         python-version: ["3.12", "3.13"]
-        client-version: ${{ (github.event_name == 'schedule' || github.event.inputs.test-all-client-versions == 'true') && fromJSON('["published", "latest"]') || fromJSON('["latest"]') }}
+        client-version: ${{ (github.event.schedule == '0 0 * * *' || github.event.inputs.test-all-client-versions == 'true') && fromJSON('["published", "latest"]') || fromJSON('["latest"]') }}
+        exclude: # TODO: look into why these tests are failing and fix them
+          - provider: vllm
+            test-type: safety
+          - provider: vllm
+            test-type: post_training
+          - provider: vllm
+            test-type: tool_runtime
 
     steps:
       - name: Checkout repository
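
The `provider` expression in the hunk above uses the `cond && a || b` idiom, which behaves like a ternary as long as the middle value is truthy. A rough shell sketch of the same selection logic, for illustration only: `resolve_provider` is a hypothetical name, and its two arguments stand in for `github.event.schedule` and the `test-provider` input.

```shell
# Hypothetical illustration of the matrix `provider` expression (not in the commit).
# $1: cron string of the schedule that triggered the run (empty if not scheduled)
# $2: value of the workflow_dispatch `test-provider` input (empty if unset)
resolve_provider() {
  if [ "$1" = "1 0 * * 0" ]; then
    # Weekly schedule: always exercise vllm
    echo "vllm"
  else
    # Any other trigger: honor the input, falling back to ollama
    echo "${2:-ollama}"
  fi
}

resolve_provider "1 0 * * 0" ""   # prints: vllm
resolve_provider "0 0 * * *" ""   # prints: ollama
resolve_provider "" "vllm"        # prints: vllm
```

Because the comparison is against the literal cron string, editing a schedule without updating the matching expression silently changes which branch is taken, which is what the comment next to the `cron` entries warns about.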
@@ -67,8 +84,13 @@ jobs:
           client-version: ${{ matrix.client-version }}
 
       - name: Setup ollama
+        if: ${{ matrix.provider == 'ollama' }}
         uses: ./.github/actions/setup-ollama
 
+      - name: Setup vllm
+        if: ${{ matrix.provider == 'vllm' }}
+        uses: ./.github/actions/setup-vllm
+
       - name: Build Llama Stack
         run: |
           uv run llama stack build --template ci-tests --image-type venv
@@ -81,10 +103,6 @@ jobs:
 
       - name: Run Integration Tests
         env:
-          OLLAMA_INFERENCE_MODEL: "llama3.2:3b-instruct-fp16" # for server tests
-          ENABLE_OLLAMA: "ollama" # for server tests
-          OLLAMA_URL: "http://0.0.0.0:11434"
-          SAFETY_MODEL: "llama-guard3:1b"
           LLAMA_STACK_CLIENT_TIMEOUT: "300" # Increased timeout for eval operations
         # Use 'shell' to get pipefail behavior
         # https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#exit-codes-and-error-action-preference
@@ -96,12 +114,27 @@ jobs:
           else
             stack_config="server:ci-tests"
           fi
+
+          EXCLUDE_TESTS="builtin_tool or safety_with_image or code_interpreter or test_rag"
+          if [ "${{ matrix.provider }}" == "ollama" ]; then
+            export OLLAMA_URL="http://0.0.0.0:11434"
+            export TEXT_MODEL=ollama/llama3.2:3b-instruct-fp16
+            export SAFETY_MODEL="ollama/llama-guard3:1b"
+            EXTRA_PARAMS="--safety-shield=llama-guard"
+          else
+            export VLLM_URL="http://localhost:8000/v1"
+            export TEXT_MODEL=vllm/meta-llama/Llama-3.2-1B-Instruct
+            # TODO: remove the not(test_inference_store_tool_calls) once we can get the tool called consistently
+            EXTRA_PARAMS=
+            EXCLUDE_TESTS="${EXCLUDE_TESTS} or test_inference_store_tool_calls"
+          fi
+
+
           uv run pytest -s -v tests/integration/${{ matrix.test-type }} --stack-config=${stack_config} \
-            -k "not(builtin_tool or safety_with_image or code_interpreter or test_rag)" \
-            --text-model="ollama/llama3.2:3b-instruct-fp16" \
-            --embedding-model=all-MiniLM-L6-v2 \
-            --safety-shield=$SAFETY_MODEL \
-            --color=yes \
+            -k "not( ${EXCLUDE_TESTS} )" \
+            --text-model=$TEXT_MODEL \
+            --embedding-model=sentence-transformers/all-MiniLM-L6-v2 \
+            --color=yes ${EXTRA_PARAMS} \
             --capture=tee-sys | tee pytest-${{ matrix.test-type }}.log
 
       - name: Check Storage and Memory Available After Tests
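
The provider-specific branch in the hunk above amounts to assembling a pytest `-k` filter from a shared base list plus one extra exclusion for non-Ollama providers. A standalone sketch of that assembly, which only echoes the resulting expression rather than running pytest: `build_k_filter` is a hypothetical helper name, and the test names are copied from the hunk.

```shell
# Hypothetical helper (not in the commit): print the `-k` expression the
# workflow would pass to pytest for a given provider.
# $1: provider name, e.g. "ollama" or "vllm"
build_k_filter() {
  exclude="builtin_tool or safety_with_image or code_interpreter or test_rag"
  if [ "$1" != "ollama" ]; then
    # Mirrors the workflow's else branch, which adds one more excluded test
    exclude="${exclude} or test_inference_store_tool_calls"
  fi
  echo "not( ${exclude} )"
}

build_k_filter ollama
build_k_filter vllm
```

Keeping the base list in one variable means a newly excluded test only has to be added once, whichever provider runs.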
@@ -110,16 +143,17 @@ jobs:
           free -h
           df -h
 
-      - name: Write ollama logs to file
+      - name: Write inference logs to file
         if: ${{ always() }}
         run: |
-          sudo docker logs ollama > ollama.log
+          sudo docker logs ollama > ollama.log || true
+          sudo docker logs vllm > vllm.log || true
 
       - name: Upload all logs to artifacts
         if: ${{ always() }}
         uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4.6.2
         with:
-          name: logs-${{ github.run_id }}-${{ github.run_attempt }}-${{ matrix.client-type }}-${{ matrix.test-type }}-${{ matrix.python-version }}-${{ matrix.client-version }}
+          name: logs-${{ github.run_id }}-${{ github.run_attempt }}-${{ matrix.provider }}-${{ matrix.client-type }}-${{ matrix.test-type }}-${{ matrix.python-version }}-${{ matrix.client-version }}
           path: |
             *.log
           retention-days: 1

.github/workflows/integration-vector-io-tests.yml

Lines changed: 3 additions & 1 deletion
@@ -1,5 +1,7 @@
 name: Vector IO Integration Tests
 
+run-name: Run the integration test suite with various VectorIO providers
+
 on:
   push:
     branches: [ main ]
@@ -114,7 +116,7 @@ jobs:
         run: |
           uv run pytest -sv --stack-config="inference=inline::sentence-transformers,vector_io=${{ matrix.vector-io-provider }}" \
             tests/integration/vector_io \
-            --embedding-model all-MiniLM-L6-v2
+            --embedding-model sentence-transformers/all-MiniLM-L6-v2
 
       - name: Check Storage and Memory Available After Tests
         if: ${{ always() }}
