[Docs ]Update ai runtime management api and downloader docs (#577)
Jeffwan authored Jan 20, 2025
1 parent e42d591 commit 857aa5f
Showing 2 changed files with 23 additions and 10 deletions.
31 changes: 22 additions & 9 deletions docs/source/features/runtime.rst
@@ -14,7 +14,6 @@ The AI Runtime hides various implementation details on the inference engine side
``python3 -m pip install aibrix``



Metric Standardization
----------------------
Different inference engines will expose different metrics, and AI Runtime will standardize them.
@@ -40,7 +39,7 @@ First Define the necessary environment variables for the HuggingFace model.
.. code-block:: bash
# General settings
export DOWNLOADER_ALLOW_FILE_SUFFIX=json, safetensors
export DOWNLOADER_ALLOW_FILE_SUFFIX="json, safetensors"
export DOWNLOADER_NUM_THREADS=16
# HuggingFace settings
export HF_ENDPOINT=https://hf-mirror.com # set it when env is in CN
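The quoting change above matters because an unquoted space ends a shell word. The sketch below uses Python's `shlex.split`, which applies POSIX-style word splitting, to show how the two forms of the `export` line are tokenized. It only illustrates the tokenization and is not part of AIBrix.

```python
import shlex

# Unquoted: the space after the comma ends the word, so `export` receives
# two arguments and the variable's value is just "json,".
unquoted = shlex.split("export DOWNLOADER_ALLOW_FILE_SUFFIX=json, safetensors")

# Quoted: the whole list stays in one word and becomes the variable's value.
quoted = shlex.split('export DOWNLOADER_ALLOW_FILE_SUFFIX="json, safetensors"')

print(unquoted)
print(quoted)
```

With the unquoted form, `safetensors` is passed to `export` as a separate name rather than as part of the suffix list.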
@@ -62,13 +61,13 @@ First Define the necessary environment variables for the S3 model.
.. code-block:: bash
# General settings
export DOWNLOADER_ALLOW_FILE_SUFFIX=json, safetensors
export DOWNLOADER_ALLOW_FILE_SUFFIX="json, safetensors"
export DOWNLOADER_NUM_THREADS=16
# AWS settings
export AWS_ACCESS_KEY_ID=<INPUT YOUR AWS ACCESS KEY ID>
export AWS_SECRET_ACCESS_KEY=<INPUT YOUR AWS SECRET ACCESS KEY>
export AWS_ENDPOINT_URL=<INPUT YOUR AWS ENDPOINT URL>
export AWS_REGION=<INPUT YOUR AWS REGION>
export AWS_ENDPOINT_URL=<INPUT YOUR AWS ENDPOINT URL> # e.g. https://s3.us-west-2.amazonaws.com
export AWS_REGION=<INPUT YOUR AWS REGION> # e.g. us-west-2
Then use AI Runtime to download the model from AWS S3:
@@ -87,13 +86,13 @@ First Define the necessary environment variables for the TOS model.
.. code-block:: bash
# General settings
export DOWNLOADER_ALLOW_FILE_SUFFIX=json, safetensors
export DOWNLOADER_ALLOW_FILE_SUFFIX="json, safetensors"
export DOWNLOADER_NUM_THREADS=16
# TOS settings
export TOS_ACCESS_KEY=<INPUT YOUR TOS ACCESS KEY>
export TOS_SECRET_KEY=<INPUT YOUR TOS SECRET KEY>
export TOS_ENDPOINT=<INPUT YOUR TOS ENDPOINT>
export TOS_REGION=<INPUT YOUR TOS REGION>
export TOS_ENDPOINT=<INPUT YOUR TOS ENDPOINT> # e.g. https://tos-s3-cn-beijing.volces.com
export TOS_REGION=<INPUT YOUR TOS REGION> # e.g. cn-beijing
Then use AI Runtime to download the model from TOS:
@@ -103,6 +102,20 @@ Then use AI Runtime to download the model from TOS:
python -m aibrix.downloader \
--model-uri tos://aibrix-model-artifacts/deepseek-coder-6.7b-instruct/ \
--local-dir /tmp/aibrix/models_tos/
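The `--model-uri` values in these examples follow a `scheme://bucket/prefix` shape. As a rough illustration of how such a URI breaks down, the sketch below splits one with the standard library. The helper name `split_model_uri` is hypothetical, and how the AIBrix downloader actually parses or dispatches on the URI is not shown in this commit.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_model_uri(model_uri: str) -> Tuple[str, str, str]:
    """Split a model URI into (scheme, bucket, key prefix). Hypothetical helper."""
    parsed = urlparse(model_uri)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"expected scheme://bucket/prefix, got {model_uri!r}")
    # Drop the leading slash so the prefix looks like an object-store key.
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


parts = split_model_uri("tos://aibrix-model-artifacts/deepseek-coder-6.7b-instruct/")
print(parts)
```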
Model Management API
^^^^^^^^^^^^^^^^^^^^

.. attention::
This requires the engine to be started with ``--enable-lora`` and the environment variable ``VLLM_ALLOW_RUNTIME_LORA_UPDATING=true`` to be set.

.. code-block:: bash
curl -X POST http://localhost:8080/v1/lora_adapter/load \
-H "Content-Type: application/json" \
-d '{"lora_name": "lora-1", "lora_path": "bharati2324/Qwen2.5-1.5B-Instruct-Code-LoRA-r16v2"}'
.. code-block:: bash
curl -X POST http://localhost:8080/v1/lora_adapter/unload \
-H "Content-Type: application/json" \
-d '{"lora_name": "lora-1"}'
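The two curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library. The endpoint paths and JSON fields are taken from the curl examples, while the helper names (`build_lora_request`, `send`) are hypothetical. The response format is not shown in the source, so the sketch makes no assumptions about it.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8080"  # same host and port as the curl examples


def build_lora_request(action: str, lora_name: str,
                       lora_path: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a load/unload request for a LoRA adapter."""
    if action not in ("load", "unload"):
        raise ValueError(f"unsupported action: {action!r}")
    payload = {"lora_name": lora_name}
    if action == "load":
        if lora_path is None:
            raise ValueError("lora_path is required when loading an adapter")
        payload["lora_path"] = lora_path
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/lora_adapter/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> bytes:
    """Send the request; needs a running runtime, so it is not called here."""
    with urllib.request.urlopen(request) as response:
        return response.read()


load_req = build_lora_request(
    "load", "lora-1", "bharati2324/Qwen2.5-1.5B-Instruct-Code-LoRA-r16v2"
)
unload_req = build_lora_request("unload", "lora-1")
print(load_req.full_url, load_req.data.decode())
print(unload_req.full_url, unload_req.data.decode())
```

Building the request separately from sending it keeps the payload easy to inspect before anything reaches the engine.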
2 changes: 1 addition & 1 deletion python/aibrix/aibrix/downloader/s3.py
@@ -131,7 +131,7 @@ def _get_auth_config(self) -> Dict[str, Optional[str]]:
region_name: "region-name",
endpoint_url: "URL_ADDRESS3.region-name.com",
aws_access_key_id: "AK****",
aws_secret_access_key: "SK****",,
aws_secret_access_key: "SK****",
}
"""
pass
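The docstring above describes the shape of the dictionary that `_get_auth_config` is expected to return. As a hedged sketch of how the `AWS_*` variables from the docs could map onto that shape, the hypothetical helper below (`auth_config_from_env`, not part of AIBrix) reads them from a mapping. The real implementation is not shown in this commit and may differ.

```python
import os
from typing import Dict, Mapping, Optional


def auth_config_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, Optional[str]]:
    """Collect the documented AWS_* variables into the docstring's dict shape.

    Hypothetical helper; missing variables are returned as None.
    """
    return {
        "region_name": env.get("AWS_REGION"),
        "endpoint_url": env.get("AWS_ENDPOINT_URL"),
        "aws_access_key_id": env.get("AWS_ACCESS_KEY_ID"),
        "aws_secret_access_key": env.get("AWS_SECRET_ACCESS_KEY"),
    }


example = auth_config_from_env({
    "AWS_REGION": "us-west-2",
    "AWS_ENDPOINT_URL": "https://s3.us-west-2.amazonaws.com",
})
print(example)
```

The four keys match keyword arguments accepted by `boto3.client`, so a dictionary of this shape could be unpacked directly into an S3 client constructor.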