
Commit 811e0f9

Merge pull request #2 from xorbitsai/docs
feat: Add Sphinx documentation
2 parents ffa6083 + 9643f08 commit 811e0f9


65 files changed: +7013 −0 lines

docs/Makefile

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS    ?=
SPHINXBUILD   ?= sphinx-build
SPHINXINTL    ?= sphinx-intl
SOURCEDIR     = source
BUILDDIR      = build

# the i18n builder cannot share the environment and doctrees with the others
I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) $(SOURCEDIR)
I18NSPHINXLANGS = -l zh_CN

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile html_zh_cn html_ja_jp gettext

html_zh_cn:
	$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) -t zh_cn -D language='zh_CN' "$(SOURCEDIR)" $(BUILDDIR)/html_zh_cn

gettext:
	$(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
	$(SPHINXINTL) update -p $(BUILDDIR)/locale $(I18NSPHINXLANGS)
	python $(SOURCEDIR)/norm_zh.py

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
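
Typical invocations of this Makefile, assuming `sphinx-build` and `sphinx-intl` are installed (`html` is routed through the catch-all target, while `html_zh_cn` and `gettext` are defined above):

    # Default English HTML build, into build/html
    make html

    # Chinese HTML build, tagged zh_cn with the language override
    make html_zh_cn

    # Extract translatable messages and refresh the zh_CN .po files
    make gettext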

docs/make.bat

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
@ECHO OFF

pushd %~dp0

REM Command file for Sphinx documentation

if "%SPHINXBUILD%" == "" (
	set SPHINXBUILD=sphinx-build
)
set SOURCEDIR=source
set BUILDDIR=source/_build

%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
	echo.
	echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
	echo.installed, then set the SPHINXBUILD environment variable to point
	echo.to the full path of the 'sphinx-build' executable. Alternatively you
	echo.may add the Sphinx directory to PATH.
	echo.
	echo.If you don't have Sphinx installed, grab it from
	echo.https://www.sphinx-doc.org/
	exit /b 1
)

REM Check if the first parameter is provided
if "%1" == "" goto help

REM Initialize command parameters
set CMD_PARAMS=%*

REM Add -D language='zh_CN' if the first parameter is provided (i.e., if %1 is not empty)
set CMD_PARAMS=%CMD_PARAMS% -D language='zh_CN'

%SPHINXBUILD% -M %CMD_PARAMS% %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

:help
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%

:end
popd
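
On Windows, the batch file takes the builder name as its first argument and appends the zh_CN language override itself; with no argument it falls through to the help target:

    REM Chinese HTML build, into source\_build
    make.bat html

    REM Print the Sphinx help
    make.bat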

docs/source/_build/.buildinfo

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
config: b78d9368253bdee29cbc70295b940666
tags: 645f666f9bcd5a90fca523b33c5a78b7

docs/source/_build/.nojekyll

Whitespace-only changes.

docs/source/getting_started/environments.rst

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
.. _environments:

========================
Environment Requirements
========================


Recommended Systems
~~~~~~~~~~~~~~~~~~~~

Xinference supports the following operating systems:

- **Ubuntu 20.04 / 22.04** (Recommended)
- **CentOS 7 / Rocky Linux 8**
- **Windows 10/11 with WSL2**


Recommended CUDA
~~~~~~~~~~~~~~~~~~~~

Xinference recommends the following driver and CUDA versions:

- **Driver Version 550.127.08** - `Download Driver <https://www.nvidia.cn/drivers/lookup/>`_
- **CUDA Version 12.4** - `Download CUDA <https://developer.nvidia.com/cuda-12-4-0-download-archive>`_


Recommended Docker
~~~~~~~~~~~~~~~~~~~~

Here are the recommended Docker versions for different environments:

- Docker >= 19.03 (recommended, but some distributions may include older versions of Docker; the minimum supported version is 1.12)

  - `How to install Docker <https://docs.docker.com/engine/install/>`_

- NVIDIA Container Toolkit >= 1.7.0

  - `How to install NVIDIA Container Toolkit <https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/1.13.5/install-guide.html#install-guide>`_
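
Once Docker and the NVIDIA Container Toolkit are installed, a quick sanity check (the CUDA image tag below is only an example) is:

.. code-block:: bash

   # On the host: confirm the driver version; the header also shows the
   # highest CUDA version the driver supports
   nvidia-smi

   # Through Docker: confirm the toolkit can expose the GPU to containers
   docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi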
docs/source/getting_started/index.rst

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
.. _getting_started_index:

===============
Getting Started
===============


.. toctree::
   :maxdepth: 2

   installation
   using_xinference
   logging
   using_docker_image
   using_kubernetes
   troubleshooting
   environments
docs/source/getting_started/installation.rst

Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
.. _installation:

============
Installation
============

Xinference can be installed with ``docker`` on Nvidia, NPU, GCU, and DCU devices. To run models using Xinference, you will need to pull the image corresponding to the type of device you intend to serve on.


Nvidia
-------------------

To pull the Nvidia image, run the following commands:

.. code-block:: bash

   docker login [email protected] registry.cn-hangzhou.aliyuncs.com
   Password: cre.uwd3nyn4UDM6fzm
   docker pull registry.cn-hangzhou.aliyuncs.com/xinference-prod/xinference-prod:0.0.10-nvidia


Run Command Example
^^^^^^^^^^^^^^^^^^^

To run the container, use the following command:

.. code-block:: bash

   docker run -it \
     --name Xinf \
     --network host \
     --gpus all \
     --restart unless-stopped \
     -v </your/home/path>/.xinference:/root/.xinference \
     -v </your/home/path>/.cache/huggingface:/root/.cache/huggingface \
     -v </your/home/path>/.cache/modelscope:/root/.cache/modelscope \
     registry.cn-hangzhou.aliyuncs.com/xinference-prod/xinference-prod:0.0.10-nvidia /bin/bash

Start Xinference
^^^^^^^^^^^^^^^^^^^

After starting the container, navigate to the `/opt/projects` directory inside the container and run the following command:

.. code-block:: bash

   ./xinf-enterprise.sh --host 192.168.10.197 --port 9997 && \
   XINFERENCE_MODEL_SRC=modelscope xinference-local --host 192.168.10.197 --port 9997 --log-level debug

The `./xinf-enterprise.sh` script starts the Nginx service and writes the Xinf service address to the configuration file.

The startup command can be adjusted to your actual requirements; set the `host` and `port` according to your device's configuration.

Once the Xinf service is started, you can access the Xinf WebUI by visiting port 8000.
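
To verify that the service is up, you can query the OpenAI-compatible API (using the example `host` and `port` above; adjust them to your deployment):

.. code-block:: bash

   # Lists the models currently running; returns an empty list on a fresh start
   curl http://192.168.10.197:9997/v1/models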


MindIE Series
-------------------

Version Information
^^^^^^^^^^^^^^^^^^^

- Python Version: 3.10
- CANN Version: 8.0.rc2
- Operating System Version: ubuntu_22.04
- MindIE Version: 1.0.RC2

Dependencies
^^^^^^^^^^^^^^^^^^^

For 310I DUO:

- Driver: Ascend-hdk-310p-npu-driver_24.1.rc2_linux-aarch64.run - `Download <https://obs-whaicc-fae-public.obs.cn-central-221.ovaijisuan.com/cann/mindie/1.0.RC2/310p/Ascend-hdk-310p-npu-driver_24.1.rc2_linux-aarch64.run>`_
- Firmware: Ascend-hdk-310p-npu-firmware_7.3.0.1.231.run - `Download <https://obs-whaicc-fae-public.obs.cn-central-221.ovaijisuan.com/cann/mindie/1.0.RC2/310p/Ascend-hdk-310p-npu-firmware_7.3.0.1.231.run>`_

For 910B:

- Driver: Ascend-hdk-910b-npu-driver_24.1.rc3_linux-aarch64.run - `Download <https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Ascend%20HDK/Ascend%20HDK%2024.1.RC3/Ascend-hdk-910b-npu-driver_24.1.rc3_linux-aarch64.run?response-content-type=application/octet-stream>`_
- Firmware: Ascend-hdk-910b-npu-firmware_7.5.0.1.129.run - `Download <https://ascend-repo.obs.cn-east-2.myhuaweicloud.com/Ascend%20HDK/Ascend%20HDK%2024.1.RC3/Ascend-hdk-910b-npu-firmware_7.5.0.1.129.run?response-content-type=application/octet-stream>`_

Download the `.run` packages to the host machine, then run the following commands to install the driver (the 910B driver is used as the example):

.. code-block:: bash

   chmod +x Ascend-hdk-910b-npu-driver_24.1.rc3_linux-aarch64.run
   ./Ascend-hdk-910b-npu-driver_24.1.rc3_linux-aarch64.run --full

Once the installation is complete, the output should report "successfully", confirming the installation. The firmware is installed the same way.

If MindIE does not start properly, verify that the driver and firmware versions match. Both the driver and firmware must be installed on the host machine and mounted into the Docker container.

For version upgrades, install the firmware first, then the driver.
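
To confirm which driver is actually loaded on the host, you can use the Ascend `npu-smi` tool (the same binary is mounted into the container in the run command below):

.. code-block:: bash

   # Reports chip health and the installed NPU driver version
   npu-smi info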


Pull the Image
^^^^^^^^^^^^^^^^^^^

For 310I DUO:

.. code-block:: bash

   docker login [email protected] registry.cn-hangzhou.aliyuncs.com
   Password: cre.uwd3nyn4UDM6fzm
   docker pull registry.cn-hangzhou.aliyuncs.com/xinference-prod/xinference-prod:0.0.10-310p

For 910B:

.. code-block:: bash

   docker login [email protected] registry.cn-hangzhou.aliyuncs.com
   Password: cre.uwd3nyn4UDM6fzm
   docker pull registry.cn-hangzhou.aliyuncs.com/xinference-prod/xinference-prod:0.0.10-910b

Run Command Example
^^^^^^^^^^^^^^^^^^^

To run the container, use the following command:

.. code-block:: bash

   docker run --name MindIE-Xinf -it \
     -d \
     --net=host \
     --shm-size=500g \
     --privileged=true \
     -w /opt/projects \
     --device=/dev/davinci_manager \
     --device=/dev/hisi_hdc \
     --device=/dev/devmm_svm \
     --entrypoint=bash \
     -v /usr/local/Ascend/driver:/usr/local/Ascend/driver \
     -v /usr/local/dcmi:/usr/local/dcmi \
     -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
     -v /usr/local/sbin:/usr/local/sbin \
     -v /home:/home \
     -v /root:/root/model \
     -v /tmp:/tmp \
     -v </your/home/path>/.xinference:/root/.xinference \
     -v </your/home/path>/.cache/huggingface:/root/.cache/huggingface \
     -v </your/home/path>/.cache/modelscope:/root/.cache/modelscope \
     -e http_proxy=$http_proxy \
     -e https_proxy=$https_proxy \
     registry.cn-hangzhou.aliyuncs.com/xinference-prod/xinference-prod:0.0.10-910b


Start Xinference
^^^^^^^^^^^^^^^^^^^

After starting the container, navigate to the `/opt/projects` directory inside the container and run the following command:

.. code-block:: bash

   ./xinf-enterprise.sh --host 192.168.10.197 --port 9997 && \
   XINFERENCE_MODEL_SRC=modelscope xinference-local --host 192.168.10.197 --port 9997 --log-level debug

The `./xinf-enterprise.sh` script starts the Nginx service and writes the Xinf service address to the configuration file.

The startup command can be adjusted according to your needs; set the `host` and `port` according to your device's configuration.

Once the Xinf service is started, you can access the Xinf WebUI by visiting port 8000.


Supported Models
^^^^^^^^^^^^^^^^^^^

When selecting a model execution engine, we recommend the MindIE engine for faster inference; other engines may be slower and are not recommended. A launch sketch follows the model lists below.

Currently, MindIE supports the following large language models:

- baichuan-chat
- baichuan-2-chat
- chatglm3
- deepseek-chat
- deepseek-coder-instruct
- llama-3-instruct
- mistral-instruct-v0.3
- telechat
- Yi-chat
- Yi-1.5-chat
- qwen-chat
- qwen1.5-chat
- codeqwen1.5-chat
- qwen2-instruct
- csg-wukong-chat-v0.1
- qwen2.5 series (qwen2.5-instruct, qwen2.5-coder-instruct, etc.)

Embedding models:

- bge-large-zh-v1.5

Rerank models:

- bge-reranker-large
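
As a sketch of launching one of these models with the MindIE engine (the model name, size, and engine identifier here are examples; adjust them to what your build actually exposes):

.. code-block:: bash

   # Launch qwen2-instruct (7B) against the running Xinference endpoint
   xinference launch \
     --endpoint http://192.168.10.197:9997 \
     --model-name qwen2-instruct \
     --model-engine mindie \
     --model-format pytorch \
     --size-in-billions 7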
docs/source/index.rst

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
.. _index:

======================
Welcome to Xinference!
======================

.. toctree::
   :maxdepth: 2
   :hidden:

   getting_started/index


Xorbits Inference (Xinference) is an open-source platform to streamline the operation and integration of a wide array of AI models. With Xinference, you're empowered to run inference using any open-source LLMs, embedding models, and multimodal models, either in the cloud or on your own premises, and to create robust AI-driven applications.
