- All LLMs/VLMs/CLIPs are served as APIs with caching enabled, because loading an LLM/VLM/CLIP is expensive and we never modify them.
- LLM functions live in `utils_llm.py`, VLM functions in `utils_vlm.py`, CLIP functions in `utils_clip.py`, and everything else in `utils_general.py`.
- Write unit tests to understand the major functions.
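The "serve as API with cache" idea above can be sketched as follows. This is a minimal illustration, not the repo's actual implementation: the model stays loaded in a server process, and repeated identical requests are answered from a cache keyed on the request contents. `call_model` is a hypothetical stand-in for the real LLM/VLM/CLIP call.

```python
import hashlib
import json

# In-memory cache; a real server might persist this to disk.
_cache = {}

def cache_key(model, prompt):
    """Deterministic key derived from the full request contents."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_call(model, prompt, call_model):
    """Run call_model(model, prompt) at most once per unique request."""
    key = cache_key(model, prompt)
    if key not in _cache:
        _cache[key] = call_model(model, prompt)
    return _cache[key]
```

Because the models are never modified, a cache hit is always safe to return.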
- Set up the OpenAI API key: `export OPENAI_API_KEY='[your key]'`
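The OpenAI client reads the key from the environment, so it helps to fail fast before starting the servers. A minimal check (the `env` parameter is just for testability; by default it reads `os.environ`):

```python
import os

def check_openai_key(env=os.environ):
    """Raise early if OPENAI_API_KEY is missing from the environment."""
    if not env.get("OPENAI_API_KEY", ""):
        raise RuntimeError("OPENAI_API_KEY is not set; run the export above.")
    return True
```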
- Install dependencies: `pip install vllm`
- Configure global variables in `global_vars.py`
- Run `python -m vllm.entrypoints.openai.api_server --model lmsys/vicuna-7b-v1.5`
- Run `python -m serve.utils_llm` to test the LLM.
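The vLLM command above starts an OpenAI-compatible server; a minimal client sketch, assuming the default host and port (`localhost:8000`) and using only the standard library:

```python
import json
import urllib.request

# Default address of the vLLM OpenAI-compatible server started above.
API_URL = "http://localhost:8000/v1/completions"

def build_payload(prompt, model="lmsys/vicuna-7b-v1.5", max_tokens=64):
    """JSON body for a /v1/completions request."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}

def complete(prompt):
    """POST the prompt to the server and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["text"]
```

Once the server is up, `complete("The capital of France is")` returns the model's continuation.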
- Install dependencies: `pip install open-clip-torch flask`
- Configure global variables in `global_vars.py`
- Run `python serve/clip_server.py`
- Run `python -m serve.utils_clip` to test CLIP.
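CLIP scores are typically cosine similarities between L2-normalized image and text embeddings. A NumPy-only sketch of that final step, assuming the embeddings have already been produced by the CLIP server:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Higher values mean the image and text embeddings point in more similar directions.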
- Install dependencies:
  - BLIP: `pip install salesforce-lavis`
  - LLaVA: `git clone [email protected]:haotian-liu/LLaVA.git; cd LLaVA; pip install -e .`
- Configure global variables in `global_vars.py`
- Run `python serve/vlm_server_[vlm].py`. Loading the VLM takes a while, especially the first time, when the weights are downloaded. (Note: concurrency is disabled because it surprisingly leads to worse GPU utilization.)
- Run `python -m serve.utils_vlm` to test the VLM.
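A Flask-served VLM usually receives the image inside a JSON body, which means encoding the raw bytes first. A client-side sketch; the URL and the `image`/`prompt` field names are assumptions for illustration, not the repo's actual API:

```python
import base64

# Hypothetical address of the VLM server started above.
VLM_URL = "http://localhost:8080/caption"

def encode_image(image_bytes):
    """Base64-encode raw image bytes so they can travel in a JSON body."""
    return base64.b64encode(image_bytes).decode("ascii")

def build_request(image_bytes, prompt="Describe this image."):
    """Assemble the JSON-serializable request body for the VLM server."""
    return {"image": encode_image(image_bytes), "prompt": prompt}
```

The server side would base64-decode the `image` field back into bytes before passing it to the VLM.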