diff --git a/README.md b/README.md
index 8143f1ac74..5df32623cc 100644
--- a/README.md
+++ b/README.md
@@ -49,54 +49,24 @@ chmod +x llava-v1.5-7b-q4-server.llamafile
 
 **Having trouble? See the "Gotchas" section below.**
 
-### API Quickstart / Alternative to OpenAI API endpoint
-
-Once llamafile server has started, in addition to directly accessing the chat server via a json based API endpoint is also provided.
-
-If you have existing OpenAI based application code relying on OpenAI API endpoint as per [OpenAI Chat Completions API documentation](https://platform.openai.com/docs/api-reference/chat), our API endpoint under base url `http://localhost:8080/v1` is designed to support most OpenAI use cases besides certain OpenAI-specific features such as function calling ( llama.cpp `/completion`-specific features such are `mirostat` are supported.).
-
-For further details on all supported API commands (OpenAI compatible to llamafile specific extention) please refer to [API Endpoint Documentation](llama.cpp/server/README.md#api-endpoints).
-
-#### LLAMAFile Server V1 API Python Example
-
-This shows that you can use existing [OpenAI python package](https://pypi.org/project/openai/) developed by OpenAI because of our compatibility measures.
-So most scripts designed for OpenAI will be able to be ported to llamafile with a few changes to base_url and api_key.
-
-Python Example Code and Result
-
-Don't forget to run this command `pip3 install openai` to install the openai package required by this example script. This package is just a simple python wrapper around the openAI's API endpoints.
-
-```python
-#!/usr/bin/env python3
-from openai import OpenAI
-client = OpenAI(
-    base_url="http://localhost:8080/v1", # "http://:port"
-    api_key = "sk-no-key-required"
-)
-completion = client.chat.completions.create(
-    model="LLaMA_CPP",
-    messages=[
-        {"role": "system", "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests."},
-        {"role": "user", "content": "Write a limerick about python exceptions"}
-    ]
-)
-print(completion.choices[0].message)
-```
-
-The above when run would return a python object that may look like below:
-
-```python
-ChatCompletionMessage(content='There once was a programmer named Mike\nWho wrote code that would often strike\nAn error would occur\nAnd he\'d shout "Oh no!"\nBut Python\'s exceptions made it all right.', role='assistant', function_call=None, tool_calls=None)
-```
-
-
-
-#### LLAMAFile Server V1 API Raw HTTP Request Example
+### JSON API Quickstart
+
+When llamafile is started in server mode, in addition to hosting a web
+UI chat server at , an [OpenAI
+API](https://platform.openai.com/docs/api-reference/chat) chat
+completions endpoint is provided too. It's designed to support the most
+common OpenAI API use cases, in a way that runs entirely locally. We've
+also extended it to include llama.cpp specific features (e.g. mirostat)
+that may also be used. For further details on what fields and endpoints
+are available, refer to both the [OpenAI
+documentation](https://platform.openai.com/docs/api-reference/chat/create)
+and the [llamafile server
+README](llama.cpp/server/README.md#api-endpoints).
+
+#### Examples
-Raw HTTP Request Example Command and Result
+Curl API Client Example
 
 ```shell
 curl http://localhost:8080/v1/chat/completions \
@@ -145,6 +115,44 @@ The above when run would return an answer like
+
+Python API Client example
+
+If you've already developed your software using the [`openai` Python
+package](https://pypi.org/project/openai/) (that's published by OpenAI)
+then you should be able to port your app to talk to a local llamafile
+instead, by making a few changes to `base_url` and `api_key`.
+
+This example assumes you've run `pip3 install openai` to install
+OpenAI's client software, which is required by this example. Their
+package is just a simple Python wrapper around the OpenAI's API
+endpoints.
+
+```python
+#!/usr/bin/env python3
+from openai import OpenAI
+client = OpenAI(
+    base_url="http://localhost:8080/v1", # "http://:port"
+    api_key = "sk-no-key-required"
+)
+completion = client.chat.completions.create(
+    model="LLaMA_CPP",
+    messages=[
+        {"role": "system", "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests."},
+        {"role": "user", "content": "Write a limerick about python exceptions"}
+    ]
+)
+print(completion.choices[0].message)
+```
+
+The above code will return a Python object like this:
+
+```python
+ChatCompletionMessage(content='There once was a programmer named Mike\nWho wrote code that would often strike\nAn error would occur\nAnd he\'d shout "Oh no!"\nBut Python\'s exceptions made it all right.', role='assistant', function_call=None, tool_calls=None)
+```
+
+
+
 ## Other example llamafiles