Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use other models compatible with the OpenAI library #1171

Open
CNFeffery opened this issue Dec 30, 2024 · 7 comments
Open

Use other models compatible with the OpenAI library #1171

CNFeffery opened this issue Dec 30, 2024 · 7 comments
Assignees
Labels
enhancement New feature or request

Comments

@CNFeffery
Copy link

Such as DeepSeek ( https://api-docs.deepseek.com/ ), :

image

By supporting the configuration of parameters such as api_key, base_url, and model, various large model services compatible with the OpenAI library call methods can be used.

@CNFeffery CNFeffery added the enhancement New feature or request label Dec 30, 2024
@srdas
Copy link
Collaborator

srdas commented Dec 30, 2024

Definitely worth looking into. For the moment note that several deepseek models are also available through Ollama: https://ollama.com/library/deepseek-llm.
image
You may use these models with Jupyter AI:
image

@CNFeffery
Copy link
Author

CNFeffery commented Dec 31, 2024

Definitely worth looking into. For the moment note that several deepseek models are also available through Ollama: https://ollama.com/library/deepseek-llm. image You may use these models with Jupyter AI: image

Although DeepSeek V3 has released an open-source model, the model parameters are quite large, making self-deployment based on Ollama impractical. Typically, people use DeepSeek's official online API service.

@zgpnuaa
Copy link

zgpnuaa commented Jan 2, 2025

Deepseek is comparable in capability to Claude and ChatGPT, but its price is much cheaper, offering great value for money.

@dlqqq
Copy link
Member

dlqqq commented Jan 3, 2025

Thank you all for the additional context! I think there is a way to bring Deepseek support into Jupyter AI, even though it lacks a LangChain partner package:

  • Subclass ChatOpenAI like so:
class DeepseekProvider(BaseProvider, ChatOpenAI):
  def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self.openai_api_base = "https://api.deepseek.com"
  • Set auth_strategy = EnvAuthStrategy(name="DEEPSEEK_API_KEY") as a class attribute to give this provider its own unique API key to save to the config.

@srdas I've outlined some steps to try above. I'll assign this issue to you, so feel free to work on this issue if you have spare time.

@srdas
Copy link
Collaborator

srdas commented Jan 3, 2025

I was working on adding the class DeepseekProvider as suggested above, and realized that it is already provided for via the OpenRouter provider, which was written to accommodate cases where langchain_openai is used with class ChatOpenAI. See: this code

The approach uses deepseek-chat in Jupyter AI using the OpenRouter provider in Jupyter AI as follows:
image
Simply enter the Deepseek API key in the OPENROUTER_API_KEY.

I tested to see that this works:
image
This enables the use of Deepseek models in Jupyter AI.

We can surely add a new DeepseekProvider class which would mimic the code in openrouter.py but this would be redundant? Though the UX would be better. Wondering what folks think?

@srdas
Copy link
Collaborator

srdas commented Jan 3, 2025

Another approach would be to enhance langchain_openai explicitly with a new module deepseek.py here along the lines of azure.py.

@CNFeffery
Copy link
Author

@srdas is it possible to add a general API Provider OpenAl Compatible like cline, to achieve compatibility with all large model interfaces that conform to the OpenAI calling method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants