
Add Offline DeepSeek Model #82

Open
Barqawiz opened this issue Jan 25, 2025 · 15 comments

Barqawiz commented Jan 25, 2025

Implement an offline DeepSeek model loader for inference that:

  • Loads DeepSeek models directly from the official host (HuggingFace).
  • Supports both full and quantized versions (if available).
  • Implements memory-optimization techniques similar to llama.cpp or Ollama.

Expected Deliverables:

  • Develop a wrapper for the DeepSeek model.
  • Create a folder under model/deepseek for helpers and extended functions used by the wrapper.
  • Write a test case to load the model, to be placed in the test directory.
  • Mention which pip packages the code uses (e.g., torch). Don't use higher-level modules like transformers.

Notes:

  • For reference, check the current (non-optimized) Python code from DeepSeek repo:
    https://github.com/deepseek-ai/DeepSeek-V3/tree/main/inference
  • llama.cpp is a reference for optimized model loading techniques.
  • Intelli should provide an easy way to load the model.
  • Don't use high-level pip modules like transformers (intellinode provides lightweight integrations for AI agents).
  • You can use torch, tensorflow, keras, safetensors, triton, etc. (these modules are recommended); a minimal loading sketch follows below.
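
To make the expected shape of the deliverable concrete, here is a minimal, hypothetical sketch (not existing repo code): it loads sharded safetensors weights with torch and the standalone safetensors package only, without transformers. The `DeepSeekWrapper` name, the `model_path` layout, and the shard-by-shard loading strategy are illustrative assumptions.

```python
# Hypothetical sketch: load DeepSeek safetensors shards with torch only.
# Assumes model_path contains the *.safetensors shards downloaded from HuggingFace.
import glob
import os

import torch
from safetensors.torch import load_file


class DeepSeekWrapper:
    """Illustrative wrapper; the real model/deepseek helpers would build the
    network modules and map these tensors onto them."""

    def __init__(self, model_path: str, device: str = "cpu", dtype=torch.bfloat16):
        self.model_path = model_path
        self.device = device
        self.dtype = dtype
        self.state_dict = {}

    def load(self):
        shards = sorted(glob.glob(os.path.join(self.model_path, "*.safetensors")))
        if not shards:
            raise FileNotFoundError(f"no safetensors shards found in {self.model_path}")
        for shard in shards:
            # load_file returns a dict of tensors; loading shard by shard keeps
            # peak memory closer to one shard instead of the whole checkpoint.
            tensors = load_file(shard, device=self.device)
            for name, tensor in tensors.items():
                self.state_dict[name] = tensor.to(self.dtype)
        return self.state_dict
```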
Barqawiz changed the title from "Add DeepSeek Model" to "Add Offline DeepSeek Model" on Feb 1, 2025
intelligentnode added the size-large and enhancement (New feature or request) labels on Feb 4, 2025

intelligentnode commented Feb 4, 2025

/bounty $800


algora-pbc bot commented Feb 4, 2025

💎 $800 bounty • IntelliNode

Steps to solve:

  1. Start working: Comment /attempt #82 with your implementation plan
  2. Submit work: Create a pull request including /claim #82 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to intelligentnode/Intelli!


| Attempt | Started (GMT+0) | Solution |
| --- | --- | --- |
| 🟢 @onyedikachi-david | Feb 4, 2025, 5:57:41 PM | WIP |
| 🟢 @Enity300 | Feb 19, 2025, 11:36:41 AM | WIP |
| 🟢 @RaghavArora14 | Feb 21, 2025, 9:41:03 PM | #94 |

@onyedikachi-david

onyedikachi-david commented Feb 4, 2025

/attempt #82


@oliverqx

oliverqx commented Feb 4, 2025

@Barqawiz Have you considered leveraging Ollama for loading and running the DeepSeek models instead?

By building a lightweight wrapper that integrates with Ollama, we could create an API for AI agents to interact with local models, supporting not only DeepSeek but any other model compatible with Ollama. This would simplify development, reduce overhead, and ensure we're tapping into the optimizations Ollama offers for efficient model inference and memory management. I think a custom model loader would be quite difficult to maintain later on.

This approach would also make it easier to scale and support a broader range of models in the future.


intelligentnode commented Feb 4, 2025

Good question @oliverqx. Let me explain the rationale behind using an offline model and why I'm avoiding Ollama or similar high-level modules.

Intelli can build a graph of collaborating agents using the flow concept: Sequence Flow Documentation.

I've managed to integrate multiple offline models into the flow using the KerasWrapper, which provides a convenient way to load several offline models, such as Llama, Mistral, and others: KerasWrapper class.

However, Keras does not currently support DeepSeek, and adding that functionality will likely take the Keras team some time. As a result, my current focus is on DeepSeek.

I avoid using Ollama because I want to minimize external dependencies. I looked into Ollama as a high-level library, and integrating it would introduce additional unnecessary modules; the same applies to HF Transformers.

You can take inspiration from how Ollama uses modules like Torch, optimization libraries, or Safetensors from HuggingFace; these lower-level modules are accepted. I'm happy to credit their work if you borrow their approaches, but I prefer not to have Ollama as a dependency for running the flow.

Feel free to use o1, o3, or DeepSeek to write any part of the code.

@oliverqx

oliverqx commented Feb 4, 2025

@intelligentnode are there any specific DeepSeek variants you'd prefer?


intelligentnode commented Feb 4, 2025

You can use the official ones from R1 or any quantized variant:

DeepSeek-R1 Models

| Model | #Total Params | #Activated Params | Context Length | Download |
| --- | --- | --- | --- | --- |
| DeepSeek-R1 | 671B | 37B | 128K | 🤗 HuggingFace |

DeepSeek-R1-Distill Models

| Model | Base Model | Download |
| --- | --- | --- |
| DeepSeek-R1-Distill-Qwen-1.5B | Qwen2.5-Math-1.5B | 🤗 HuggingFace |
| DeepSeek-R1-Distill-Qwen-7B | Qwen2.5-Math-7B | 🤗 HuggingFace |
| DeepSeek-R1-Distill-Llama-8B | Llama-3.1-8B | 🤗 HuggingFace |
| DeepSeek-R1-Distill-Qwen-14B | Qwen2.5-14B | 🤗 HuggingFace |
| DeepSeek-R1-Distill-Qwen-32B | Qwen2.5-32B | 🤗 HuggingFace |
| DeepSeek-R1-Distill-Llama-70B | Llama-3.3-70B-Instruct | 🤗 HuggingFace |

In general, it is going to be expensive to run DeepSeek-R1, but you can test the code on the 7B or 8B models for the attempt to be accepted. Also, if you know of a hosted quantized version of R1, you can test on that.
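
If it helps, one lightweight way to fetch the recommended 7B distill weights for testing is huggingface_hub's snapshot_download. Whether that extra dependency is acceptable here is an assumption on my part, and the same files can equally be fetched over plain HTTPS from the model page:

```python
# Hypothetical download step for the smaller distill model recommended above.
# huggingface_hub is assumed acceptable as a lightweight dependency.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    allow_patterns=["*.safetensors", "*.json"],  # weights plus config/tokenizer files
)
print("model files downloaded to:", local_dir)
```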

@oliverqx

oliverqx commented Feb 4, 2025

@intelligentnode for tokenization, is it alright to use AutoTokenizer from transformers?

I know:

Don't use high-level pip modules like transformers

but I was wondering whether that also applies to tokenization.

@intelligentnode

If it requires installing Transformers to use it, then no.
If it is published as an independent module with a lightweight installation via pip, then yes.
You can implement a lightweight version of the tokenizer if an independent one is not available.
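
For example, the standalone tokenizers package installs via pip without pulling in transformers and can read the tokenizer.json that ships with the DeepSeek checkpoints; the local file path below is only an assumption about where the checkpoint was downloaded:

```python
# Sketch: tokenize with the standalone `tokenizers` package (no transformers needed).
# Assumes tokenizer.json was downloaded alongside the model weights.
from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("DeepSeek-R1-Distill-Qwen-7B/tokenizer.json")
encoding = tokenizer.encode("Explain mixture-of-experts in one sentence.")
print(encoding.ids)                     # token ids to feed the model
print(tokenizer.decode(encoding.ids))   # round-trip back to text
```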

@Enity300

Enity300 commented Feb 19, 2025

/attempt #82

@oliverqx

@Enity300 would you like to work on this together? It's quite the task, I think.

@intelligentnode

@oliverqx It is good that you mentioned this. With your collaboration, I can assist you with the relevant chunks of code that require stitching and testing.

Send me your email using the form below, and I will organize a call:
https://www.intellinode.ai/contact

Mention that you are from GitHub.

@oliverqx

oliverqx commented Feb 19, 2025

@intelligentnode

So far I've studied the DeepSeek model repo, and this week I've been studying llama.cpp; sometime this weekend I think I can open a PR.

Until then, the work is very theoretical. I will definitely hit you up once I get a solid grasp on what optimization means.

I'm not an AI dev, so this is a lot of new info, which is why it's taking so long. I'm confident I can come to a solution sometime mid-March.

@RaghavArora14

RaghavArora14 commented Feb 21, 2025

/attempt 82


algora-pbc bot commented Feb 21, 2025

💡 @RaghavArora14 submitted a pull request that claims the bounty. You can visit your bounty board to reward.
