
huggingface: Reduce disk footprint by 95% by making large dependencies optional #31268


Open
wants to merge 3 commits into
base: master
from


Simon-Stone
Contributor

Description:
langchain_huggingface has a very large installation size of around 600 MB (on a Mac with Python 3.11). This is due to its dependency on sentence-transformers, which in turn depends on torch, which is 320 MB all by itself. Similarly, the dependency on transformers pulls in another set of heavy dependencies. With those dependencies removed, the installation of langchain_huggingface only takes up ~26 MB. That is only about 5% of the full installation!

These libraries are not necessary to use langchain_huggingface's API wrapper classes; they are only needed for local inference and embeddings. All import statements for those two libraries already have import guards in place (try/except with a helpful "please install x" message).
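For readers unfamiliar with the pattern, here is a minimal sketch of what such an import guard can look like. The helper name `import_optional` and the exact wording of the message are assumptions made for illustration; they are not copied from the langchain_huggingface source.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, package_name: str, extra: str = "full") -> ModuleType:
    """Import an optional dependency, or raise an ImportError with an install hint.

    Hypothetical helper for illustration only; not part of langchain_huggingface.
    """
    try:
        # The import is deferred until the feature that needs it is actually used.
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Re-raise with a message that tells the user how to fix the problem.
        raise ImportError(
            f"Could not import {module_name}. Please install it with "
            f"`pip install {package_name}` or "
            f'`pip install "langchain_huggingface[{extra}]"`.'
        ) from exc
```

A class that runs models locally would call something like `import_optional("sentence_transformers", "sentence-transformers")` inside its constructor, so users of the API wrapper classes never trigger the import.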

This PR therefore moves those two libraries to an optional dependency group named full. A plain pip install langchain_huggingface will install only the lightweight version, and pip install "langchain_huggingface[full]" will install all dependencies.
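As a rough illustration of what this means at runtime, the sketch below checks which of the two heavy libraries are importable in the current environment without actually importing them. The function name `missing_optional_modules` is hypothetical; only `importlib.util.find_spec` from the standard library is used.

```python
from importlib.util import find_spec

# Top-level import names of the two libraries moved to the optional group.
OPTIONAL_MODULES = ("sentence_transformers", "transformers")


def missing_optional_modules(modules=OPTIONAL_MODULES):
    """Return the names in `modules` that cannot be found in this environment.

    Hypothetical helper for illustration only.
    """
    # find_spec returns None for a top-level module that is not installed,
    # and it does not execute the module's code.
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_optional_modules()
    if missing:
        print(f"Not installed: {', '.join(missing)}")
        print('Install them with: pip install "langchain_huggingface[full]"')
    else:
        print("All optional dependencies are available.")
```

After a lightweight install the script would be expected to report both libraries as missing, unless they are already present in the environment for another reason.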

I know this may break existing code, because sentence-transformers and transformers are now no longer installed by default. Given that users will see helpful error messages when that happens, and the major impact of this small change, I hope that you will still consider this PR.

Dependencies: No new dependencies, but new optional grouping.


vercel bot commented May 17, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview May 17, 2025 10:47pm

dosubot bot added the size:S (This PR changes 10-29 lines, ignoring generated files.) and langchain (Related to the langchain package) labels on May 17, 2025
@Simon-Stone
Contributor Author

@ccurme Is this change something you might consider?
