Image Embedding Support #4125

Open
faradaytrs opened this issue Dec 18, 2024 · 3 comments
@faradaytrs

Feature Description

Currently, the Vercel AI SDK only supports text embedding models. We would like to request the addition of support for creating embeddings for images as well.

As image-based machine learning and multi-modal AI applications become more common, the ability to generate embeddings for images would greatly enhance the SDK’s capabilities. This feature would allow developers to leverage image similarity, classification, and other AI-driven tasks directly within the Vercel ecosystem.

Use Cases

  1. Image Similarity: Create vector embeddings from images and compare them for similarity.
  2. Multi-modal Applications: Combine text and image embeddings for more comprehensive AI models (e.g., image captioning, visual question answering).
  3. Image Classification: Use embeddings for classification tasks where images need to be mapped to specific categories.
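To make the image similarity use case concrete, here is a minimal sketch of the comparison step. It assumes an image embedding model has already returned each image's embedding as a plain `number[]`; the SDK has no image embedding call today, so the vectors below are toy values, and `rankBySimilarity` and the candidate shape are hypothetical names used only for illustration.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: rank stored image embeddings against a query embedding,
// most similar first.
function rankBySimilarity(
  query: number[],
  candidates: { id: string; embedding: number[] }[],
): { id: string; score: number }[] {
  return candidates
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score);
}

// Toy vectors standing in for real image embeddings (assumption).
const ranked = rankBySimilarity(
  [1, 0],
  [
    { id: "cat.png", embedding: [0, 1] },
    { id: "dog.png", embedding: [1, 0] },
  ],
);
console.log(ranked.map((r) => r.id));
```

The same ranking would also cover a simple form of the classification use case: compare an image's embedding against one reference embedding per category and pick the top result.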

Additional context

We believe this feature would provide significant value and open up new possibilities for developers using the Vercel AI SDK. Looking forward to seeing this feature in a future release!

@faradaytrs added the enhancement (New feature or request) label on Dec 18, 2024
@lgrammel
Collaborator

Please list providers that support image embeddings.

@lgrammel changed the title from "Request for Image Embedding Support" to "Image Embedding Support" on Dec 18, 2024
@faradaytrs
Author

> Please list providers that support image embeddings.

For example, the Amazon Titan Multimodal Embeddings G1 model can do it. There are other models as well; if you need more, I can do some research.

@lgrammel
Collaborator

We'll probably hold back on this for now until major providers such as OpenAI support it.
