yuvrajangadsingh/vemb

vemb

httpie for embeddings. Embed text, images, audio, video, and PDFs from the command line.

pipx install vemb
export GEMINI_API_KEY=your_key
vemb text "hello world"

Powered by Gemini Embedding 2, Google's first natively multimodal embedding model. One model and one vector space for text, images, audio, video, and PDFs.

Install

pipx install vemb
# or
pip install vemb

Get a free API key at https://aistudio.google.com/apikey

export GEMINI_API_KEY=your_key

Commands

vemb text "hello world"                    # embed text
vemb embed photo.jpg                       # embed any file (auto-detects type)
vemb embed *.jpg --jsonl                   # batch embed, one JSON object per line
vemb image photo.jpg                       # embed image (PNG, JPEG)
vemb audio clip.mp3                        # embed audio (MP3, WAV)
vemb video clip.mp4                        # embed video (MP4, MOV)
vemb pdf doc.pdf                           # embed PDF
vemb similar photo1.jpg photo2.jpg         # cosine similarity between two files
vemb search ./photos "sunset at beach"     # search a directory
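
The similar command reports the cosine similarity between two embeddings. As a rough illustration of that metric (a sketch, not vemb's actual implementation), the calculation might look like this in Python. The helper name cosine_similarity and the vectors are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up 3-dimensional vectors; real embeddings have hundreds of dimensions.
same_direction = cosine_similarity([1.0, 0.0, 0.0], [2.0, 0.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
print(same_direction, orthogonal)
```

Scores close to 1 mean the two inputs point in nearly the same direction in the embedding space; scores near 0 mean they are unrelated.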

Pipe from stdin:

echo "hello world" | vemb text -
cat document.txt | vemb text -

Output

Default output is JSON:

{
  "model": "gemini-embedding-2-preview",
  "dimensions": 768,
  "values": [0.012, -0.034, ...]
}
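
To consume this output from a script, a minimal Python sketch could parse the document and sanity-check it. The sample payload below is shortened and made up, parse_embedding is a hypothetical helper, and the check assumes that dimensions always equals the length of values.

```python
import json

# Sample payload in the shape shown above; values are shortened and made up.
raw = (
    '{"model": "gemini-embedding-2-preview", '
    '"dimensions": 3, "values": [0.012, -0.034, 0.056]}'
)


def parse_embedding(text):
    """Parse one JSON document and check that it is internally consistent."""
    doc = json.loads(text)
    values = doc["values"]
    # Assumption: "dimensions" reports the length of "values".
    if len(values) != doc["dimensions"]:
        raise ValueError("'dimensions' does not match the length of 'values'")
    return doc["model"], values


model, vector = parse_embedding(raw)
print(model, len(vector))
```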

Options:

vemb text "hello" --compact                # just the vector array
vemb text "hello" --numpy                  # numpy format
vemb text "hello" --dim 768                # set dimensions (128-3072)
vemb text "hello" --task RETRIEVAL_QUERY   # set task type

Batch mode outputs JSONL (one embedding per line):

vemb embed *.jpg --jsonl > embeddings.jsonl
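
Reading that file back is a line-by-line JSON parse. The sketch below is one way to do it in Python; it assumes each line carries a values field like the default JSON output, which the section above does not spell out. The in-memory sample stands in for open("embeddings.jsonl"), and load_jsonl is a hypothetical helper.

```python
import io
import json


def load_jsonl(stream):
    """Yield one parsed record per non-empty line of a JSONL stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON") from exc


# Made-up records; the "values" field name is an assumption.
sample = io.StringIO(
    '{"dimensions": 2, "values": [0.1, 0.2]}\n'
    '{"dimensions": 2, "values": [0.3, 0.4]}\n'
)
vectors = [record["values"] for record in load_jsonl(sample)]
print(len(vectors))
```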

Search

vemb search indexes a directory and returns the files most similar to your query:

vemb search ./photos "sunset at beach" --top 5
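
Conceptually, a search like this scores every file embedding against the query embedding and keeps the best few. The Python sketch below illustrates that idea under the assumption that ranking uses cosine similarity; it is not vemb's actual code. The file names, vectors, and the top_matches helper are all made up.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(query_vector, file_vectors, top=5):
    """Return the `top` (path, score) pairs with the highest similarity."""
    scored = ((path, cosine(query_vector, vec)) for path, vec in file_vectors.items())
    return heapq.nlargest(top, scored, key=lambda pair: pair[1])


# Made-up 2-dimensional vectors standing in for real file embeddings.
index = {
    "beach.jpg": [0.9, 0.1],
    "forest.jpg": [0.1, 0.9],
    "sunset.jpg": [0.8, 0.3],
}
results = top_matches([1.0, 0.0], index, top=2)
for path, score in results:
    print(f"{score:.3f}  {path}")
```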

Embeddings are cached in .vemb/cache.json inside the searched directory. Unchanged files won't be re-embedded on subsequent searches.
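
One common way to skip unchanged files is to store a content hash next to each cached vector and re-embed only when the hash differs. The Python sketch below shows that pattern. The real layout of .vemb/cache.json and the change-detection rule vemb uses are not documented here, so the digest and values fields, the helper names, and the fake_embed stand-in are all assumptions.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_digest(path):
    """Hash the file contents so any edit changes the cache key."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def embed_with_cache(path, cache, embed):
    """Return the vector for `path`, calling `embed` only if the file changed."""
    digest = file_digest(path)
    entry = cache.get(str(path))
    if entry is not None and entry["digest"] == digest:
        return entry["values"]  # unchanged file: reuse the stored vector
    values = embed(path)
    cache[str(path)] = {"digest": digest, "values": values}
    return values


calls = []


def fake_embed(path):
    """Hypothetical stand-in for a real embedding request."""
    calls.append(str(path))
    return [0.0, 1.0]


with tempfile.TemporaryDirectory() as folder:
    photo = Path(folder) / "photo.jpg"
    photo.write_bytes(b"first version")
    cache = {}
    embed_with_cache(photo, cache, fake_embed)  # miss: embeds
    embed_with_cache(photo, cache, fake_embed)  # hit: no new call
    photo.write_bytes(b"edited version")
    embed_with_cache(photo, cache, fake_embed)  # content changed: embeds again
    cache_text = json.dumps(cache)  # plain dicts and lists serialize to JSON

print(len(calls))
```

Hashing contents is slower than comparing modification times but is not fooled by files that are touched without being edited.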

Supported formats

Type    Formats
Text    any string, stdin
Image   PNG, JPEG
Audio   MP3, WAV (up to 80s)
Video   MP4, MOV (up to 128s)
PDF     PDF (up to 6 pages)
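
A script that prepares files for embedding can use the table above to route each file by extension. The Python sketch below is one hypothetical way to do that and does not describe how vemb itself auto-detects types. The extension spellings (.jpg and .jpeg for JPEG) and the detect_type helper are assumptions.

```python
from pathlib import Path

# Extension-to-type mapping based on the supported formats listed above.
# The exact extension spellings are assumptions.
TYPE_BY_EXTENSION = {
    ".png": "image",
    ".jpg": "image",
    ".jpeg": "image",
    ".mp3": "audio",
    ".wav": "audio",
    ".mp4": "video",
    ".mov": "video",
    ".pdf": "pdf",
}


def detect_type(filename):
    """Map a file name to a media type, ignoring extension case."""
    suffix = Path(filename).suffix.lower()
    if suffix not in TYPE_BY_EXTENSION:
        raise ValueError(f"unsupported file type: {filename}")
    return TYPE_BY_EXTENSION[suffix]


for name in ["photo.JPG", "clip.mp3", "clip.mov", "doc.pdf"]:
    print(name, detect_type(name))
```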

License

MIT
