feat: add Google embedding integration #1304
base: develop
Conversation
Pull Request Overview
This PR adds Google embedding integration to the NeMo Guardrails project by implementing a new GoogleEmbeddingModel provider that uses the langchain-google-genai library.
- Implements GoogleEmbeddingModel class with sync/async embedding capabilities
- Adds comprehensive test suite for Google embeddings functionality
- Updates documentation to include Google as a supported embedding provider
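For orientation, here is a minimal sketch of the kind of provider the overview describes, built on langchain-google-genai. The class name and engine name come from the PR description; the constructor signature and the encode/encode_async method names are assumptions about the nemoguardrails provider interface, not the exact code under review.

```python
from typing import List

from langchain_google_genai import GoogleGenerativeAIEmbeddings


class GoogleEmbeddingModel:
    """Illustrative Google embedding provider backed by langchain-google-genai."""

    engine_name = "google"  # assumed registry name

    def __init__(self, embedding_model: str, **kwargs):
        self.model = embedding_model
        # GoogleGenerativeAIEmbeddings reads GOOGLE_API_KEY from the environment
        # unless an explicit google_api_key argument is provided.
        self.client = GoogleGenerativeAIEmbeddings(model=embedding_model, **kwargs)

    def encode(self, documents: List[str]) -> List[List[float]]:
        # Synchronous batch embedding of a list of documents.
        return self.client.embed_documents(documents)

    async def encode_async(self, documents: List[str]) -> List[List[float]]:
        # Async variant via LangChain's aembed_documents.
        return await self.client.aembed_documents(documents)
```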
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| nemoguardrails/embeddings/providers/google.py | Main implementation of the GoogleEmbeddingModel class with encoding methods |
| nemoguardrails/embeddings/providers/__init__.py | Registers the Google embedding provider in the system |
| tests/test_embeddings_google.py | Comprehensive test suite including sync/async tests and live integration tests |
| tests/test_configs/with_google_embeddings/config.yml | Test configuration for Google embeddings |
| tests/test_configs/with_google_embeddings/config.co | Test flow configuration |
| docs/user-guides/configuration-guide.md | Documentation update adding Google to the supported providers table |
Comments suppressed due to low confidence (1)
tests/test_embeddings_google.py:70

This function has the same name as the async function on line 52 but a different signature. Consider renaming it to 'test_sync_live_query' to differentiate it from the async version.

```python
def test_live_query(app):
```
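The rename matters beyond readability: a second def with the same name at module scope silently overwrites the first, so pytest would only collect one of the two tests. A minimal sketch of the suggested layout, with placeholder bodies rather than the PR's actual test code:

```python
import pytest


@pytest.mark.asyncio
async def test_live_query(app):
    # Async live-query test (placeholder body).
    ...


def test_sync_live_query(app):
    # Renamed sync variant; the distinct name keeps both tests collected.
    ...
```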
```python
    self.embedding_size = self.embedding_size_dict[self.model]
else:
    # Perform a first encoding to get the embedding size
    self.embedding_size = len(self.encode(["test"])[0])
```
Making an actual API call during initialization to determine embedding size could cause unnecessary latency and API costs. Consider using a placeholder or lazy initialization approach.
Suggested change:

```python
    self._embedding_size = self.embedding_size_dict[self.model]
else:
    # Defer embedding size determination until it is accessed
    self._embedding_size = None
```
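One way to realize the deferred approach is to expose the size as a property that performs a single probe request on first access and caches the result. A rough sketch of that pattern, assuming encode is available on the same class; this is not the code in the PR:

```python
from typing import List, Optional


class LazyEmbeddingSizeMixin:
    """Illustrative mixin: resolve the embedding size only when first needed."""

    _embedding_size: Optional[int] = None

    def encode(self, documents: List[str]) -> List[List[float]]:
        raise NotImplementedError  # provided by the concrete provider

    @property
    def embedding_size(self) -> int:
        if self._embedding_size is None:
            # Probe the API once, on first access, instead of during __init__.
            self._embedding_size = len(self.encode(["test"])[0])
        return self._embedding_size
```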
```python
self.embedding_size_dict = {
    "gemini-embedding-001": 3072,
    "text-embedding-005": 768,
    "text-multilingual-embedding-002": 768,
}

if self.model in self.embedding_size_dict:
    self.embedding_size = self.embedding_size_dict[self.model]
```
[nitpick] The embedding size dictionary is hardcoded in the constructor. Consider moving this to a class-level constant or configuration file to improve maintainability when new models are added.
Suggested change:

```python
# Mapping of embedding models to their respective sizes.
embedding_size_dict = {
    "gemini-embedding-001": 3072,
    "text-embedding-005": 768,
    "text-multilingual-embedding-002": 768,
}

if self.model in self.__class__.embedding_size_dict:
    self.embedding_size = self.__class__.embedding_size_dict[self.model]
```
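Taken together with the lazy-initialization comment above, the constructor could shrink to something like the following. This combines the two review suggestions for illustration only; names such as EMBEDDING_SIZES are mine, not the PR's:

```python
from typing import Dict, Optional


class GoogleEmbeddingModel:
    # Class-level mapping of known models to their embedding sizes;
    # supporting a new model only requires adding an entry here.
    EMBEDDING_SIZES: Dict[str, int] = {
        "gemini-embedding-001": 3072,
        "text-embedding-005": 768,
        "text-multilingual-embedding-002": 768,
    }

    def __init__(self, embedding_model: str):
        self.model = embedding_model
        # Known models resolve immediately; unknown models stay None until probed lazily.
        self._embedding_size: Optional[int] = self.EMBEDDING_SIZES.get(embedding_model)
```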
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```diff
@@             Coverage Diff             @@
##           develop    #1304      +/-   ##
===========================================
- Coverage    70.45%   70.40%   -0.05%
===========================================
  Files          161      162       +1
  Lines        16214    16235      +21
===========================================
+ Hits         11423    11431       +8
- Misses        4791     4804      +13
```

Flags with carried forward coverage won't be shown.
Description
Add Google Embedding provider
@Pouyanpi, I used langchain-google-genai. However, if you're worried about the langchain dependency, I can change the implementation to directly use `from google import genai` instead (a rough sketch of that alternative follows).
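For reference, a direct-SDK variant of the provider might look roughly like this. It is a hypothetical sketch of the alternative mentioned above, not code from this PR; the embed_content call and response shape reflect my reading of the google-genai client and should be verified against its current docs.

```python
from typing import List

from google import genai


class DirectGoogleEmbeddingModel:
    """Hypothetical provider variant that calls the google-genai SDK directly."""

    def __init__(self, embedding_model: str = "gemini-embedding-001"):
        self.model = embedding_model
        # The client typically picks up the API key from the environment
        # (e.g. GOOGLE_API_KEY) when no explicit api_key is supplied.
        self.client = genai.Client()

    def encode(self, documents: List[str]) -> List[List[float]]:
        # Batch-embed the documents and return plain lists of floats.
        response = self.client.models.embed_content(
            model=self.model,
            contents=documents,
        )
        return [embedding.values for embedding in response.embeddings]
```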
Related Issue(s)
Fixes #1292
Checklist