feat(server): add support for running server in Node #365
Conversation
Walkthrough

The changes update the server's build and runtime environment by modifying dependencies, compiler options, and scripts. TypeScript configuration now outputs to a dedicated folder while certain options are disabled. Various modules have been refactored to adopt ES module syntax and environment-aware logic. New utility functions and an environment-detection helper (`getEnvironment`) were also introduced.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client as Client Process
    participant EnvChecker as getEnvironment
    participant Worker as Google Worker
    Client->>Worker: Send task/message
    Worker->>EnvChecker: Check runtime environment
    alt Running in Node
        Worker->>parentPort: Handle message via parentPort.on("message")
    else Browser
        Worker->>self: Handle message via self.onmessage
    end
    Worker-->>Client: Return response
```
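Roughly, the environment-aware worker setup this diagram describes could look like the sketch below. Only the `getEnvironment` name and the Node/browser branch come from the PR's description; the rest of the wiring (handler names, response shape) is illustrative.

```js
// Minimal sketch of an environment-aware worker entry point (ES module).
// Assumes the worker runs either under Node's worker_threads or as a web worker.
function getEnvironment() {
  return typeof process !== "undefined" && process.versions?.node ? "node" : "browser";
}

async function handleMessage(data) {
  // ...perform the actual task and build a response object
  return { ok: true, received: data };
}

if (getEnvironment() === "node") {
  // Node: messages arrive on worker_threads' parentPort.
  const { parentPort } = await import("node:worker_threads");
  parentPort?.on("message", async (data) => {
    parentPort.postMessage(await handleMessage(data));
  });
} else {
  // Browser / web worker: messages arrive via the global onmessage handler.
  self.onmessage = async (event) => {
    self.postMessage(await handleMessage(event.data));
  };
}
```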
```mermaid
sequenceDiagram
    participant Browser as Client Browser
    participant EnvChecker as Environment Checker
    participant Server as WebSocket Server
    Browser->>Server: Request WebSocket connection (/ws)
    Server->>EnvChecker: Determine runtime environment
    alt Node Environment
        Server->>Server: Initialize WebSocketServer (using ws package)
    else Bun Environment
        Server->>Server: Use Bun static file serving & WebSocket implementation
    end
    Server-->>Browser: Establish WebSocket connection
```
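For the second diagram, a rough sketch of the runtime split is shown below, assuming the app is a Hono instance. `@hono/node-server` (added in this PR) and the `ws` package are only loaded on the Node path, while `Bun.serve` handles HTTP and WebSocket upgrades on the Bun path; everything beyond those package names is illustrative.

```js
import { Hono } from "hono";

const app = new Hono();
app.get("/", (c) => c.text("ok"));

// Assumed environment check mirroring the diagram.
const isNode = typeof Bun === "undefined";

if (isNode) {
  // Node path: serve the Hono app with @hono/node-server and attach a ws WebSocketServer.
  const { serve } = await import("@hono/node-server");
  const { WebSocketServer } = await import("ws");

  const server = serve({ fetch: app.fetch, port: 3000 });
  const wss = new WebSocketServer({ server, path: "/ws" });
  wss.on("connection", (socket) => {
    socket.on("message", (msg) => socket.send(`echo: ${msg}`));
  });
} else {
  // Bun path: Bun.serve handles static serving and WebSocket upgrades natively.
  Bun.serve({
    port: 3000,
    fetch(req, server) {
      if (new URL(req.url).pathname === "/ws" && server.upgrade(req)) return;
      return app.fetch(req);
    },
    websocket: {
      message(ws, msg) {
        ws.send(`echo: ${msg}`);
      },
    },
  });
}
```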
Hello @kalpadhwaryu, I'm Gemini Code Assist[^1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
This pull request introduces the ability to run server-side JavaScript files using Node.js. It includes the addition of necessary dependencies, updates to package configurations, and modifications to context construction and evaluation processes to support this functionality. The changes primarily focus on enabling server-side execution of AI-related tasks.
Highlights
- Dependency Injection: Added `@hono/node-server` as a dependency to enable running server JS files in Node.
- Context Construction: Modified context construction functions in `server/distBE/ai/context.js` to enhance information provided to the AI, including details about the user, workspace, and current time.
- Evaluation Process: Updated the evaluation process in `server/distBE/ai/eval.js` to incorporate permission-aware evaluations and various RAG (Retrieval-Augmented Generation) strategies.
- Model Mapping: Added new AI models and their corresponding cost details in `server/distBE/ai/mappers.js`, along with mappings to their respective providers.
- Prompt Engineering: Introduced and modified prompts in `server/distBE/ai/prompts.js` to optimize AI responses, including query rewriting and temporal event classification.
Changelog

- package-lock.json
  - Added `@hono/node-server` dependency and its related dependencies.
- package.json
  - Added `@hono/node-server` dependency.
- server/distBE/ai/contants.js
  - Added an empty export object.
- server/distBE/ai/context.js
  - Modified context construction functions to include more relevant information.
  - Added colored context construction for better readability.
  - Implemented context cleaning functions to remove irrelevant characters and highlights.
- server/distBE/ai/eval.js
  - Implemented permission-aware evaluations.
  - Added various RAG strategies, including basic, iterative, and time-expansion RAG.
  - Integrated query rewriting and temporal event classification.
  - Simplified search results for better context retrieval.
- server/distBE/ai/mappers.js
  - Added new AI models and their cost details.
  - Implemented mappings between models and providers.
- server/distBE/ai/prompts.js
  - Introduced and modified prompts for optimized AI responses.
  - Added prompts for query rewriting, temporal event classification, and end-to-end integration.
- server/distBE/ai/provider/base.js
  - Added base provider class with model parameter handling.
- server/distBE/ai/provider/bedrock.js
  - Implemented Bedrock provider with streaming and non-streaming conversation capabilities.
- server/distBE/ai/provider/fireworks.js
  - Implemented Fireworks provider with streaming and non-streaming conversation capabilities.
- server/distBE/ai/provider/fireworksClient.js
  - Implemented Fireworks client for making API requests.
- server/distBE/ai/provider/gemini.js
  - Implemented Gemini provider with streaming and non-streaming conversation capabilities.
- server/distBE/ai/provider/index.js
  - Consolidated provider initialization and access logic.
  - Implemented various AI tasks, including question answering, query analysis, and query rewriting.
- server/distBE/ai/provider/ollama.js
  - Implemented Ollama provider with streaming and non-streaming conversation capabilities.
- server/distBE/ai/provider/openai.js
  - Implemented OpenAI provider with streaming and non-streaming conversation capabilities.
- server/distBE/ai/provider/together.js
  - Implemented Together provider with streaming and non-streaming conversation capabilities.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either `/gemini <command>` or `@gemini-code-assist <command>`. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a `.gemini/` folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
In nodes where servers reside,
JavaScript's power cannot hide.
AI's tasks ignite,
With code taking flight,
A digital, clever guide.
Footnotes

[^1]: Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.
Code Review
The pull request adds the ability to run server-side JavaScript files using Node.js. This includes adding necessary dependencies and creating new files for AI-related functionalities. The changes seem well-structured, but there are a few areas that could benefit from closer examination.
Summary of Findings
- Missing email and workspace ID: The `myEmail` and `workspaceId` variables in `server/distBE/ai/eval.js` are initialized as empty strings, which will cause issues if the code relies on these values. These should be populated with actual values or retrieved from a configuration.
- Incomplete error handling: The `converseStream` function in `server/distBE/ai/provider/bedrock.js` has a try-catch block, but it only logs the error and rethrows it. Consider adding more robust error handling, such as providing a default response or gracefully terminating the stream.
- Potential security risk: The `apiKey` variable in `server/distBE/ai/provider/fireworksClient.js` is hardcoded. Storing API keys directly in the code is a security risk. These should be stored securely, such as in environment variables or a dedicated secrets management system.
- Missing cost calculation: The `converse` and `converseStream` functions in `server/distBE/ai/provider/fireworks.js` have a hardcoded cost of 0. The cost calculation logic should be implemented to accurately reflect the usage of the Fireworks AI service.
Merge Readiness
The pull request has several issues that need to be addressed before merging. The missing email and workspace ID, incomplete error handling, potential security risk, and missing cost calculation are all significant concerns that should be resolved to ensure the code's correctness, security, and maintainability. I am unable to approve this pull request, and recommend that it not be merged until these issues are addressed.
```js
const Logger = getLogger(Subsystem.AI);
export class FireworksProvider extends BaseProvider {
    constructor(client) {
        super(client, AIProviders.Fireworks);
```
The `apiKey` variable is hardcoded. Storing API keys directly in the code is a security risk. These should be stored securely, such as in environment variables or a dedicated secrets management system.

Suggested change:

```js
super(client, AIProviders.Fireworks);
apiKey = process.env.FIREWORKS_API_KEY || "YOUR_API_KEY"; // Retrieve from environment variable
```
server/distBE/ai/eval.js (Outdated)

```js
const myEmail = "";
// workspace external Id
const workspaceId = "";
```
These variables are initialized as empty strings. If the code relies on these values, it will cause issues. Consider populating these with actual values or retrieving them from a configuration.
Suggested change:

```js
const myEmail = process.env.MY_EMAIL || "[email protected]";
const workspaceId = process.env.WORKSPACE_ID || "defaultWorkspaceId";
```
```js
        max_tokens: modelParams.maxTokens,
        stream: false,
    });
    const cost = 0; // Explicitly setting 0 as cost
```
```js
    })) {
        yield {
            text: chunk,
            cost: 0,
```
server/distBE/ai/provider/bedrock.js (Outdated)

```js
catch (error) {
    Logger.error(error, "Error in converseStream of Bedrock");
    throw error;
```
The `converseStream` function has a try-catch block, but it only logs the error and rethrows it. Consider adding more robust error handling, such as providing a default response or gracefully terminating the stream.

```js
catch (error) {
    Logger.error(error, "Error in converseStream of Bedrock");
    yield { text: "An error occurred while processing the stream.", metadata: "error" }; // Provide a default response
    // throw error; // Rethrow the error to stop further processing
}
```
Actionable comments posted: 54
🧹 Nitpick comments (154)
package.json (1)

`1-5`: Minimal package.json configuration.
This package.json file only includes a single dependency on `@hono/node-server`. Consider enhancing it with additional standard fields like `name`, `version`, `description`, `scripts`, and `engines` to better document the project and provide convenience commands.

```diff
 {
   "dependencies": {
     "@hono/node-server": "^1.14.0"
+  },
+  "name": "your-project-name",
+  "version": "0.1.0",
+  "description": "Description of your project",
+  "scripts": {
+    "start": "node server/distBE/main.js",
+    "dev": "node --watch server/distBE/main.js"
   }
 }
```

server/distBE/utils/index.js (2)
`1-9`: Add documentation and make date formatting configurable.
The `getDateForAI` function uses a hardcoded "en-GB" locale which may not be suitable for all deployments.

```diff
+/**
+ * Returns the current date formatted for AI interactions
+ * @param {string} locale - The locale to use for date formatting, defaults to "en-GB"
+ * @returns {string} Formatted date string
+ */
-export function getDateForAI() {
+export function getDateForAI(locale = "en-GB") {
   const today = new Date();
   const options = {
     day: "2-digit",
     month: "2-digit",
     year: "numeric",
   };
-  return today.toLocaleDateString("en-GB", options);
+  return today.toLocaleDateString(locale, options);
 }
```

`10-14`: Add input validation and documentation to cost calculation.
The `calculateCost` function lacks input validation and documentation.

```diff
+/**
+ * Calculates the cost of AI operations based on input and output tokens
+ * @param {Object} params - The token parameters
+ * @param {number} params.inputTokens - Number of input tokens used
+ * @param {number} params.outputTokens - Number of output tokens used
+ * @param {Object} cost - The cost configuration
+ * @param {number} cost.pricePerThousandInputTokens - Price per 1000 input tokens
+ * @param {number} cost.pricePerThousandOutputTokens - Price per 1000 output tokens
+ * @returns {number} The calculated cost
+ */
-export const calculateCost = ({ inputTokens, outputTokens }, cost) => {
+export const calculateCost = ({ inputTokens = 0, outputTokens = 0 } = {}, cost) => {
+  // Validate inputs
+  if (!cost || typeof cost !== 'object') {
+    throw new Error('Cost configuration is required');
+  }
+  if (!cost.pricePerThousandInputTokens || !cost.pricePerThousandOutputTokens) {
+    throw new Error('Price per thousand tokens is required for both input and output');
+  }
   const inputCost = (inputTokens / 1000) * cost.pricePerThousandInputTokens;
   const outputCost = (outputTokens / 1000) * cost.pricePerThousandOutputTokens;
   return inputCost + outputCost;
 };
```

server/distBE/vespa/replaceDIMS.js (2)
`1-9`: Validate negative or zero dimension.
Currently, only non-numeric dimensions are rejected. Including a check for negative or zero values can prevent invalid replacements.

`22-31`: Consider more robust error handling.
Catching errors individually is fine, but you might prefer a centralized approach or continue processing remaining files after logging the error.

server/distBE/doc.js (3)
`1-13`: Potential duplication with existing utility.
`cleanText` appears similar to the function in `server/distBE/chunks.js`. Consider consolidating into a shared helper to avoid repetitive code.

`32-47`: Handle missing alt text.
If both `title` and `alt` are absent, consider falling back to a default string for accessibility.

`171-205`: Potential formatting edge cases.
Merging lines with spaces might break certain specialized formatting like code blocks. Testing with more complex text structures is recommended.

server/distBE/api/auth.js (1)
`1-5`: Improve imports and add proper error handling.
The file imports an empty object from schema.js, which is unnecessary, and imports a config but only destructures a single property while other files destructure multiple properties from the same config.

```diff
 import config from "../config.js"
 import { db } from "../db/client.js"
 import { getPublicUserAndWorkspaceByEmail } from "../db/user.js"
-import {} from "../db/schema.js"
 const { JwtPayloadKey } = config;
```
server/distBE/ai/provider/base.js (4)
`1-1`: Remove unused AWS SDK import.
The AWS SDK Bedrock client is imported but not used anywhere in the code. Removing unused imports helps reduce bundle size and improves code clarity.

```diff
-import {} from "@aws-sdk/client-bedrock-runtime";
```

`2-2`: Uncomment or remove commented code.
There's a commented import for `AIProviders`. Either uncomment it if it's needed or remove it completely. Commented code adds noise and can confuse future developers.

```diff
-// import { AIProviders } from "../../ai/types";
```

`12-23`: Parameter naming inconsistency in getModelParams method.
The method accepts parameters with snake_case (e.g., `max_new_tokens`) but returns properties in camelCase (e.g., `maxTokens`). This inconsistency could lead to confusion when using the API. Consider standardizing on a single naming convention, preferably camelCase for JavaScript:

```diff
 getModelParams(params) {
   return {
-    maxTokens: params.max_new_tokens || 512,
-    topP: params.top_p || 0.9,
+    maxTokens: params.maxTokens || params.max_new_tokens || 512,
+    topP: params.topP || params.top_p || 0.9,
     temperature: params.temperature || 0.6,
     modelId: params.modelId || defaultFastModel,
     systemPrompt: params.systemPrompt || "You are a helpful assistant.",
     userCtx: params.userCtx,
     stream: params.stream,
     json: params.json || null,
   };
 }
```

`19-19`: Add default value for userCtx parameter.
Unlike other parameters, `userCtx` doesn't have a default value. Consider providing a default (even if it's `null`) to maintain consistency with other parameters.

```diff
-    userCtx: params.userCtx,
+    userCtx: params.userCtx || null,
```

server/distBE/db/client.js (1)
8-10
: Consider adding a connection pool with appropriate timeoutSetting
idle_timeout: 0
disables the connection timeout, which means connections will be kept open indefinitely. This could lead to database connection exhaustion if not properly managed. Consider using a connection pool with appropriate timeout settings.const queryClient = postgres(url, { - idle_timeout: 0, + idle_timeout: 30, // Close idle connections after 30 seconds + max: 10, // Maximum number of connections in the pool });server/distBE/api/proxy.js (2)
5-5
: Improve error message with specific detailsThe error message "Failed to fetch the image" is too generic and assumes the resource is an image, which might not always be the case. Provide more specific error information.
- return c.text("Failed to fetch the image.", 502); + return c.text(`Failed to fetch resource: ${response.status} ${response.statusText}`, 502);
11-11
: Consider validating or restricting content typesThe proxy currently accepts any content type, which could be a security risk. Consider validating or restricting the content types that can be proxied.
+ // Validate content type + const contentType = response.headers.get("Content-Type") || "application/octet-stream"; + const allowedContentTypes = ["image/jpeg", "image/png", "image/gif", "image/svg+xml", "application/pdf"]; + if (!allowedContentTypes.includes(contentType.split(";")[0])) { + return c.text(`Content type not allowed: ${contentType}`, 415); + } - "Content-Type": response.headers.get("Content-Type") || "application/octet-stream", + "Content-Type": contentType,server/distBE/utils/encryption.js (2)
1-14
: Encryption implementation looks solid but consider adding try/catch blocks.The class constructor properly validates the encryption key length (256-bit), which is critical for security. The error message is clear and helpful for debugging.
Consider adding more robust error handling around key validation:
constructor(key) { this.key = key; - if (key && - Buffer.from(key, this.encoding).byteLength !== - this.encryptionKeyByteLength) { - throw new Error("Encryption key must be base64-encoded and 256-bit long."); - } + if (!key) { + throw new Error("Encryption key is required."); + } + try { + if (Buffer.from(key, this.encoding).byteLength !== this.encryptionKeyByteLength) { + throw new Error("Encryption key must be base64-encoded and 256-bit long."); + } + } catch (error) { + throw new Error(`Invalid encryption key format: ${error.message}`); + } }
18-29
: Encrypt method is well-implemented but lacks input validation.The encryption method correctly implements AES-256-GCM with a random IV and authentication tag, which is a secure approach.
Add input validation to handle edge cases:
encrypt(str) { + if (typeof str !== 'string') { + throw new Error("Input must be a string."); + } const iv = crypto.randomBytes(12); const cipher = crypto.createCipheriv(this.algo, Buffer.from(this.key, this.encoding), iv); let enc = cipher.update(str, "utf8", this.encoding); enc += cipher.final(this.encoding); const authTag = cipher.getAuthTag().toString(this.encoding); return JSON.stringify({ ciphertext: enc, iv: iv.toString(this.encoding), authTag: authTag, }); }server/distBE/mime.js (1)
187-189
: Consider enhancing the MIME type lookup function.The function works correctly but could be improved with better error handling and documentation.
Add JSDoc and improve error handling:
/** * Returns the search keyword for a given MIME type * @param {string} mimeType - The MIME type to look up * @returns {string} The corresponding search keyword or empty string if not found */ -export const getSingleMimetypeKeyword = (mimeType) => { - return mimeTypeToSearchKeyword[mimeType] || ""; +export const getSingleMimetypeKeyword = (mimeType) => { + if (!mimeType || typeof mimeType !== 'string') { + return ""; + } + return mimeTypeToSearchKeyword[mimeType] || ""; };server/distBE/ai/mappers.js (1)
1-1
: Consider simplifying the import path.The import path could be simplified since you're already in the
ai
directory.-import { AIProviders, Models } from "../ai/types.js" +import { AIProviders, Models } from "../types.js"server/distBE/db/syncJob.js (3)
34-36
: Fix error message in updateSyncJob functionThe error message refers to "connector" instead of "sync job", which is misleading and inconsistent with the function's purpose.
- throw new Error("Could not update the connector"); + throw new Error("Could not update the sync job");
39-41
: Fix error message in updateSyncJob validationSimilar to the previous issue, this error message also incorrectly refers to "connector" instead of "sync job".
- throw new Error(`zod error: Invalid connector: ${parsedRes.error.toString()}`); + throw new Error(`zod error: Invalid sync job: ${parsedRes.error.toString()}`);
37-38
: Rename variable to match entity typeFor consistency and clarity, rename the variable to reflect that it's a sync job, not a connector.
- const [connectorVal] = updatedSyncJobs; - const parsedRes = selectSyncJobSchema.safeParse(connectorVal); + const [syncJobVal] = updatedSyncJobs; + const parsedRes = selectSyncJobSchema.safeParse(syncJobVal);server/distBE/config.js (2)
4-21
: Consider using environment variables for all configuration valuesWhile most configuration values are properly derived from environment variables, some values like port numbers are hardcoded. Consider making these configurable through environment variables as well.
- let port = 3000; + let port = process.env.PORT || 3000; - let host = "http://localhost:3000"; + let host = process.env.HOST || "http://localhost:3000"; if (process.env.NODE_ENV === "production") { postgresBaseHost = process.env.DATABASE_HOST; vespaBaseHost = process.env.VESPA_HOST; - port = 80; + port = process.env.PROD_PORT || 80; host = process.env.HOST; redirectUri = process.env.GOOGLE_PROD_REDIRECT_URI; }
97-136
: Consider organizing configuration into logical sectionsThe configuration object contains many properties. Consider organizing these into logical sections using nested objects for better readability and maintenance.
export default { + // Database configuration + database: { postgresBaseHost, vespaBaseHost, + }, + // Server configuration + server: { port, host, redirectUri, postOauthRedirect, + }, + // AI model configuration + models: { AwsAccessKey, AwsSecretKey, OpenAIKey, OllamaModel, TogetherAIModel, TogetherApiKey, FireworksAIModel, FireworksApiKey, GeminiAIModel, GeminiApiKey, defaultBestModel, defaultFastModel, isReasoning, fastModelReasoning, StartThinkingToken, EndThinkingToken, + }, // ... continue with other logical groupings };server/distBE/eval/qrelsToTrecFormat.js (1)
9-10
: Ensure imports and logger usage are consistent
You are callinggetLogger(Subsystem.Eval)
at line 9 but only importing bothgetLogger
andSubsystem
at lines 25-26. Consider moving the imports so they appear before usage, and consistently use eitherrequire
or ESimport
statements throughout the file.Also applies to: 25-26
server/distBE/db/workspace.js (2)
`25-36`: Use transaction object or remove it from the signature.
You're passing `trx` into functions like `getWorkspaceById`, but calling `db.select()` instead of `trx.select()`. For transactional consistency, make sure to use the transaction object or remove it from the function arguments if not needed.
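If the transaction should be used, a minimal sketch of threading it through (assuming Drizzle-style query building and the existing `workspaces` schema; names are otherwise illustrative) would be:

```js
import { eq } from "drizzle-orm";
import { workspaces } from "./schema.js";

// Hypothetical variant of getWorkspaceById that actually uses the passed-in transaction.
export const getWorkspaceById = async (trx, id) => {
  const rows = await trx
    .select()
    .from(workspaces)
    .where(eq(workspaces.id, id))
    .limit(1);
  if (!rows.length) {
    throw new Error("Could not find workspace by id");
  }
  return rows[0];
};
```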
38-50
: Fix misleading error message
IngetWorkspaceByCreatedBy
, the error thrown is "Could not find workspaces by domain" instead of a message indicating the missing workspace for the createdBy field. Update this message to correctly reflect the lookup failure.server/distBE/ai/provider/fireworks.js (1)
1-1
: Remove unused import
The import statementimport {} from "@aws-sdk/client-bedrock-runtime";
is unused. Consider removing it to keep the file clean.server/distBE/db/customType.js (1)
`7-35`: Consider handling encryption errors and special cases.
If `encryption.encrypt` or `encryption.decrypt` throws an error or returns unexpected values, it could break the ORM logic. It might be useful to wrap these calls in try/catch or validate the input/output to ensure robust error handling for malicious or corrupted data.
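One possible shape for that, sketched with Drizzle's `customType` API and assuming an `encryption` object with `encrypt`/`decrypt` methods as in this file (everything else is illustrative):

```js
import { customType } from "drizzle-orm/pg-core";

// Hypothetical encrypted-text column type that surfaces encryption failures clearly.
export const makeEncryptedText = (encryption) =>
  customType({
    dataType() {
      return "text";
    },
    toDriver(value) {
      try {
        return encryption.encrypt(value);
      } catch (err) {
        throw new Error(`Failed to encrypt column value: ${err.message}`);
      }
    },
    fromDriver(value) {
      try {
        return encryption.decrypt(value);
      } catch (err) {
        throw new Error(`Failed to decrypt column value: ${err.message}`);
      }
    },
  });
```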
server/distBE/eval/evaluate.js (2)

`6-14`: Provide more robust CLI argument parsing.
The current checks ensure basic presence of arguments, but do not handle potential mismatched or partial arguments (e.g., a missing parameter after `--output`). Consider using a dedicated argument parser or safer indexing to avoid out-of-bounds array access.
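One way to harden this is Node's built-in `util.parseArgs`, sketched below; the flag names are taken from this discussion and may not match the real script.

```js
import { parseArgs } from "node:util";

// Declarative flag parsing: unknown flags or a flag missing its value throw,
// instead of silently indexing past the end of process.argv.
const { values } = parseArgs({
  options: {
    input: { type: "string" },
    output: { type: "string" },
  },
});

if (!values.input || !values.output) {
  console.error("Usage: node evaluate.js --input <file> --output <file>");
  process.exit(1);
}
```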
`36-36`: Recommend parameterizing concurrency and queue usage.
Currently, concurrency is set to 10. For large or small datasets, dynamically adjusting concurrency (e.g., via an environment variable) can optimize performance. Also, confirm the queue completes successfully before writing results (line 65 is correct usage, but remain aware of any pending asynchronous tasks).

Also applies to: 64-65
server/distBE/ai/provider/ollama.js (2)
14-16
: Clarify error handling and response logic inconverse
.A single generic error message ("Failed to get response from Ollama") may make debugging difficult. Consider logging the exact error content or message for diagnosis while preserving security best practices.
Also applies to: 31-39
64-70
: Ensure meaningful streaming structures forconverseStream
.The chunked stream approach is valid, but if partial errors occur mid-stream, the logic might terminate unexpectedly. Consider more granular error recovery or a fallback message for partial responses to improve resilience.
Also applies to: 72-75
server/distBE/logger/index.js (1)
1-2
: Remove unused imports.The
levels
import from "pino" and the importedSubsystem
from "../types.js" are not referenced, which is dead code that can be removed to keep the module clean.-import { levels, pino } from "pino"; +import { pino } from "pino"; -import { Subsystem } from "../types.js"server/distBE/db/seed.js (3)
15-25
: Warning: Repeated usage of@ts-ignore
.You’re silencing TypeScript errors when inserting data. Consider extending or refining your schema definitions in
workspaces
so that manual ignores become unnecessary and type safety is preserved.
39-39
: Fix string interpolation for log messages.Currently, there's a stray quote after "Inserted User:". This can create confusing logs. Consider passing the user object as a separate parameter or using a cleaner string format:
- Logger.info(`Inserted User:', ${user}`); + Logger.info(`Inserted User: ${JSON.stringify(user)}`);
48-49
: Avoid forcibly exiting after the async call completes.
process.exit(0)
might terminate the process prematurely if there are still pending tasks or open connections. Ensure all resources (e.g., DB connections) are safely closed before ending the process, or let Node exit naturally.Would you like me to propose a solution that gracefully closes pending connections and exits once all tasks are complete?
server/distBE/search/mappers.js (3)
36-37
: Remove redundant property assignments.Lines such as
child.fields.chunks_summary = child.fields.chunks_summary;
have no effect. Consider removing these no-op lines to keep the code more concise and readable.Also applies to: 50-51, 60-61
25-78
: Reduce repeated logic for settingtype
andrelevance
.Assigning
child.fields.type
andchild.fields.relevance
in each conditional block is repetitive. A small helper function could streamline this code by setting shared properties before applying the specific schema.
79-138
: Consider unifying search and autocomplete logic.
VespaSearchResponseToSearchResult
andVespaAutocompleteResponseToResult
share similar patterns of type assignment and parsing. Extracting common patterns into a shared utility could reduce duplication and ease maintenance.server/distBE/api/oauth.js (4)
20-23
: Consider returning more detailed error information.Throwing an HTTP 500 response on missing
state
may not be very descriptive. It can be useful to log or return a clear message indicating the missing query parameters.
31-32
: Improve error handling to distinguish invalid or expired cookie.Currently, the code throws a 500 status if
random
does not matchstateInCookie
. You might consider returning a 403 or a 401 to reflect an invalid session or a potential CSRF scenario.
37-39
: Handle non-GoogleDrive apps without code verifier more gracefully.If the code verifier is missing, the logic only checks for
app === Apps.GoogleDrive
. Consider anticipating future OAuth expansions by handling this scenario elegantly for all providers.
67-73
: Evaluate transactional consistency.If the queue job is created but the connector update fails (or vice versa), you risk inconsistent states. Wrapping these calls inside a transactional flow (if supported) could help maintain data integrity.
server/distBE/db/oauthProvider.js (3)
8-23
: Wrap insertion in a transaction, if necessary.While you are catching insertion errors, consider ensuring a more comprehensive transactional approach if other related database operations occur in tandem. This ensures consistency if one operation fails.
24-36
: Unify error messages."Could not get the connector" suggests a missing resource, but the function name is
getOAuthProvider
. For clarity, you might want to unify error messages to reflect the entity type (“provider” vs “connector”).
37-49
: Return a single provider object or handle array consistently.
getOAuthProviderByConnectorId
returns an array inres
, but onlyres[0]
is relevant. Consider returning a single object for consistency, or clarify any scenario requiring multiple providers.server/distBE/eval/processDataForVespa.js (4)
6-8
: Validate arguments more robustly.You only check for the overall length of the arguments array. Consider verifying each argument for correctness (e.g., ensure the next token isn’t missing) in case users provide malformed flags or no values.
27-28
: Import statements can be consolidated.You are importing
fs
andchunkDocument
separately after runtime argument checks. Consider grouping your imports at the top to streamline the code structure.
60-62
: Consider concurrency or streaming approach for large data.Writing large batches to one file works, but you might want to parallelize or pipeline the process if input size grows significantly. This can reduce memory usage on very large files.
78-82
: Monitor partial writes for reliability or duplication.If an unexpected error occurs after some writes, it may be unclear which documents were successfully committed. You could store a checkpoint or track progress to avoid partially processed data.
server/distBE/chunks.js (1)
16-17
: Consider replacing illegal characters with a placeholder.Currently, illegal code points are removed, which may unintentionally merge words. Replacing them with a placeholder (e.g. a space) could improve readability.
server/distBE/db/message.js (1)
45-48
: Refine the error message to accurately reflect the missing entity.Throwing "Chat not found" here might be misleading. Consider changing it to "Message not found" for clarity.
- throw new Error("Chat not found"); + throw new Error("Message not found");server/distBE/ai/provider/bedrock.js (2)
1-10
: Consider verifying AWS configurations and error handling for Bedrock client.
The code imports the Bedrock Runtime client but does not appear to validate region or credential configuration. To prevent runtime errors, ensure that your environment or config file consistently provides valid AWS settings.
48-129
: Consider splitting out chunk-handling logic inconverseStream
.
The streaming logic is wrapped into a single large method, which may become harder to maintain. Extracting chunk processing into a helper function can reduce complexity and improve readability.server/distBE/server.js (3)
81-95
: Refine AuthRedirect error handling.
TheAuthRedirect
middleware warns and redirects when no auth token is found or the token is invalid. Consider distinguishing these scenarios in the UI or logs for clearer debugging and user experience.
100-125
: Check route definitions for naming consistency.
API endpoints such as/chat
,/chat/bookmark
, etc. are logically named, though you might consider a consistent pluralization or hierarchical pattern for clarity (e.g.,/chats/:id/bookmark
).
`149-167`: Evaluate stricter validation for JWT payloads.
When generating tokens in `generateToken`, consider adding audience and issuer claims or verifying scopes to enhance security. This can include input validation on `role` and `workspaceId`.
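If `generateToken` signs with `hono/jwt` (an assumption; the claim values below are placeholders, not the project's actual configuration), the extra claims could look like:

```js
import { sign } from "hono/jwt";

// Hypothetical generateToken variant with issuer/audience/expiry claims and basic input validation.
export const generateToken = async (email, role, workspaceId, secret) => {
  if (!email || !role || !workspaceId) {
    throw new Error("email, role and workspaceId are required to mint a token");
  }
  const now = Math.floor(Date.now() / 1000);
  return await sign(
    {
      sub: email,
      role,
      workspaceId,
      iss: "your-server",       // issuer claim (placeholder)
      aud: "your-web-client",   // audience claim (placeholder)
      iat: now,
      exp: now + 60 * 60 * 24,  // 24h expiry
    },
    secret,
  );
};
```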
server/distBE/integrations/google/gmail/quote-parser.js (1)

`67-155`: Consider performance testing `parseEmailBody`.
Parsing and reversing each line can become expensive for very large emails. If performance becomes a concern, you might streamline or chunk the processing to avoid memory overhead.

server/distBE/integrations/google/pdf-utils.js (3)
10-27
: Consider handling the case when the file does not exist.
If the file is already missing (e.g.,ENOENT
), the current logic will throw an error. You could optionally check forerr.code === "ENOENT"
and log a warning instead of throwing, if that case is non-critical.} catch (err) { + if (err.code === "ENOENT") { + Logger.warn(`File not found at ${filePath}, no action needed`); + return; + } Logger.error( err,
28-49
: Consider cleaning up partially written files on errors.
In the event of a write error, you might end up with a partially written file. Deleting the partial file helps ensure data consistency and reduces disk clutter.
50-68
: Enhance password-protected PDF handling.
Currently, password-protected PDFs are skipped altogether. This may be acceptable, but if you need partial data extraction or user prompts for passwords, consider implementing a fallback method or a prompt-based approach.server/distBE/ai/provider/openai.js (2)
11-41
: Explicit error handling for network or API failures is missing.
Consider wrapping thethis.client.chat.completions.create
call in atry...catch
block to handle potential API or network errors.async converse(messages, params) { const modelParams = this.getModelParams(params); - const chatCompletion = await this.client.chat.completions.create({ + let chatCompletion; + try { + chatCompletion = await this.client.chat.completions.create({ // ... + }); + } catch (apiError) { + // Graceful error logging or rethrow + throw apiError; + } // ... }
42-62
: Ensure documentation for streaming parameters.
It would be beneficial to document howstream_options
(line 58) impacts the response structure, especially if other maintainers need to customize streaming behavior in the future.server/distBE/utils/retry-async.js (2)
`1-10`: Default retry limit may cause infinite loops.
With `retry` set to `Number.POSITIVE_INFINITY`, failures can loop indefinitely. Consider using a finite default or an exponential backoff strategy to avoid potential starvation.

`11-21`: Refactor to support jitter or exponential backoff for robust retry.
Adding a random or exponential delay helps mitigate load spikes and improves reliability under frequent failures.
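A compact version of what this suggests, capped exponential backoff with full jitter, might look like the following; the function and option names are illustrative and not the existing retry-async API.

```js
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry `fn` up to `retries` times with capped exponential backoff and full jitter.
export async function retryWithBackoff(fn, { retries = 5, baseMs = 250, maxMs = 10_000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === retries) break;
      const cap = Math.min(maxMs, baseMs * 2 ** attempt);
      await sleep(Math.random() * cap); // full jitter: wait anywhere in [0, cap)
    }
  }
  throw lastError;
}
```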
server/distBE/ai/provider/together.js (1)

`38-39`: Preserve original error details for debugging.
Re-throwing a generic error hides the original message and stack trace. Consider appending the original error to help diagnose issues:

```diff
 } catch (error) {
-  throw new Error("Failed to get response from Together");
+  throw new Error(`Failed to get response from Together: ${error.message}`);
 }
```

server/distBE/integrations/google/config.js (1)
`17-21`: Consider making concurrency values configurable.
Storing concurrency values (2, 3, etc.) in environment variables or a separate config file could make them easier to adjust without code changes. This helps with scaling and fine-tuning performance in different environments.
server/distBE/api/admin.js (3)

`18-28`: Validate error conditions more thoroughly.
In `GetConnectors`, you throw `NoUserFound` when no user is retrieved, which is correct. However, consider including more contextual data (e.g., the email associated with the request) in the error message to aid in debugging.
59-78
: Ensure correct query parameter validation.Line 68 uses
@ts-ignore
when readingapp
fromc.req.valid("query")
. This could mask potential runtime issues ifapp
is missing or typed incorrectly. Consider refining the typing or providing fallback logic for missing or invalid query parameters.
38-41
: Include original error details when re-throwing.Line 39 re-throws a new
NoUserFound
error with empty curly braces. Adding the original error context (e.g., reason or stack trace) could be beneficial for debugging.server/distBE/integrations/google/sync.js (5)
66-80
: Clarify spreadsheet fallback logic.
WhengetDocumentOrNull
fordocId
returnsnull
, this function checks onlydocId_0
as a fallback. If spreadsheets can have indexes beyond_0
, it might miss additional sections.
356-670
: Function is too large: break into smaller processes.
handleGoogleOAuthChanges
processes Drive, Contacts, and Calendar updates in one go. Splitting these responsibilities into separate functions (e.g.,syncDriveChanges
,syncContacts
,syncCalendarEvents
) would enhance maintainability.
671-717
: Ensure partial failures are handled.
insertEventIntoVespa
logs errors and continues, which could cause silent data corruption if some fields fail to insert. Consider returning error information to the caller or storing partial failures for later diagnosis.
718-864
: Check large event merges.
handleGoogleCalendarEventsChanges
merges numerous events in a loop. If many events are updated simultaneously, performance might degrade. Consider batching or compressing updates in larger sets to reduce overhead.
998-1034
: Detect duplicate contact records.
syncContacts
inserts or deletes contacts. If multiple sync runs overlap or a contact reappears, duplicates may occur. Adding a deduplication check (e.g., byresourceName
) is recommended.server/distBE/integrations/google/gmail-worker.js (2)
73-146
: Ingestion approach is clear but watch out for forwarded duplicates.
handleGmailIngestion
processes all messages in multiple pages. Consider skipping duplicates if a message reappears. For large mailboxes, also confirm that memory usage from parallel operations is acceptable.
147-170
: Regex approach for parsing email addresses.
extractEmailAddresses
uses a simple regex for angle-bracketed addresses. This works in most cases, but some email formats break it (e.g., quotes or unusual formatting). Comprehensive email parsing libraries can reduce edge-case risk.server/distBE/integrations/google/gmail/index.js (2)
85-108
: Edge cases in email extraction.
extractEmailAddresses
must handle commas inside quotes or nested angle brackets. If advanced email address formats are encountered (e.g., “Doe, John [email protected]”), the function might produce partial or invalid addresses.
233-262
: Consider partial HTML to text conversions.
getBody
attempts to convert entire HTML parts withhtmlToText
. If an email body is extremely large or malformed HTML, partial or incremental conversion might reduce parse overhead.server/distBE/ai/provider/fireworksClient.js (2)
46-70
: Consider adding jitter to your retry logic.In
complete
, you're using exponential backoff withMath.pow(2, retries) * 1000
. Adding a small randomized jitter (similar to the approach in your other code) would help reduce the risk of synchronized retries across multiple instances.- await new Promise((resolve) => setTimeout(resolve, Math.pow(2, retries) * 1000)); + const baseDelay = Math.pow(2, retries) * 1000; + const jitter = Math.floor(Math.random() * 500); + await new Promise((resolve) => setTimeout(resolve, baseDelay + jitter));
127-153
: Remove or mask sensitive keys in examples.In the
example()
function, ensure you do not accidentally commit real API keys or credentials. Replacing"your-api-key"
with environment variables or secure placeholders prevents accidental credential leaks.server/distBE/utils.js (1)
`29-40`: Consider supporting more progress statuses.
`progress_callback` currently ignores any status besides `"progress"`. If future progress types or statuses are introduced (e.g., `"queued"`, `"done"`), consider handling them explicitly or logging them for debugging.

server/distBE/ai/context.js (2)
`64-90`: Event context includes robust fields.
`constructEventContext` thoroughly enumerates event fields and times. Minor note: a fallback for invalid or missing dates might be warranted. If `startTime` or `endTime` is not a valid Date, the `.toUTCString()` call could produce `Invalid Date`.
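A small guard of the kind suggested here (a sketch; the real field handling in `constructEventContext` may differ):

```js
// Format a timestamp for the event context, falling back when the value is missing or invalid.
const toUTCStringSafe = (value, fallback = "Unknown time") => {
  const date = value instanceof Date ? value : new Date(value);
  return Number.isNaN(date.getTime()) ? fallback : date.toUTCString();
};
```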
138-163
: Colored context repetition.
constructFileColoredContext
,constructUserColoredContext
, andconstructMailColoredContext
largely mirror the non-colored versions. If you anticipate significant changes, consider an abstraction that just colorizes the relevant fields to reduce duplication.server/distBE/types.js (1)
183-185
: Naming mismatch in OperationStatus enum.
Using"Pendings"
as the enum key but assigning it the value"Pending"
may confuse future maintainers.- OperationStatus["Pendings"] = "Pending"; + OperationStatus["Pending"] = "Pending";server/distBE/db/connector.js (3)
21-24
: Avoid variable shadowing forconfig
.
The function parameterconfig
conflicts with the importedconfig
at line 1. Rename one of them to prevent confusion.-export const insertConnector = async (trx, workspaceId, userId, workspaceExternalId, name, type, - authType, app, config, credentials, subject, oauthCredentials, status) => { +export const insertConnector = async (trx, workspaceId, userId, workspaceExternalId, name, type, + authType, app, connectorConfig, credentials, subject, oauthCredentials, status) => { ... }
93-103
: Handle missing or malformedaccessTokenExpiresAt
.
IsTokenExpired
assumestoken.accessTokenExpiresAt
is valid. If it's absent or invalid, this function may yield false positives or crash.+ if (!tokens.accessTokenExpiresAt) { + Logger.warn("Missing 'accessTokenExpiresAt' field, defaulting to expired token logic"); + return true; + }
`125-153`: Potential concurrency concern in token refresh.
Multiple parallel requests could simultaneously detect an expired token and attempt refresh. Consider implementing a lock or short-lived in-memory marker to prevent repeated refresh attempts.
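One lightweight way to de-duplicate concurrent refreshes is to memoize the in-flight promise per connector, as sketched below; `refreshOAuthToken` is a stand-in for whatever the connector module actually calls.

```js
// In-memory map of connectorId -> in-flight refresh promise.
const refreshInFlight = new Map();

export const refreshTokenOnce = (connectorId, refreshOAuthToken) => {
  if (refreshInFlight.has(connectorId)) {
    // Another caller already triggered a refresh; await the same promise.
    return refreshInFlight.get(connectorId);
  }
  const promise = refreshOAuthToken(connectorId).finally(() => {
    refreshInFlight.delete(connectorId);
  });
  refreshInFlight.set(connectorId, promise);
  return promise;
};
```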
server/distBE/queue/index.js (1)

`71-73`: Duplicate scheduling call may lead to confusion.
`setupServiceAccountCronjobs()` is also called on line 73, meaning some schedules might overlap. If intentional, consider clarifying the approach.

server/distBE/db/user.js (5)
1-2
: Remove unused import [db].
It appearsdb
is imported but never utilized within this file, which adds unnecessary overhead.- import { db } from "./client.js"
32-48
: Consider consistent schema validation.
UnlikegetPublicUserAndWorkspaceByEmail
, this function does not parse the returned data with schemas. If schema validation is also needed here, consider applying the same approach for consistency.
49-59
: Handle absent user/workspace scenario.
The function returns a result array but does not throw an error if it's empty. For consistency with other methods, consider throwing a 404 or returningnull
if no record is found.
61-67
: Add not-found handling and schema validation.
Currently, it returns an array that might be empty. Consider throwing an error or returning a specific response if no user is found, and optionally parse the data with the schema if needed.
68-88
: Validate inserted user.
After inserting the user, the code returns the raw database result. For consistency and error handling, consider parsing the returned row withselectUserSchema
and possibly verifying uniqueness constraints or handling collisions.server/distBE/db/chat.js (6)
`21-28`: Check pagination or performance for large chat lists.
`getWorkspaceChats` retrieves all chats in descending order by `updatedAt`. For large workspaces, consider adding pagination or limiting results to optimize performance.
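If pagination is added, a Drizzle-style variant might look like this (assuming the existing `chats` schema; parameter names are illustrative):

```js
import { desc, eq } from "drizzle-orm";
import { chats } from "./schema.js";

// Paginated variant of getWorkspaceChats: newest chats first, one page at a time.
export const getWorkspaceChatsPage = async (trx, workspaceId, pageSize = 20, offset = 0) => {
  return await trx
    .select()
    .from(chats)
    .where(eq(chats.workspaceId, workspaceId))
    .orderBy(desc(chats.updatedAt))
    .limit(pageSize)
    .offset(offset);
};
```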
36-45
: Code style consistency.
This function is nearly identical togetChatById
. Both throw “Chat not found.” Just ensure it’s intentional to keep them separate or consider DRY approaches if they share logic.
46-57
: Include schema validation after update.
After updating, consider validating the returned row withselectChatSchema
to ensure data integrity, similar to how new or fetched records are validated.
58-67
: Soft-delete vs. permanent deletion.
Here you permanently remove a record. If you use soft-deletes elsewhere (e.g.,deletedAt
), consider a consistent approach.
68-77
: Better error message for missing messages.
Throwing “Messages not found” might be ambiguous if some but not all messages exist. If partial deletion is possible, consider clarifying or returning how many records were actually deleted.
90-99
: Limit or paginate public chats.
getPublicChats
offers pagination arguments (pageSize
,offset
), which is great. Verify correctness of those parameters, including potential edge cases ifpageSize
oroffset
are undefined or zero.server/distBE/ai/types.js (7)
1-3
: Unused import from AWS SDK.
No references to@aws-sdk/client-bedrock-runtime
appear below. If you’re planning to integrate with Bedrock in future code, that’s fine; otherwise, consider removing it to keep dependencies minimal.
13-35
: Potential version confusion in model names.
Model identifiers like"meta.llama3-1-70b-instruct-v1:0"
might lead to confusion if updated frequently. Document changes or handle versioning in an external config if these evolve often.
36-42
: QueryCategory usage.
You have categories likeSelf
,InternalPerson
, etc. If logic differs significantly between categories, consider adding clarifying docs or typed structures for each.
44-49
: Expanded query types.
Future expansions (e.g., “RetrieveMetadata”) are commented out. Confirm if these will be reintroduced or if they should be removed for clarity.
60-65
: SearchAnswerResponse with partial data.
Your code allowsanswer
andcitations
to be nullable. Ensure your UI or consumer code handles these nulls gracefully.
73-80
: Default handling for count.
You preprocesscount
with a default of 5. Verify thatcount
cannot become negative or an excessively large number that might degrade performance.
98-101
: Context rank expansions.
QueryContextRank
enforces a boolean and an array. If more fields become necessary (like confidence scores), keep the structure flexible or documented for expansions.server/distBE/api/search.js (5)
1-45
: Well-organized imports, but consider reorganizing for better clarity.The import section is quite large and mixes different types of imports (utilities, database operations, AI functions, etc.). Consider organizing imports by category or functionality for better readability.
// AI-related imports import { analyzeQueryForNamesAndEmails, analyzeQueryMetadata, askQuestion, } from "../ai/provider/index.js" import { answerContextMap, answerMetadataContextMap, cleanContext, userContext, } from "../ai/context.js" import { QueryCategory } from "../ai/types.js" // Tokenizers import llama3Tokenizer from "llama3-tokenizer-js" import { encode } from "gpt-tokenizer" // Search operations import { autocomplete, deduplicateAutocomplete, groupVespaSearch, searchVespa, searchUsersByNamesAndEmails, getTimestamp, insert, GetDocument, UpdateDocument, updateUserQueryHistory, } from "../search/vespa.js" import { VespaAutocompleteResponseToResult, VespaSearchResponseToSearchResult, } from "../search/mappers.js" import { userQuerySchema, userSchema } from "../search/types.js" // Database operations import { getPublicUserAndWorkspaceByEmail } from "../db/user.js" import { db } from "../db/client.js" // Utilities and configuration import { z } from "zod" import config from "../config.js" import { HTTPException } from "hono/http-exception" import { AnswerSSEvents } from "../shared/types.js" import { streamSSE } from "hono/streaming" import { getLogger } from "../logger/index.js" import { Subsystem } from "../types.js" import { getErrorMessage } from "../utils.js"
92-95
: Consider handling empty results more gracefully.When
autocomplete
returns no results, you're returning an object with an empty children array. It would be better to return a consistent structure that matches whatVespaAutocompleteResponseToResult
would return.let results = await autocomplete(query, email, 5) if (!results) { - return c.json({ children: [] }) + return c.json({ results: [] }) }
99-105
: Good error handling, but could be more specific.The error handling is good, but it returns a generic error message regardless of the type of error. Consider providing more specific error messages based on the error type.
} catch (error) { const errMsg = getErrorMessage(error) Logger.error(error, `Autocomplete Error: ${errMsg} ${error.stack}`) throw new HTTPException(500, { - message: "Could not fetch autocomplete results", + message: error instanceof HTTPException + ? errMsg + : "Could not fetch autocomplete results", }) }
158-158
: Address the TODO comment in the code.The TODO comment indicates a need to deduplicate results for Google admin and contacts. Consider implementing this functionality or creating a ticket to track this work.
I can help implement the deduplication logic for Google admin and contacts. Would you like me to create a detailed implementation or open a new issue to track this work?
166-167
: Remove TypeScript ignore comments and fix type issues properly.Similar to the previous case, the
@ts-ignore
comment is used to bypass TypeScript type checking. Fix this properly with appropriate type annotations.- // @ts-ignore - const { query, app, entity } = c.req.valid("query") + const { query, app, entity } = c.req.valid("query") as { + query: string, + app?: string, + entity?: string + }server/distBE/integrations/google/tracking.js (7)
2-9
: Consider using TypeScript enums instead of JavaScript IIFE for better type safety.You're using an immediately-invoked function expression (IIFE) to define an enum-like object. Modern TypeScript has better enum syntax that provides additional type safety.
-export var StatType; -(function (StatType) { - StatType["Gmail"] = "gmailCount"; - StatType["Drive"] = "driveCount"; - StatType["Contacts"] = "contactsCount"; - StatType["Events"] = "eventsCount"; - StatType["Mail_Attachments"] = "mailAttachmentCount"; -})(StatType || (StatType = {})); +export enum StatType { + Gmail = "gmailCount", + Drive = "driveCount", + Contacts = "contactsCount", + Events = "eventsCount", + Mail_Attachments = "mailAttachmentCount" +}
11-19
: Consider adding TypeScript interfaces for tracker objects.The tracker objects would benefit from proper TypeScript interfaces to define their structure and ensure type safety throughout the codebase.
+interface UserStats { + gmailCount: number; + driveCount: number; + contactsCount: number; + eventsCount: number; + mailAttachmentCount: number; + done: boolean; + startedAt: number; + doneAt: number; + type: AuthType; + totalMail: number; + totalDrive: number; +} + +interface ServiceAccountTracker { + totalUsers: number; + completedUsers: number; + userStats: Record<string, UserStats>; +} + +interface OAuthTracker { + user: string; + userStats: Record<string, UserStats>; +} // Global tracker object -export const serviceAccountTracker = { +export const serviceAccountTracker: ServiceAccountTracker = { totalUsers: 0, completedUsers: 0, userStats: {}, }; -export const oAuthTracker = { +export const oAuthTracker: OAuthTracker = { user: "", userStats: {}, };
21-52
: Add email parameter type and consider handling existing stats differently.The function doesn't specify the type for the email parameter, and it doesn't handle the case where stats exist but may need to be reset.
-const initializeUserStats = (email) => { +/** + * Initialize user statistics for both service account and OAuth trackers + * @param {string} email The user's email address + */ +const initializeUserStats = (email: string) => { if (!serviceAccountTracker.userStats[email]) { serviceAccountTracker.userStats[email] = { gmailCount: 0, driveCount: 0, contactsCount: 0, eventsCount: 0, mailAttachmentCount: 0, done: false, startedAt: new Date().getTime(), doneAt: 0, type: AuthType.ServiceAccount, totalMail: 0, totalDrive: 0, }; } if (!oAuthTracker.userStats[email]) { oAuthTracker.userStats[email] = { gmailCount: 0, driveCount: 0, contactsCount: 0, eventsCount: 0, mailAttachmentCount: 0, done: false, startedAt: new Date().getTime(), doneAt: 0, type: AuthType.OAuth, totalMail: 0, totalDrive: 0, }; } };
53-57
: Add type annotations and validate inputs.The
updateUserStats
function should have type annotations for its parameters and should validate inputs before using them.-export const updateUserStats = (email, type, count) => { +/** + * Update user statistics for a specific statistic type + * @param {string} email The user's email address + * @param {StatType} type The statistic type to update + * @param {number} count The count to add to the statistic + */ +export const updateUserStats = (email: string, type: StatType, count: number) => { + if (!email || typeof email !== 'string') { + throw new Error('Email is required and must be a string'); + } + if (!Object.values(StatType).includes(type)) { + throw new Error(`Invalid stat type: ${type}`); + } + if (typeof count !== 'number' || isNaN(count)) { + throw new Error('Count must be a valid number'); + } initializeUserStats(email); serviceAccountTracker.userStats[email][type] += count; oAuthTracker.userStats[email][type] += count; };
58-64
: Add type annotations and use constants for magic values.The
updateTotal
function should have type annotations and should use constants for thetotalMail
andtotalDrive
fields.+/** + * Property names for the total counts + */ +export enum TotalCountType { + Mail = "totalMail", + Drive = "totalDrive" +} -export const updateTotal = (email, totalMail, totalDrive) => { +/** + * Update total mail and drive counts for a user + * @param {string} email The user's email address + * @param {number} totalMail The total mail count + * @param {number} totalDrive The total drive count + */ +export const updateTotal = (email: string, totalMail: number, totalDrive: number) => { + if (!email || typeof email !== 'string') { + throw new Error('Email is required and must be a string'); + } + if (typeof totalMail !== 'number' || isNaN(totalMail)) { + throw new Error('Total mail must be a valid number'); + } + if (typeof totalDrive !== 'number' || isNaN(totalDrive)) { + throw new Error('Total drive must be a valid number'); + } initializeUserStats(email); - serviceAccountTracker.userStats[email].totalMail = totalMail; - serviceAccountTracker.userStats[email].totalDrive = totalDrive; - oAuthTracker.userStats[email].totalMail = totalMail; - oAuthTracker.userStats[email].totalDrive = totalDrive; + serviceAccountTracker.userStats[email][TotalCountType.Mail] = totalMail; + serviceAccountTracker.userStats[email][TotalCountType.Drive] = totalDrive; + oAuthTracker.userStats[email][TotalCountType.Mail] = totalMail; + oAuthTracker.userStats[email][TotalCountType.Drive] = totalDrive; };
**65-71**: **Add validation and completion timestamp handling.**

The `markUserComplete` function should check if the user stats exist before trying to mark them as complete, and should consider edge cases like re-marking a user as complete.

```diff
-export const markUserComplete = (email) => {
+/**
+ * Mark a user as complete in the service account tracker
+ * @param {string} email The user's email address
+ */
+export const markUserComplete = (email: string) => {
+  if (!email || typeof email !== 'string') {
+    throw new Error('Email is required and must be a string');
+  }
+
+  // Initialize if needed
+  initializeUserStats(email);
+
   if (!serviceAccountTracker.userStats[email].done) {
     serviceAccountTracker.userStats[email].done = true;
     serviceAccountTracker.userStats[email].doneAt = new Date().getTime();
     serviceAccountTracker.completedUsers++;
   }
 };
```
**79-81**: **Add type annotation for the email parameter.**

The `setOAuthUser` function should have a type annotation for the email parameter.

```diff
-export const setOAuthUser = (mail) => {
+/**
+ * Set the email of the OAuth user being tracked
+ * @param {string} mail The user's email address
+ */
+export const setOAuthUser = (mail: string) => {
+  if (!mail || typeof mail !== 'string') {
+    throw new Error('Email is required and must be a string');
+  }
   oAuthTracker.user = mail;
 };
```

server/distBE/ai/eval.js (4)
**608-609**: **Uncomment or remove the commented return statement.**

There's a commented line `// return answer` which suggests that the function should return a value but currently doesn't. Either uncomment this line or remove it if not needed.

```diff
-  // return answer
+  return answer || "I don't know";
```
**619-623**: **Remove commented code blocks.**

There's a commented-out code block that appears to be unused. Either implement the functionality or remove the commented code to keep the codebase clean.

```diff
-  // const formattedMessages = messages ?
-  //     messages.map((msg, index) => ({
-  //         role: "user",
-  //         content: [{ text: (msg as Data).input }]
-  //     })) : [];
```
**631-631**: **Implement proper cost tracking instead of hardcoded values.**

The function uses a hardcoded cost value of 0.001, which doesn't accurately reflect the actual cost of the operation. Consider implementing proper cost tracking.

```diff
-  costArr: [0.001],
+  costArr: [0.001], // TODO: Implement proper cost tracking based on actual API usage
```
**636-636**: **Uncomment or remove the commented function call.**

There's a commented function call `// endToEndFactual()` at the end of the file. Either uncomment it if it's meant to be executed or remove it if it's not needed.

```diff
-// endToEndFactual()
+// Uncomment to run the evaluation
+// endToEndFactual()
+
+// Export the evaluation functions for external use
+export {
+  basicJsonRag,
+  basicRag,
+  iterativeJsonRag,
+  iterativeTimeExpansionJsonRag,
+  iterativeWithTimeFilterAndQueryRewrite,
+  pointEventQueryTimeExpansion,
+  endToEndFactual
+};
```

server/distBE/integrations/google/utils.js (2)
**14-15**: **Consider removing unused import `MAX_ATTACHMENT_PDF_SIZE`.**

`MAX_ATTACHMENT_PDF_SIZE` is imported but never used in this file. Removing unused imports can help keep the code clean.

```diff
- MAX_ATTACHMENT_PDF_SIZE,
  MAX_GD_PDF_SIZE,
```
**75-97**: **Re-examine the decision to return `null` instead of throwing or propagating errors.**

While returning `null` ensures the sync engine continues, it can make debugging more difficult if callers disallow nulls or expect an error. Consider throwing a custom error or returning a defined error object so that upstream processes can differentiate between genuine "no file" scenarios and transient failures.

server/distBE/api/chat.js (4)
**103-124**: **Refine error messages for debugging clarity.**

`GetChatApi` returns a generic "Could not fetch chat and messages" message in case of failure. Providing additional context or debug logs (e.g., chatId) can help with troubleshooting.
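For illustration, a minimal sketch of what that could look like. The handler shape, the `Logger` and `getErrorMessage` helpers, and the Hono `HTTPException` usage are assumptions based on the surrounding review, not the actual code in this PR:

```ts
import type { Context } from "hono";
import { HTTPException } from "hono/http-exception";

// Hypothetical: log the chatId and root cause for operators,
// while still returning a stable message to the client.
export const GetChatApi = async (c: Context) => {
  const chatId = c.req.param("chatId");
  try {
    // ... fetch chat and messages ...
  } catch (err) {
    Logger.error(`GetChatApi failed for chatId=${chatId}: ${getErrorMessage(err)}`);
    throw new HTTPException(500, { message: "Could not fetch chat and messages" });
  }
};
```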
**192-241**: **Ensure robust error handling for unexpected entity types.**

`searchToCitation` throws if it encounters an unknown `sddocname`. If the system can encounter new or unrecognized doc types, consider logging or default handling instead of throwing.
**693-775**: **Maintain consistent error handling in `MessageApi`.**

The function gracefully handles several error scenarios, but a thrown synchronous HTTPException might bypass the SSE streaming flow. Ensure that all potential errors are handled uniformly (e.g., streaming an SSE "Error" event).
**952-1001**: **Consider logging newly inserted messages for auditing.**

When an answer is successfully created and stored, logging the `msg.externalId` or a truncated portion of the message can help trace user interactions.

server/distBE/shared/types.js (2)
**60-66**: **Extend `OpenAIError` handling for other scenarios.**

Currently, only `RateLimitError` and `InvalidAPIKey` are defined. If the system can handle other error types from OpenAI (e.g., network issues), consider expanding the enum or implementing a catch-all.
**203-213**: **Confirm usage of `ChatSSEvents.Error` vs. other error channels.**

Multiple modules define error or SSE events. Consistency in naming and usage across `ChatSSEvents.Error` or newly added events for different error subcategories helps maintain clarity.

server/distBE/db/schema.js (1)
**110-110**: **Unresolved TODO**

A comment indicates additional fields (such as token expiration fields) are planned. Ensure these are tracked and implemented if needed for OAuth connectors.

Would you like help adding these fields and preparing a migration strategy?

server/distBE/search/vespaClient.js (1)

**87-105**: **Consolidate duplicate insert logic**

Both `insertDocument` and `insert` methods perform similar operations. Consider merging or refactoring them into a single, configurable method to reduce code duplication and simplify maintenance.

server/distBE/search/vespa.js (2)
**23-23**: **Hardcoded `NAMESPACE`**

`const NAMESPACE = "namespace"; // Replace with your actual namespace` may cause confusion in production if forgotten. Consider loading from config or environment variables to avoid accidental usage of a placeholder.
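A rough illustration of the suggestion; the `VESPA_NAMESPACE` variable name is an assumption, not something defined in this PR:

```ts
// Hypothetical: read the namespace from the environment and fail fast,
// rather than silently shipping the "namespace" placeholder.
const NAMESPACE = process.env.VESPA_NAMESPACE ?? "namespace";

if (process.env.NODE_ENV === "production" && NAMESPACE === "namespace") {
  throw new Error(
    "VESPA_NAMESPACE is not set; refusing to start with the placeholder namespace",
  );
}
```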
**540-540**: **Typographical error in function name**

`vespa.getUsersByNamesAndEmaisl` has a misspelling ("Emaisl" vs. "Emails"). Consider correcting for consistency.

server/distBE/ai/provider/index.js (4)
**37-37**: **Typographical error in variable name**

It looks like "togetherProvidder" has a double "d" in its name. This might lead to confusion or inadvertent errors.

```diff
- let togetherProvidder = null;
+ let togetherProvider = null;
```
**40-88**: **Consider concurrency safety and repeated initialization**

The `initializeProviders()` function sets a flag (`providersInitialized`) to prevent multiple initializations, but if it is called in parallel before the flag is set, multiple providers could still be created. Consider adding synchronization or clarifying initialization order to ensure thread safety in a concurrent environment.
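One common way to make this safe is to memoize the in-flight initialization promise. The sketch below assumes `initializeProviders()` is (or can be made) async; the real function in this PR may differ:

```ts
// Hypothetical: concurrent callers all await the same initialization
// instead of racing past a boolean flag.
let initPromise: Promise<void> | null = null;

const ensureProvidersInitialized = (): Promise<void> => {
  if (!initPromise) {
    initPromise = initializeProviders().catch((err) => {
      // Reset so a later call can retry after a failed initialization.
      initPromise = null;
      throw err;
    });
  }
  return initPromise;
};
```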
**146-169**: **Lack of fallback or retry mechanism**

The `askQuestion` function attempts to stream a conversation and throws on any error without retries or fallback. In certain error scenarios (e.g., transient network failures), a simple retry could improve resilience. Consider adding a retry strategy or fallback for robust error handling.
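A minimal retry wrapper, for illustration only; the delays and attempt counts are arbitrary, and the project's existing `retryWithBackoff` helper may already cover this:

```ts
// Hypothetical generic retry with exponential backoff.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, baseDelayMs = 500): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait 500ms, 1000ms, 2000ms, ... before the next attempt.
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
  throw lastError;
}

// Usage sketch: retry only the non-streaming part of the call.
// const answer = await withRetry(() => askQuestion(query, context));
```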
**597-626**: **Check for consistent naming or usage**

In the `baselineRAGJson` function, the response property is called `"output"` whereas other similarly structured functions (e.g., `askQuestionWithCitations`) name it differently. For consistency and maintainability, consider standardizing the JSON response structure across these streaming or analysis functions.

server/distBE/ai/prompts.js (1)
**180-194**: **Ambiguity with rewriting personal queries**

The instructions in `rewriteQuerySystemPrompt` mention replacing first-person pronouns with the user's name or email. Verify that it is desired behavior to potentially expose an internal user name or email in logs or queries for all contexts. This might inadvertently reveal personal information.

server/distBE/errors/index.js (1)
**2-5**: **Inconsistent naming for enum entries**

`DbOp["Create"]` uses PascalCase for the enum string, while `DbOp["READ"]` uses uppercase. This inconsistency could create confusion or bugs. Consider standardizing the enum value naming (e.g., `Create`, `Read`, `Update`, `Delete`).

```diff
- DbOp["Create"] = "Create";
- DbOp["READ"] = "Read";
+ DbOp["Create"] = "Create";
+ DbOp["Read"] = "Read";
```

server/distBE/integrations/google/index.js (6)
**112-140**: **Handle errors for partial retrievals more seamlessly.**

Within `listUsersByEmails()`, an error during user retrieval logs a warning and continues iterating, which is great for graceful handling. However, consider an optional aggregated error report for skipped users. This helps in post-retrieval analysis of any recurring issues (e.g. misconfiguration, invalid emails).

```diff
 for (const email of emails) {
   try {
     const res = await retryWithBackoff(() =>
       admin.users.get({ userKey: email }),
-    )
+    );
     users.push(res.data);
   } catch (error) {
     Logger.warn(`User ${email} not found: ${error}`);
+    // Optional: keep track of failed lookups
+    // skippedEmails.push({ email, errorMsg: error.toString() });
   }
 }
```
**173-257**: **Validate concurrency or error handling for multiple sync jobs.**

The `syncGoogleWorkspace()` method assumes there's exactly one sync job in the database. If your usage grows, or if a second job is triggered accidentally, it throws an error and fails the job. Ensure that the logic for concurrency and job scheduling is well-documented or well-tested to manage potential collisions or overlapping sync tasks as adoption scales.
**522-658**: **Ensure OAuth token refresh logic is robust.**

`handleGoogleOAuthIngestion()` sets the credentials once, but if the job runs longer than the token's lifetime, the token may expire. Consider proactively refreshing or verifying the token's validity mid-flow to avoid partial ingestions and retries.
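For reference, a hedged sketch of proactive refresh using the googleapis OAuth2 client. The `clientId`, `clientSecret`, `redirectUri`, and `storedRefreshToken` values are placeholders, and the ingestion job in this PR may structure this differently:

```ts
import { google } from "googleapis";

// Hypothetical: build a client that can refresh itself mid-job.
const oauth2Client = new google.auth.OAuth2(clientId, clientSecret, redirectUri);
oauth2Client.setCredentials({ refresh_token: storedRefreshToken });

// Persist refreshed tokens whenever the client rotates them.
oauth2Client.on("tokens", (tokens) => {
  if (tokens.access_token) {
    // e.g. update the connector's stored credentials here
  }
});

// Before each long-running batch, this forces a refresh if the current
// access token is expired or close to expiring.
await oauth2Client.getAccessToken();
```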
**714-898**: **Improve partitioning logic for large user sets.**

`handleGoogleServiceAccountIngestion()` concurrently processes each user in an entire workspace. If a workspace has thousands of users, it might still be heavy for upstream resources. Consider partitioning the users or using dynamic concurrency limits based on resource usage to optimize throughput and avoid potential throttling by Google APIs.
**1453-1528**: **Investigate possible memory overhead on PDF ingestion.**

`googlePDFsVespa()` downloads and processes PDFs concurrently with `p-limit`. Large or numerous PDFs may still stress memory if processed in parallel. Consider logging memory usage or restricting concurrency further if ephemeral memory spikes are observed in production.
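Illustrative only: the concurrency value, file list shape, and `processPdf` helper below are assumptions. This is the general `p-limit` pattern with a heap log per item, not the code from `googlePDFsVespa()`:

```ts
import pLimit from "p-limit";

// Hypothetical: cap PDF processing to 2 at a time and log heap usage,
// so memory spikes can be correlated with concurrency in production.
const limit = pLimit(2);

const results = await Promise.all(
  pdfFiles.map((file) =>
    limit(async () => {
      const result = await processPdf(file); // assumed helper
      const heapMb = Math.round(process.memoryUsage().heapUsed / 1024 / 1024);
      Logger.info(`Processed ${file.name}, heapUsed=${heapMb}MB`);
      return result;
    }),
  ),
);
```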
**1749-1780**: **Refine or parameterize file query filters.**

`listFiles()` uses a simple `q: "trashed = false"` to list non-trashed files. Consider parameterizing the search query to exclude large or irrelevant files (e.g., AI Studio folders, archived items, app profiles) if known up front. This can save bandwidth and processing time.

server/distBE/search/types.js (3)
**208-213**: **Correct typographical error in "EventAtatchment."**

The interface is named `EventAtatchment` but references "attachment." Consider renaming to maintain consistency:

```diff
-const EventAtatchment = z.object({
+const EventAttachment = z.object({
   fileId: z.string(),
   title: z.string(),
   fileUrl: z.string(),
   mimeType: z.string(),
 });
```
**2-5**: **Clarify placeholder schema usage.**

The exports `fileSchema`, `userSchema`, etc. are string placeholders. If these strings do not map to actual validation schemas, consider inlining Zod definitions or adding clarifying comments to prevent confusion in future maintenance.
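To make the distinction concrete, a small hedged example of the two options; the field names are invented for illustration and are not the project's actual schemas:

```ts
import { z } from "zod";

// Option A: keep the string, but document that it is a schema *name*,
// not a validator.
export const fileSchema = "file"; // document type name, illustrative value

// Option B: if validation is actually intended, define a real Zod schema
// alongside (or instead of) the name.
export const FileDocSchema = z.object({
  docId: z.string(),
  title: z.string(),
  chunks: z.array(z.string()),
});
export type FileDoc = z.infer<typeof FileDocSchema>;
```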
**38-38**: **Evaluate the completeness of enumerations.**

The set of `Apps` is still partial (e.g., `GoogleWorkspace`, `Notion`, `GoogleCalendar`, etc.). If your application supports additional integrations or sub-apps, ensure `Apps` remains updated or adopt a more dynamic architecture if product expansions are anticipated.

server/distBE/integrations/google/ws.js (1)
**1-2**: **Exporting WebSocket Connections Map Correctly**

The new constant `wsConnections` is clearly exported and correctly initialized as a new `Map`. This design choice is well-suited for managing WebSocket connections in a key-value structure. Consider adding a brief JSDoc comment to explicitly document what keys and values are expected, which will help future contributors and maintainers understand its intended usage.

server/distBE/shared/config.js (1)
**1-7**: **Environment Configuration Setup is Appropriate**

This file properly loads environment variables via `dotenv.config()` and exports a default configuration object containing `serverUrl`. The use of a fallback value of `"http://localhost:3000"` ensures the application has a default behavior when the environment variable is missing. For improved clarity and future maintainability, you might consider adding inline documentation or JSDoc annotations describing the configuration parameters, especially if more configurations are expected to be added later.
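A documented variant of such a config module might look like the sketch below; the `SERVER_URL` environment variable name and the commented-out extra key are assumptions, not the module's actual contents:

```ts
import dotenv from "dotenv";

dotenv.config();

/**
 * Shared runtime configuration.
 * Values are read once at startup from the environment.
 */
const config = {
  /** Base URL the frontend uses to reach this server. */
  serverUrl: process.env.SERVER_URL ?? "http://localhost:3000",
  /** Example of a future, documented addition (hypothetical). */
  // vespaUrl: process.env.VESPA_URL ?? "http://localhost:8080",
};

export default config;
```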
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `package-lock.json` is excluded by `!**/package-lock.json`
- `server/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (75)
package.json
(1 hunks)server/distBE/ai/contants.js
(1 hunks)server/distBE/ai/context.js
(1 hunks)server/distBE/ai/eval.js
(1 hunks)server/distBE/ai/mappers.js
(1 hunks)server/distBE/ai/prompts.js
(1 hunks)server/distBE/ai/provider/base.js
(1 hunks)server/distBE/ai/provider/bedrock.js
(1 hunks)server/distBE/ai/provider/fireworks.js
(1 hunks)server/distBE/ai/provider/fireworksClient.js
(1 hunks)server/distBE/ai/provider/gemini.js
(1 hunks)server/distBE/ai/provider/index.js
(1 hunks)server/distBE/ai/provider/ollama.js
(1 hunks)server/distBE/ai/provider/openai.js
(1 hunks)server/distBE/ai/provider/together.js
(1 hunks)server/distBE/ai/types.js
(1 hunks)server/distBE/api/admin.js
(1 hunks)server/distBE/api/auth.js
(1 hunks)server/distBE/api/chat.js
(1 hunks)server/distBE/api/oauth.js
(1 hunks)server/distBE/api/proxy.js
(1 hunks)server/distBE/api/search.js
(1 hunks)server/distBE/chunks.js
(1 hunks)server/distBE/cleanup.js
(1 hunks)server/distBE/config.js
(1 hunks)server/distBE/db/chat.js
(1 hunks)server/distBE/db/client.js
(1 hunks)server/distBE/db/connector.js
(1 hunks)server/distBE/db/customType.js
(1 hunks)server/distBE/db/message.js
(1 hunks)server/distBE/db/oauthProvider.js
(1 hunks)server/distBE/db/schema.js
(1 hunks)server/distBE/db/seed.js
(1 hunks)server/distBE/db/syncHistory.js
(1 hunks)server/distBE/db/syncJob.js
(1 hunks)server/distBE/db/user.js
(1 hunks)server/distBE/db/workspace.js
(1 hunks)server/distBE/doc.js
(1 hunks)server/distBE/drizzle.config.js
(1 hunks)server/distBE/errors/index.js
(1 hunks)server/distBE/eval/evaluate.js
(1 hunks)server/distBE/eval/processDataForVespa.js
(1 hunks)server/distBE/eval/qrelsToTrecFormat.js
(1 hunks)server/distBE/integrations/google/config.js
(1 hunks)server/distBE/integrations/google/gmail-worker.js
(1 hunks)server/distBE/integrations/google/gmail/index.js
(1 hunks)server/distBE/integrations/google/gmail/quote-parser.js
(1 hunks)server/distBE/integrations/google/index.js
(1 hunks)server/distBE/integrations/google/pdf-utils.js
(1 hunks)server/distBE/integrations/google/sync.js
(1 hunks)server/distBE/integrations/google/tracking.js
(1 hunks)server/distBE/integrations/google/utils.js
(1 hunks)server/distBE/integrations/google/worker-utils.js
(1 hunks)server/distBE/integrations/google/ws.js
(1 hunks)server/distBE/integrations/types/gmail.js
(1 hunks)server/distBE/logger/index.js
(1 hunks)server/distBE/mime.js
(1 hunks)server/distBE/pipeline.js
(1 hunks)server/distBE/queue/index.js
(1 hunks)server/distBE/search/mappers.js
(1 hunks)server/distBE/search/types.js
(1 hunks)server/distBE/search/vespa.js
(1 hunks)server/distBE/search/vespaClient.js
(1 hunks)server/distBE/server.js
(1 hunks)server/distBE/shared/config.js
(1 hunks)server/distBE/shared/types.js
(1 hunks)server/distBE/types.js
(1 hunks)server/distBE/utils.js
(1 hunks)server/distBE/utils/encryption.js
(1 hunks)server/distBE/utils/index.js
(1 hunks)server/distBE/utils/retry-async.js
(1 hunks)server/distBE/vespa/replaceDIMS.js
(1 hunks)server/package.json
(3 hunks)server/tests/fetchRetry.test.ts
(1 hunks)server/tsconfig.json
(1 hunks)
🔇 Additional comments (85)
server/distBE/cleanup.js (2)
1-33
: Check list of elements for thoroughness
These elements are comprehensively listed to strip potentially harmful or undesired DOM nodes. Confirm that this covers all relevant embedded content types needed for your use case (e.g., custom elements or newer HTML5 elements).
45-56
:❓ Verification inconclusive
Potential performance overhead
Looping over every element in the document and removing them on-the-fly could be expensive for large or deeply nested DOMs. Verify that this cleanup is intentionally scoped and tested for performance.
🏁 Script executed:
```bash
#!/bin/bash
# Checking usage references of cleanup function
rg -A 5 "cleanup("
```

Length of output: 87
Action: Verify Cleanup Function Usage and Performance Impact
It appears that the original reference search for the
cleanup
function was affected by a regex error (unescaped(
). Please re-run the above script (or verify manually) to confirm if there are any usage references for thecleanup
function. Additionally, since the function iterates over the entire DOM to remove elements and attributes on-the-fly, double-check that this cleanup is intentionally scoped and that performance has been appropriately tested for large or deeply nested DOMs.
🏁 Script executed:
```bash
#!/bin/bash
# Re-check cleanup function usage with corrected regex pattern (escaped parenthesis)
rg -A 5 "cleanup\(" .
```

Length of output: 570
Review Comment Revision: Performance Overhead in Cleanup Function
The cleanup function in both
server/cleanup.ts
and its transpiled versionserver/distBE/cleanup.js
iterates over every DOM element usingdocument.querySelectorAll("*")
and performs removal operations. This could introduce performance overhead, especially for large or deeply nested DOMs.
- Confirm that this function is only applied in scenarios with manageable DOM sizes or include performance tests to validate its impact.
- Consider documenting the expected scope and limitations of this cleanup process to ensure maintainability.
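As an illustration of one way to bound the work, a sketch that removes only known-unwanted selectors instead of walking every node; the `ELEMENTS_TO_REMOVE` name and list here are assumptions, not the actual constants in `cleanup.ts`:

```ts
// Hypothetical: remove only the listed selectors rather than iterating
// the entire DOM with document.querySelectorAll("*").
const ELEMENTS_TO_REMOVE = ["script", "style", "iframe", "object", "embed"]; // assumed list

export function cleanupScoped(root: Document | Element): void {
  for (const selector of ELEMENTS_TO_REMOVE) {
    // A targeted query visits far fewer nodes than a full-document walk.
    root.querySelectorAll(selector).forEach((el) => el.remove());
  }
}
```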
server/distBE/vespa/replaceDIMS.js (1)
10-21
:✅ Verification successful
Regex approach might be too broad
This regex replaces anyv[...]
occurrences with the given dimension. Verify that no unrelated placeholders or texts will be unintentionally changed.
🏁 Script executed:
```bash
#!/bin/bash
# Searching for usage of v[...] patterns across .sd files to ensure they are exclusively for dimension placeholders
fd -e sd --exec rg -A 2 "v\[\w+\]"
```

Length of output: 1662
Regex Pattern Verified – No Unintended Replacements Detected
The regex
/v\[(?:DIMS|\d+)\]/g
inserver/distBE/vespa/replaceDIMS.js
(lines 10–21) was inspected against our Vespa schema files. The search revealed that the only matches are the expected dimension placeholders (e.g.,v[DIMS]
), with no evidence of unrelated or accidental matches. This confirms that the regex is scoped appropriately for its intended purpose.
- File Reviewed:
server/distBE/vespa/replaceDIMS.js
- Verified Occurrences: The pattern appears only in valid dimension placeholder contexts in our
.sd
files.There is no further action required regarding this regex.
server/distBE/doc.js (5)
14-20
: Minimal error handling
Accessingelement.textRun?.content
is mostly safe, but you could add a minor check or fallback for clarity. Overall, the implementation is fine.
85-89
: Markdown heading logic
Generating headings via#
symbols based onHEADING_x
styles is straightforward and correct.
207-217
: Footnote references mismatch repeated
As noted,footnoteNumber
references may not align withfootnoteId
inextractFootnotes
. A consistent scheme is needed.
219-236
: Headers and footers extraction
Logic is consistent with footnotes. Works well for capturing header/footer content in a straightforward manner.
110-151
:❓ Verification inconclusive
Verify performance on large documents
This recursive extraction can be heavy for deeply nested or lengthy documents. Suggest performance testing or chunk processing.
🏁 Script executed:
```bash
#!/bin/bash
# Searching for usage of extractText to understand possible large doc scenarios
rg -A 5 "extractText("
```

Length of output: 99
Below is a revised review comment that reflects the need for further verification using an updated search query. In particular, the original
rg
command failed due to an unescaped parenthesis. We now recommend re-running the search with the proper escaping to verify usage patterns ofextractText
for performance considerations. Please verify the output manually after running the re-run command.
Action Required: Verify Large Document Performance
The recursive extraction in
extractText
may be performance intensive for deeply nested or long documents. The initial search for its usage produced a regex error due to an unescaped parenthesis. Please re-run the following command to search for its usage across the codebase and manually verify if there are scenarios where chunk processing or other optimizations might be required:Ensure the output is checked for instances of heavy usage in large document contexts.
🏁 Script executed:
```bash
#!/bin/bash
# Re-run the search for extractText usage with properly escaped parenthesis.
rg -A 5 "extractText\("
```

Length of output: 4878
Action Required: Assess Recursive Extraction for Performance
Our verification indicates that the recursive function
extractText
is widely used across the codebase (e.g., inserver/doc.ts
,server/distBE/doc.js
, and within Google integration modules). This reinforces the original concern: in cases of deeply nested or large documents, the recursion may become a performance bottleneck. Although one instance already employs chunk processing (viachunkDocument
), it remains essential to profile and test these paths to ensure acceptable performance.
- Action Item: Run performance tests on large and heavily nested documents to validate that
extractText
scales efficiently.- Consideration: Investigate whether additional chunking or iterative approaches might further optimize extraction where needed.
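If profiling does show deep-recursion cost, an iterative traversal is one option. The sketch assumes a generic nested element shape (`text`/`children`), not the exact Google Docs structure that `extractText` walks:

```ts
// Hypothetical iterative (stack-based) text extraction to avoid deep
// recursion on heavily nested documents.
type DocNode = { text?: string; children?: DocNode[] };

function extractTextIterative(root: DocNode): string {
  const parts: string[] = [];
  const stack: DocNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    if (node.text) parts.push(node.text);
    if (node.children) {
      // Push in reverse so children are visited in document order.
      for (let i = node.children.length - 1; i >= 0; i--) {
        stack.push(node.children[i]);
      }
    }
  }
  return parts.join("");
}
```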
server/distBE/drizzle.config.js (1)
5-5
: Verify TypeScript schema file compatibilityThe schema path points to a TypeScript file (
./db/schema.ts
), but this is a JavaScript config file. Make sure the TypeScript file is properly compiled and available at runtime to avoid potential issues with the ORM configuration.server/distBE/mime.js (1)
1-186
: Comprehensive MIME type mapping looks good.The mapping is extensive and covers a wide range of file types with appropriate keywords for search functionality. The comments for ambiguous mappings provide helpful context.
server/package.json (2)
42-42
: Addition of dotenv for environment variable management.Adding dotenv is a good practice for managing environment variables in Node.js applications.
8-8
:✅ Verification successful
Verify that added dependencies are compatible with the Node environment.
The addition of
@faker-js/faker
,p-queue
, andtsc-alias
seems aligned with the PR objective to run server JS files in Node.Also applies to: 12-13
🏁 Script executed:
```bash
#!/bin/bash
# Check if the added dependencies are compatible with Node.js
echo "Checking Node.js compatibility for added dependencies:"
npm info @faker-js/faker engines
npm info p-queue engines
npm info tsc-alias engines
```

Length of output: 285
Dependency Compatibility Verified
The shell script confirmed that:
- @faker-js/faker requires Node >=18.0.0
- p-queue requires Node >=18
- tsc-alias requires Node >=16.20.2
Given our server targets Node (with an expected version of 18 or later), these dependencies are compatible as used in
server/package.json
(Lines 8, and 12–13). No further modifications are needed.server/distBE/ai/mappers.js (2)
235-255
: Model to provider mapping is clear and consistent.The mapping between models and their providers is well-structured and aligns with the models defined in the
modelDetailsMap
.
2-234
:❓ Verification inconclusive
Comprehensive model details map with pricing information.
The
modelDetailsMap
provides detailed pricing information for various AI models, which is essential for cost tracking and billing. The structure is consistent for most models.Some models have inconsistent pricing structures. Verify if this is intentional:
🏁 Script executed:
```bash
#!/bin/bash
# Check for models that only have onDemand pricing but not batch pricing
grep -A 10 "onDemand" server/distBE/ai/mappers.js | grep -B 10 -A 2 "}"
```

Length of output: 6157
Pricing Structure Consistency – Verification Needed
After running the verification script, it’s clear that several models in
server/distBE/ai/mappers.js
use onlyonDemand
pricing without a matchingbatch
pricing structure. Specifically, please verify if the following models are intentionally configured this way:
- [Models.CohereCmdRPlus]
- [Models.CohereCmdR]
- [Models.DeepSeek_R1]
- [Models.Mistral_Large]
If these omissions are by design (e.g., due to the model or provider not supporting batch pricing), consider adding an inline comment to clarify this intentional difference for future maintainers.
server/distBE/db/syncJob.js (2)
1-20
: LGTM: Well-structured implementation of insertSyncJob functionThe function properly generates a unique external ID, inserts the job into the database, and validates the result against the schema. Error handling is appropriate and the function follows best practices for database operations.
21-27
: LGTM: getAppSyncJobs implementation looks goodThe function correctly filters jobs by app and authentication type and validates the results with Zod. Good use of the array schema for validating multiple results.
server/tsconfig.json (2)
17-19
: Enabling JavaScript output for Node.js executionThe commenting out of these TypeScript-only options (
allowImportingTsExtensions
,verbatimModuleSyntax
, andnoEmit
) allows the compiler to generate JavaScript files instead of just type-checking. This aligns with the PR goal of running server JS files in Node.
28-29
: Changed output directory to support Node.js executionChanging the output directory from the commented-out
./distBE
to the active./distFinal
designates where the compiled JavaScript files will be placed. This is essential for the "Run Server JS files in Node" objective.server/distBE/config.js (2)
36-38
: Plan to address the TODO comment in a future updateThe TODO comment indicates a planned refactoring of model environment variables to simplify configuration. Make sure to track this for a future update.
This comment indicates upcoming changes to environment variable handling for models. Consider creating a tracking issue to ensure this refactoring is completed.
40-81
: LGTM: Well-structured model priority logicThe priority logic for AI models (AWS > OpenAI > Ollama > Together > Fireworks > Gemini) is well implemented with clear conditional blocks. The code properly sets default models based on available credentials.
server/distBE/db/syncHistory.js (1)
1-18
: LGTM: Well-implemented insertSyncHistory functionThe function follows the same pattern as other database operations, generating a unique ID, inserting the record, and validating the result. The error handling is appropriate and the function returns the validated data.
server/distBE/db/customType.js (1)
2-2
:✅ Verification successful
Validate relative import path for Encryption module.
Ensure
"../utils/encryption.js"
is correct and won't cause import resolution issues, especially if the file structure changes.
🏁 Script executed:
#!/bin/bash # Description: Verify that encryption.js exists in the expected location. fd encryption.js --exec ls -l {}Length of output: 121
Relative Import Path Verified
The relative import in
server/distBE/db/customType.js
at line 2 correctly points to../utils/encryption.js
since theencryption.js
file is located atserver/distBE/utils/encryption.js
. No changes are needed at this time. Please remember to adjust the import if the project’s file structure is refactored in the future.server/distBE/eval/evaluate.js (1)
37-55
: Handle JSON parse failures in input lines.If a line in the queries file isn't valid JSON,
JSON.parse
will throw an error. Consider wrapping the parse call in a try/catch to gracefully handle malformed inputs instead of crashing.Would you like me to generate a fix so the script can skip or report invalid JSON lines?
server/distBE/logger/index.js (1)
63-82
: Consider clarifying request ID handling.Currently, the middleware overwrites
requestId
if already present inc.req
. Ensure the fallback logic doesn't unintentionally replace a valid pre-existing ID. If user code setsc.req.requestId
for some reason, confirm you want to overwrite it with a new UUID or keep the one already set.server/distBE/db/seed.js (1)
7-7
: Validate logging usage of theSubsystem.Db
.Make sure your other code references and uses the correct subsystem name. If this subsystem key is inconsistent elsewhere, update it for consistency.
server/distBE/search/mappers.js (1)
139-167
: Verify grouping assumptions for multiple groups.
handleVespaGroupResponse
only handles the firstgroup:root
and firstgrouplist:app
. If multiple root or app groups appear in the response, they will be ignored. Confirm that this is intentional. Otherwise, consider iterating through all group entries.server/distBE/api/oauth.js (1)
40-44
: Confirm user existence checks align with your business logic.You are actively throwing a
NoUserFound
error upon an empty user array. Ensure the flow to create new users (if applicable) or handle guest access is in place, as this block terminates the callback otherwise.server/distBE/ai/provider/gemini.js (2)
19-26
: Confirm the intended role mapping for chat messages.In
converse
, the role remains as-is. InconverseStream
,"assistant"
is remapped to"model"
. Verify this difference is intentional to avoid inconsistent conversation flows.Also applies to: 56-59
39-41
: Check whether the text access is synchronous.
response.response.text()
may be asynchronous or a promise-based method, depending on the underlying SDK. Confirm that calling it directly returns a string; otherwise, it should be handled withawait
.server/distBE/ai/provider/bedrock.js (1)
11-14
: Validate the client object in the constructor.
Currently, the constructor accepts theclient
parameter without checks. Consider validating or defaulting this object to avoid null references or misconfiguration.server/distBE/server.js (1)
61-63
: Verify environment variable configurations.
You are loading environment variables viadotenv.config()
. Confirm that essential variables likeGOOGLE_CLIENT_ID
,GOOGLE_CLIENT_SECRET
, andJWT_SECRET
are set for all deployment environments to avoid runtime failures.server/distBE/integrations/google/gmail/quote-parser.js (3)
1-30
: Check for internationalized email cases.
The regexes defined for headers and signatures may fail to detect email patterns in other locales. Consider adding or testing against non-English header and signature lines if this parser is used globally.
31-59
: Fragment class implementation looks good.
The approach for reversing lines and storing them incontent
is straightforward and efficient. This class is well-defined for capturing email segments.
162-174
: Validate URL handling errors inreplaceLinks
.
Although you handle parsing errors with a try/catch, confirm that edge cases (e.g., unusual domain formats or partial matches) are correctly surfaced or sanitized.server/distBE/integrations/google/pdf-utils.js (1)
1-5
: Excellent setup of logging and error imports.
The import structure is well-organized, and leveragingSubsystem.Integrations
in the logger context provides clear debug context.server/distBE/ai/provider/openai.js (2)
1-10
: Robust import structure and class definition.
The setup forOpenAIProvider
properly extends theBaseProvider
and referencesmodelDetailsMap
. No concerns found here.
63-97
: Handle potentially undefined usage within chunked responses.
Ifchunk.usage
is unexpectedly absent, referencingchunk.usage.prompt_tokens
will cause errors. Include a safety check or default.if (chunk.usage && !costYielded) { costYielded = true; yield { text: "", metadata: "", - cost: calculateCost({ - inputTokens: chunk.usage.prompt_tokens, - outputTokens: chunk.usage.completion_tokens, - }, modelDetailsMap[modelParams.modelId].cost.onDemand), + cost: chunk.usage + ? calculateCost({ + inputTokens: chunk.usage.prompt_tokens || 0, + outputTokens: chunk.usage.completion_tokens || 0, + }, modelDetailsMap[modelParams.modelId].cost.onDemand) + : 0, }; }server/distBE/ai/provider/together.js (2)
23-23
: Handle emptyv.content
arrays.Here, we assume
v.content
always has an element at index 0. Ifv.content
is an empty array, attempting to accessv.content[0].text
will throw an error. Consider adding a fallback or checking for an empty array:...messages.map((v) => ({ - content: v.content ? v.content[0].text : "", + content: v.content && v.content[0] ? v.content[0].text : "", role: v.role, }))
34-36
: Check for emptychoices
in the response.Accessing
response.choices[0]
might throw ifchoices
is an empty array or missing. Consider a defensive check:- text: response.choices[0].message?.content || "", + text: response.choices?.[0]?.message?.content ?? "",server/distBE/api/admin.js (1)
`126-128`: Validate form fields for service connections.

Accessing `form["service-key"].text()` and `form.email` can fail if these fields are missing or not properly formatted. Consider validating these properties before use to avoid runtime errors.

server/distBE/integrations/google/sync.js (6)
`81-126`: Validate concurrency handling for sheet deletions.
`deleteUpdateStatsForGoogleSheets` iterates over potentially missing sheets and deletes them. If multiple sync jobs run concurrently, race conditions may arise when deleting or adding sheets. Consider adding locks or additional checks to handle concurrent sync tasks safely; a sketch of one possible lock follows.
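For illustration, a rough sketch of one way to serialize work per spreadsheet with an in-process lock; the `withSpreadsheetLock` helper and its call site are hypothetical and not part of this PR:

// Hypothetical per-key mutex: serializes async work for the same spreadsheet ID.
const sheetLocks = new Map<string, Promise<void>>()

async function withSpreadsheetLock<T>(
  spreadsheetId: string,
  task: () => Promise<T>,
): Promise<T> {
  const previous = sheetLocks.get(spreadsheetId) ?? Promise.resolve()
  let release!: () => void
  const current = new Promise<void>((resolve) => { release = resolve })
  const chained = previous.then(() => current)
  sheetLocks.set(spreadsheetId, chained)

  await previous
  try {
    return await task()
  } finally {
    release()
    // Drop the entry if no newer task has queued behind us.
    if (sheetLocks.get(spreadsheetId) === chained) {
      sheetLocks.delete(spreadsheetId)
    }
  }
}

// Hypothetical usage inside the sync job:
// await withSpreadsheetLock(spreadsheetId, () => deleteMissingSheets(spreadsheetId))

Note that this only serializes work within a single process; concurrent sync jobs across processes would need a database-level or distributed lock instead.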
`127-169`: Confirm permission assumptions for spreadsheet deletion.
This approach deletes all sheets if a single user with sole permission has lost access. Verify that the user's email always matches `permissions[0]`; otherwise, we throw an error. If any mismatch can happen in practice, handle partial removal or user mismatch more gracefully.

`333-353`: Graceful handling of null return.
`getDriveChanges` can return `null` if an error occurs. Ensure the calling code thoroughly checks for `null` to avoid potential runtime errors or missed updates.

`877-997`: Verify handling of label updates.
When labels are added/removed, code merges or filters them. Ensure concurrency does not cause the same label changes to be processed twice.

`1367-1374`: Initialization function looks good.
`newStats()` is a straightforward helper for counters and keeps the code DRY.

`1375-1381`: Stats merging is efficient.
`mergeStats` is concise and meets typical aggregator needs.

server/distBE/integrations/google/gmail-worker.js (2)
`49-69`: Worker event logic is inactive if lines 1–23 remain commented out.
Ensure the necessary imports (e.g., `Logger`, `MessageTypes`) are uncommented or provided if `self.onmessage` is to function fully. Otherwise, the code may reference undefined objects like `Logger`.

`288-317`: Recursive body extraction is good.
`getBody` gracefully recurses through nested parts. This is a robust approach for multi-part emails. Keep an eye on extremely deep nesting to avoid potential stack issues, but it's likely acceptable for typical email structures.

server/distBE/integrations/google/gmail/index.js (1)

`110-231`: Robust error class usage.
`EmailParsingError` is thrown if `From` is missing. Great approach. Also consider adding additional metadata for debugging (like the raw header contents).

server/distBE/ai/provider/fireworksClient.js (2)
`12-45`: Good use of AbortController with a timeout.

The `_makeRequest` method correctly uses an `AbortController` to handle timeouts, ensuring the request is canceled to prevent resource leaks or indefinite hangs. This is an excellent pattern for robust HTTP requests.
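For readers unfamiliar with the pattern, a generic sketch of a timeout-guarded request using `AbortController`; the `fetchWithTimeout` name and the 30-second default are illustrative, not taken from `fireworksClient.js`:

// Illustrative timeout wrapper: the signal cancels the request if the timer fires first.
async function fetchWithTimeout(
  url: string,
  init: RequestInit = {},
  timeoutMs = 30_000,
): Promise<Response> {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    return await fetch(url, { ...init, signal: controller.signal })
  } finally {
    // Always clear the timer so it does not keep the event loop alive.
    clearTimeout(timer)
  }
}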
`72-124`: Verify partial JSON chunk handling in streaming.

The `streamComplete` generator reads and buffers data chunks until encountering a newline, then processes each line. Storing unused parts in `buffer` avoids splitting JSON mid-chunk. This is generally correct, but be aware that malformed JSON or large partial lines may still cause irregularities. Logging any parse failures is likewise helpful.

server/distBE/utils.js (4)
`10-27`: Great file existence check with fallback.
`checkAndReadFile` efficiently checks for file existence and returns `null` for missing files. This pattern gracefully handles optional configuration files. Implementation looks good.

`43-57`: Cookie options for non-production environments.
`setCookieByEnv` properly sets a less secure cookie for non-production. Confirm that no sensitive data is passed to front-ends in these environments.

Are there usage scenarios where you might want HTTP-only cookies in staging or test environments? Consider whether additional warnings or environment checks are needed.

`133-159`: Effective text cleanup utilities.
The text-processing functions (`splitGroupedCitationsWithSpaces`, `removeStopwords`) appropriately handle edge cases like empty or non-string inputs and invalid tokens. No issues spotted here.

`92-132`: ❓ Verification inconclusive

Recursive retry logic is effective; ensure googleOauth2Client is always defined.

The `retryWithBackoff` function implements exponential backoff nicely. Pay attention to the path where `app` is a Google app but `googleOauth2Client` is `undefined`; it will cause an error at line 119. Ensure upstream code always supplies the OAuth2 client for that flow.
🏁 Script executed:
#!/bin/bash
# Search for calls to retryWithBackoff where googleOauth2Client isn't passed
rg --no-heading 'retryWithBackoff\([^,]*,[^,]*,Apps\.GoogleDrive\)'

Length of output: 69

Action: Verify that a valid googleOauth2Client is always provided for Google app flows.

The recursive retry logic implementation in `server/distBE/utils.js` looks solid overall. However, the branch handling 401 errors for Google apps (line 119) depends on a defined `googleOauth2Client` to refresh the OAuth token. Although our automated search for calls using `Apps.GoogleDrive` didn't produce output, this narrow query might miss other valid Google app invocations or different naming patterns. Please verify manually that every call to `retryWithBackoff` for a Google app passes a properly defined OAuth2 client.

- Review the upstream invocations where `IsGoogleApp(app)` evaluates to true and confirm that `googleOauth2Client` is always non-undefined.
- Ensure that missing this parameter is caught early to avoid errors when the code reaches line 119.
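As a point of reference, a minimal exponential-backoff sketch; the real `retryWithBackoff` in `utils.js` additionally refreshes the Google OAuth token on 401 responses, which is deliberately omitted here:

// Generic retry with exponential backoff: 500ms, 1s, 2s, 4s, ...
async function retryWithExponentialBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 5,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      if (attempt >= maxRetries) throw err
      // Delay doubles on each failed attempt before retrying.
      const delayMs = baseDelayMs * 2 ** attempt
      await new Promise((resolve) => setTimeout(resolve, delayMs))
    }
  }
}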
server/distBE/ai/context.js (3)
`1-35`: Consistent approach to constructing context for file and user entities.

Both `constructFileContext` and `constructUserContext` share a consistent style (listing key fields, using `getRelativeTime` for timestamps). This is clear and maintainable. Good job.

`36-51`: Dynamic summary chunk slicing.
`constructMailContext` uses `fields.chunks_summary.slice(0, maxSummaryChunks)`, with a fallback for `maxSummaryChunks`. This logic is straightforward. Ensure that your upstream code handles situations where `chunks_summary` might be undefined or empty.

`212-230`: Validate additional `searchResult.fields.sddocname` types.
`answerContextMap` handles several known schemas. Confirm if you need a fallback for newly introduced schema types or if throwing an error is the desired behavior. This will help avoid unexpected crashes when new entity types are added.

server/distBE/types.js (2)
`10-15`: Consider more robust type transformations for boolean fields.
While using `x === "true"` works, it may silently fail for unexpected input (e.g., uppercase "TRUE"). Consider a stricter check or using `.refine()` to ensure only valid boolean strings are accepted.

Would you like me to verify the usage of `groupCount` transformations elsewhere to confirm correctness?
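A small sketch of the stricter variant, assuming the schema is built with zod (as the surrounding `.refine()` usage suggests); the `booleanString` name is illustrative:

import { z } from "zod"

// Accept only the strings "true"/"false" (case-insensitive) and convert to boolean.
const booleanString = z
  .string()
  .transform((s) => s.trim().toLowerCase())
  .refine((s) => s === "true" || s === "false", {
    message: "expected 'true' or 'false'",
  })
  .transform((s) => s === "true")

booleanString.parse("TRUE") // true
// booleanString.parse("yes") would throw a ZodError instead of silently returning false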
32-35
:❓ Verification inconclusive
Verify partial usage of
app
andentity
fields.
The.refine()
requirement ensuresapp
andentity
must appear together. Double-check that no callers pass only one of them unintentionally.
🏁 Script executed:
#!/bin/bash # Confirm that all calls either set both 'app' and 'entity' or neither ast-grep --pattern $'searchSchema(parse($_))'Length of output: 46
Manual Verification Required for Partial
app
/entity
Usage
The schema’s.refine()
logic correctly mandates that bothapp
andentity
must be provided together. However, our initial automated search didn't yield any usage instances, which might be due to the search pattern not matching all call formats. Please manually verify that no caller inadvertently supplies only one of these fields.
- Action Required:
Manually review all usages ofsearchSchema
(or its.parse
/validation calls) to ensure that bothapp
andentity
are always provided together or omitted altogether.server/distBE/db/user.js (2)
13-31
: Confirm handling of soft-deleted records.
The function retrieves user and workspace records without filtering out any possibledeletedAt
values. If soft deletes exist, consider filtering them out if that aligns with your requirements.
89-99
: Inconsistent not-found handling with other queries.
Here we throw an error if the user doesn’t exist, and parse withselectUserSchema
. This is good. Consider aligning other getter methods (e.g.,getUserByEmail
) with this approach for consistent error handling and schema validation.server/distBE/db/chat.js (3)
12-20
: Validate inserted chat.
The code inserts a chat and then parses only the first returned row. This is fine for single inserts, but consider handling unexpected empty arrays with a more descriptive error or fallback.
29-35
: Consistent error messages.
When the chat is not found, the error message is “Chat not found.” This is consistent, but ensure your consumer can differentiate between different resource types if more complex handling is needed.
78-88
: Consistent updatedAt logic.
updateMessageByExternalId
setsupdatedAt
to the current date. If other update functions do the same, confirm uniform usage and handling of these timestamps.server/distBE/ai/types.js (5)
4-12
: Naming and grouping providers.
YourAIProviders
enum looks well-defined. If you plan to add new providers frequently, consider grouping them logically or establishing a pattern for expansions.
50-54
: Add optional usage note for mention arrays.
mentionedNames
andmentionedEmails
might grow large. If you plan to store or process them further, consider performance or memory aspects.
55-59
: Rewrite schema coverage.
rewrittenQueries
is optional. Ensure that your logic gracefully handles cases where it’s absent, especially if subsequent code expects rewrites to exist.
67-72
: Uniform time filtering approach.
InFiltersSchema
, you allow optionalstartTime
/endTime
. Make sure the consumer code can handle when only one of these is provided or neither.
81-97
: Document the necessity for discriminated union.
It’s clear you’re using “type” as the key. Just ensure your devs know how to handle newly added query types, as they’ll need to extend this union.server/distBE/integrations/google/utils.js (1)
`153-190`: Consider validating or sanitizing PDF file names prior to use.

If `pdfFile?.name` can contain special characters or path manipulations, it might pose a risk. Sanitizing or validating the filename ensures safe file operations.

server/distBE/api/chat.js (2)

`140-158`: Validate that both messages and the chat record exist before deletion.

Currently, `ChatDeleteApi` proceeds with deletes even if the chat or messages do not exist. Consider verifying their existence or adjusting the approach if a missing chat is a valid scenario.

`792-888`: Guard against ambiguous partial JSON parsing.

When accumulating chunks in the SSE stream, partial JSON can trigger parse errors. This logic waits to parse until the buffer is complete, but confirm that large or chunked responses don't cause partial parse attempts.

server/distBE/shared/types.js (1)

`134-153`: Perform deeper validation for file-related data.
`FileResponseSchema` picks and extends some essential fields. Consider validating optional fields (e.g., `mimeType`) to ensure they are valid content types, or handle them gracefully in the calling code.

server/distBE/db/schema.js (2)
`38-38`: Potential issue with unique constraint on `createdBy`
The `createdBy` field is marked as `unique`, which will restrict multiple workspaces from being created by the same user. This might not be the intended functionality if a single user can create multiple workspaces.

Do we expect only one workspace per creator? If not, consider removing or revisiting the unique constraint.

`118-120`: Validate uniqueness of combination
The `uniqueConnector` constraint enforces a unique combination of `(workspaceId, userId, app, authType)`. Verify that this aligns with your requirement, especially if a user could need multiple connector instances for the same app and auth type.

server/distBE/search/vespaClient.js (1)
`19-44`: Confirm Node fetch availability
This file uses the global `fetch` API, which is available in Node v18+ by default. If you plan to support older Node versions, ensure a polyfill like `node-fetch` is installed and imported.

server/distBE/search/vespa.js (1)

`481-481`: Potential undefined access for `docExist.fields`
In `updateUserQueryHistory`, you access `docExist.fields` without null checks. If `docExist.fields` is unexpectedly undefined, this could throw an error.

Would you like a defensive check to skip the logic if `docExist.fields` is missing?

server/distBE/ai/provider/index.js (1)
`231-291`: Robustness concerns in JSON parsing
`jsonParseLLMOutput()` aggressively manipulates the returned text to isolate JSON. While this is a good approach to handle unpredictable responses, consider carefully validating the resulting structure for security and correctness. Large or malformed JSON could cause performance issues or unhandled parsing errors.

Do you have unit tests that ensure this function behaves as expected when the text has extreme or deliberately malformed content? If not, I can provide a test script for verification.
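One possible shape for that validation, sketched with zod; the `AnswerSchema` fields are made up for illustration and are not the actual output contract:

import { z } from "zod"

const AnswerSchema = z.object({
  answer: z.string(),
  citations: z.array(z.number()).optional(),
})

// Isolate the outermost {...} span, parse it, and validate the shape before use.
function parseLLMJson(raw: string): z.infer<typeof AnswerSchema> | null {
  const start = raw.indexOf("{")
  const end = raw.lastIndexOf("}")
  if (start === -1 || end <= start) return null
  try {
    const candidate = JSON.parse(raw.slice(start, end + 1))
    const result = AnswerSchema.safeParse(candidate)
    return result.success ? result.data : null
  } catch {
    return null
  }
}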
server/distBE/ai/prompts.js (1)
`8-12`: Assess privacy implications of user context
The prompt `askQuestionUserPrompt` includes the user context, which might contain personal data (e.g., name, email). Confirm that exporting or echoing sensitive user details in the system prompts won't pose any privacy or compliance issues.

server/distBE/errors/index.js (1)

`34-36`: IntegrationOp enum declared but never used
The `IntegrationOp` enum is declared but not referenced in subsequent classes or logic, which can lead to confusion. If it's intended for future use, consider leaving a comment about its purpose or removing it until needed.

server/distBE/integrations/google/index.js (2)
`1568-1646`: Extend coverage for empty contact data scenarios.

In `listAllContacts()`, an empty or partial People API response might indicate account-level restrictions or missing permissions. Currently, the code loops until exhaustion. Consider adding a fallback or an alert mechanism if the user or domain typically has contacts but yields an unexpectedly empty response.

Do you want me to generate a script that queries logs or user IDs to confirm whether these partial responses occur frequently?

`1719-1748`: Validate contact mapping logic for special cases.
`insertContactsToVespa()` attempts to handle partial or missing IDs/emails gracefully. Ensure specialized scenarios (like distribution lists without a unique email) are documented or filtered out. This helps maintain consistent indexing without accidental collisions in doc IDs.

server/distBE/search/types.js (2)

`104-118`: Validate optional fields in `VespaFileSchema`.
Some fields (`url`, `owner`, `ownerEmail`, `photoLink`, `mimeType`) are marked nullable or optional. Ensure your indexing or searching logic can handle these null values gracefully (e.g., no unexpected string conversions, logs, or search indexing issues).

`130-162`: Enhance robust validation for user schemas.
`name` is optional. If a user has no name in certain contexts (e.g. external user), ensure your UI or queries handle the undefined `name` field properly. Consider a fallback or label if it's missing.
server/distBE/ai/contants.js
Outdated
@@ -0,0 +1 @@
export {};
Fix filename typo and add actual constants.
The filename appears to be misspelled - "contants.js" instead of "constants.js". Additionally, the file only exports an empty object with no actual constants defined.
Consider:
- Renaming the file to "constants.js"
- Adding the actual AI-related constants that will be used throughout the application
-export {};
+export const DEFAULT_MODEL = "gpt-3.5-turbo";
+export const DEFAULT_TEMPERATURE = 0.7;
+export const MAX_TOKENS = 4096;
server/distBE/cleanup.js
Outdated
const attributesToRemove = [
  "style",
  "src",
  "alt",
  "title",
  "role",
  "aria-",
  "tabindex",
  "on",
  "data-",
];
🛠️ Refactor suggestion
Reevaluate removing accessibility attributes
Eliminating `aria-` attributes can break accessibility for screen readers and other assistive technologies. Consider retaining critical ARIA attributes or implementing a whitelist approach.
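A rough sketch of the whitelist approach, assuming a DOM `Element` (for example from jsdom); the kept attribute list is illustrative and would need to match the product's accessibility requirements:

// Keep a small set of attributes (including all aria-*) and strip the rest.
const KEEP_EXACT = new Set(["role", "alt", "title", "tabindex", "href"])
const KEEP_PREFIXES = ["aria-"]

function stripAttributes(el: Element): void {
  for (const attr of Array.from(el.attributes)) {
    const name = attr.name.toLowerCase()
    const keep = KEEP_EXACT.has(name) || KEEP_PREFIXES.some((p) => name.startsWith(p))
    if (!keep) el.removeAttribute(attr.name)
  }
  for (const child of Array.from(el.children)) {
    stripAttributes(child)
  }
}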
server/distBE/vespa/replaceDIMS.js
Outdated
const getSchemaDefinitionPaths = (directory) => {
  return fs
    .readdirSync(directory)
    .filter((file) => path.extname(file) === ".sd")
    .map((file) => path.join(directory, file));
};
const paths = getSchemaDefinitionPaths("./schemas");
replaceModelDIMS(paths, parseInt(args[0]));
💡 Verification agent
🧩 Analysis chain
Ensure directory existence
If ./schemas
is missing or empty, the script might fail silently or produce no changes. Confirm the directory and file presence before proceeding.
🏁 Script executed:
#!/bin/bash
if [ -d "./schemas" ]; then
ls -la ./schemas
else
echo "No schemas directory found."
fi
Length of output: 87
Action Required: Add Directory Existence Check
After running the verification script, it’s confirmed that the expected "./schemas" directory does not exist. This can lead to a situation where the script fails silently or produces no changes when it attempts to read from a non-existent directory. To resolve this issue, please update the code in server/distBE/vespa/replaceDIMS.js (lines 32-39) to explicitly check for the directory’s existence (and optionally its non-emptiness) before proceeding. For example:
if (!fs.existsSync("./schemas")) {
console.error("Error: The './schemas' directory does not exist.");
process.exit(1);
}
Then proceed with:
const paths = getSchemaDefinitionPaths("./schemas");
replaceModelDIMS(paths, parseInt(args[0]));
This update will ensure that missing or empty directories are caught early and handled gracefully.
nextPageToken = res.data.nextPageToken ?? "" | ||
newSyncTokenCalendarEvents = res.data.nextSyncToken ?? "" | ||
} while (nextPageToken) | ||
if (events.length === 0) { | ||
return { events: [], calendarEventsToken: newSyncTokenCalendarEvents } | ||
} | ||
const confirmedEvents = events.filter((e) => e.status === "confirmed") | ||
// Handle cancelledEvents separately | ||
const cancelledEvents = events.filter((e) => e.status === "cancelled") | ||
// First insert only the confirmed events | ||
for (const event of confirmedEvents) { | ||
const { baseUrl, joiningUrl } = getJoiningLink(event) | ||
const { attendeesInfo, attendeesEmails, attendeesNames } = | ||
getAttendeesOfEvent(event.attendees ?? []) | ||
const { attachmentsInfo, attachmentFilenames } = getAttachments( | ||
event.attachments ?? [], | ||
) | ||
const { isDefaultStartTime, startTime } = getEventStartTime(event) | ||
const eventToBeIngested = { | ||
docId: event.id ?? "", | ||
name: event.summary ?? "", | ||
description: getTextFromEventDescription(event?.description ?? ""), | ||
url: event.htmlLink ?? "", // eventLink, not joiningLink | ||
status: event.status ?? "", | ||
location: event.location ?? "", | ||
createdAt: new Date(event.created).getTime(), | ||
updatedAt: new Date(event.updated).getTime(), | ||
app: Apps.GoogleCalendar, | ||
entity: CalendarEntity.Event, | ||
creator: { | ||
email: event.creator?.email ?? "", | ||
displayName: event.creator?.displayName ?? "", | ||
}, | ||
organizer: { | ||
email: event.organizer?.email ?? "", | ||
displayName: event.organizer?.displayName ?? "", | ||
}, | ||
attendees: attendeesInfo, | ||
attendeesNames: attendeesNames, | ||
startTime: startTime, | ||
endTime: new Date(event.end?.dateTime).getTime(), | ||
attachmentFilenames, | ||
attachments: attachmentsInfo, | ||
recurrence: event.recurrence ?? [], // Contains recurrence metadata of recurring events like RRULE, etc | ||
baseUrl, | ||
joiningLink: joiningUrl, | ||
permissions: getUniqueEmails([ | ||
event.organizer?.email ?? "", | ||
...attendeesEmails, | ||
]), | ||
cancelledInstances: [], | ||
defaultStartTime: isDefaultStartTime, | ||
} | ||
await insert(eventToBeIngested, eventSchema) | ||
updateUserStats(userEmail, StatType.Events, 1) | ||
} | ||
// Add the cancelled events into cancelledInstances array of their respective main event | ||
for (const event of cancelledEvents) { | ||
// add this instance to the cancelledInstances arr of main recurring event | ||
// don't add it as seperate event | ||
const instanceEventId = event.id ?? "" | ||
const splittedId = instanceEventId?.split("_") ?? "" | ||
const mainEventId = splittedId[0] | ||
const instanceDateTime = splittedId[1] | ||
try { | ||
// Get the main event from Vespa | ||
// Add the new instanceDateTime to its cancelledInstances | ||
const eventFromVespa = await GetDocument(eventSchema, mainEventId) | ||
const oldCancelledInstances = | ||
eventFromVespa.fields.cancelledInstances ?? [] | ||
if (!oldCancelledInstances?.includes(instanceDateTime)) { | ||
// Do this only if instanceDateTime not already inside oldCancelledInstances | ||
const newCancelledInstances = [ | ||
...oldCancelledInstances, | ||
instanceDateTime, | ||
] | ||
if (eventFromVespa) { | ||
await UpdateEventCancelledInstances( | ||
eventSchema, | ||
mainEventId, | ||
newCancelledInstances, | ||
) | ||
} | ||
} | ||
} catch (error) { | ||
Logger.error( | ||
error, | ||
`Main Event ${mainEventId} not found in Vespa to update cancelled instance ${instanceDateTime} of ${instanceEventId}`, | ||
) | ||
} | ||
} | ||
if (!newSyncTokenCalendarEvents) { | ||
throw new CalendarEventsListingError({ | ||
message: "Could not get sync tokens for Google Calendar Events", | ||
integration: Apps.GoogleCalendar, | ||
entity: CalendarEntity.Event, | ||
}) | ||
} | ||
return { events, calendarEventsToken: newSyncTokenCalendarEvents } | ||
} |
🛠️ Refactor suggestion
Consider chunking or streaming large event batches to reduce memory usage.
`insertCalendarEvents()` fetches and processes potentially thousands of events up to a year in advance. Consider partial insertion or streaming patterns (especially when events over multiple years or multiple user calendars are handled) to avoid large array concatenations in memory.
const confirmedEvents = events.filter((e) => e.status === "confirmed");
-// current approach: for-of iteration on entire confirmedEvents array
+// suggested approach: process in batches to avoid storing large arrays in memory
Committable suggestion skipped: line range outside the PR's diff.
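For illustration, a generic batching helper; the batch size and the `ingestEvent` call are assumptions, not code from this PR:

// Process a large array in fixed-size batches so only one batch is in flight at a time.
async function processInBatches<T>(
  items: T[],
  batchSize: number,
  handler: (item: T) => Promise<void>,
): Promise<void> {
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize)
    await Promise.all(batch.map(handler))
  }
}

// Hypothetical usage with the confirmed-events loop:
// await processInBatches(confirmedEvents, 50, (event) => ingestEvent(event))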
Actionable comments posted: 1
🧹 Nitpick comments (6)
server/shared/config.ts (1)
1-3
: Good addition of dotenv for configuration managementThe dotenv implementation allows the serverUrl to be configurable through environment variables, enhancing flexibility. This is consistent with the pattern used across the project.
Consider using a server-specific naming convention for environment variables instead of the React-specific
REACT_APP_
prefix, as this is server-side code.server/db/schema.ts (1)
27-28
: Appropriate use of dotenv for critical environment variablesAdding dotenv here is especially important as this file handles database schema and encryption keys. The implementation follows a fail-fast approach by checking for required environment variables immediately after loading.
There's a slight redundancy in line 30 where you use the non-null assertion operator (
!
) and then explicitly check if the variable is not set in lines 31-33. Consider using a consistent approach, either with assertion or explicit checks, but not both.server/nodeServer.ts (2)
89-110
: Watch for potential redirect loops when token is invalid.If
AuthRedirect
determines the token is invalid (lines 102-109), it delegates back to/auth
, which might route control back here if not carefully handled. Ensure no cyclical redirection occurs for certain edge cases (e.g., partially expired tokens).
386-389
: Evaluate graceful shutdown strategies for uncaught exceptions.Currently, an uncaught exception logs the error but does not stop or restart the server. Consider gracefully shutting down or automatically restarting to avoid the system staying in an unknown state.
server/integrations/google/index.ts (2)
`114-116`: Bundle environment variable loading in a single place.

You import `dotenv` and call `dotenv.config()` again on lines 115-116, possibly duplicating usage from other files. Re-initializing environment variables multiple times typically has no side effects, but consider centralizing the logic to reduce confusion and the possibility of misconfigurations.

`123-128`: Abstract environment detection logic for clarity.

You rely on `typeof process !== "undefined" && !("Bun" in globalThis)` to decide between `NodeWorker` and the default `Worker`. This can be brittle if other runtimes differ in their global definitions, or if additional JS environments (e.g., Deno) are introduced. Consider a helper function that detects the runtime environment, making the code more scalable and explicit.
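A minimal sketch of such a helper; the `getRuntime` name and the "unknown" fallback are illustrative:

type Runtime = "node" | "bun" | "unknown"

// Centralize the runtime check instead of repeating the inline condition.
function getRuntime(): Runtime {
  if (typeof globalThis !== "undefined" && "Bun" in globalThis) return "bun"
  if (typeof process !== "undefined" && !!process.versions?.node) return "node"
  return "unknown"
}

// Callers branch on the result rather than re-deriving the check, e.g.:
// const worker =
//   getRuntime() === "node"
//     ? new NodeWorker(new URL("./gmail-worker.js", import.meta.url))
//     : new Worker(new URL("gmail-worker.ts", import.meta.url).href)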
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
server/ai/provider/index.ts
(1 hunks)server/config.ts
(1 hunks)server/db/schema.ts
(1 hunks)server/integrations/google/gmail-worker.ts
(1 hunks)server/integrations/google/gmail/index.ts
(1 hunks)server/integrations/google/index.ts
(4 hunks)server/nodeServer.ts
(1 hunks)server/package.json
(3 hunks)server/shared/config.ts
(1 hunks)server/utils.ts
(1 hunks)
✅ Files skipped from review due to trivial changes (3)
- server/integrations/google/gmail/index.ts
- server/integrations/google/gmail-worker.ts
- server/ai/provider/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- server/package.json
🧰 Additional context used
🧬 Code Definitions (2)
server/nodeServer.ts (1)
server/utils.ts (1)
setCookieByEnv
(49-67)
server/integrations/google/index.ts (1)
server/logger/index.ts (1)
getLogger
(24-49)
🔇 Additional comments (5)
server/utils.ts (1)
10-11
: Good addition of dotenv for environment variable managementAdding dotenv to load environment variables from a .env file is a good practice for Node.js applications. This makes the codebase more configurable and aligns with the PR objective of making server JS files run in Node.
server/config.ts (1)
2-3
: Well-implemented dotenv configurationThe addition of dotenv is particularly valuable in this configuration file, which heavily relies on environment variables for model configuration, host settings, and API credentials.
server/nodeServer.ts (3)
`72-72`: Consider validating environment variable presence before using the `!` operator.

Using `!` (non-null assertion) on `process.env.JWT_SECRET` can cause runtime errors if the variable is missing. You may opt to throw an error or provide a fallback when `process.env.JWT_SECRET` is undefined.

Would you like a shell script to search for all similar non-null uses of environment variables and flag them?

`80-83`: Reevaluate JWT-based middleware usage for OAuth callback endpoints.

You are applying `AuthMiddleware` to routes like `/oauth/callback`; typically, OAuth callbacks do not require existing credentials. If the user has no token yet, this might cause unwanted redirects. Verify that this flow doesn't block first-time sign-ins.

`312-315`: Ensure built frontend files exist at the specified paths.

Routes serving static files (e.g., `"/chat"`, `"/auth"`) reference `./dist/index.html`. Double-check that these files exist in production builds and that environment-based paths or an SSR approach won't be needed.
server/integrations/google/index.ts
Outdated
let writer;
if (typeof process !== "undefined" && !("Bun" in globalThis)) {
  // Node.js environment
  writer = fs.createWriteStream(filePath)
} else {
  const file = Bun.file(filePath)
  writer = file.writer()
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Mirror the runtime detection for file-writing logic.
The conditional creation of a `fs.createWriteStream` vs. `Bun.file` is consistent with how you differentiate worker usage. However, ensure concurrency constraints (e.g., simultaneous writes) are handled correctly. Also consider whether fallback logic is needed for other runtimes that are neither Node nor Bun.
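If it helps, a sketch of one way to wrap the two writers behind a common interface; `createWriter` is hypothetical, and Bun's `FileSink` is assumed to expose `write()` and `end()`:

import fs from "node:fs"

interface ChunkWriter {
  write(chunk: string | Uint8Array): void
  end(): void
}

// Pick the writer once; call sites only see the small ChunkWriter interface.
function createWriter(filePath: string): ChunkWriter {
  if (typeof process !== "undefined" && !("Bun" in globalThis)) {
    const stream = fs.createWriteStream(filePath)
    return { write: (c) => void stream.write(c), end: () => void stream.end() }
  }
  const sink = (globalThis as any).Bun.file(filePath).writer()
  return { write: (c) => void sink.write(c), end: () => void sink.end() }
}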
Actionable comments posted: 2
🧹 Nitpick comments (1)
server/integrations/google/index.ts (1)
`118-119`: Remove @ts-ignore and use proper typing for the html-to-text import

The `@ts-ignore` comment suggests there might be a type issue with the `htmlToText` import. Consider resolving the type issue properly instead of suppressing it.

-// @ts-ignore
-import * as htmlToText from "html-to-text"
+import { convert } from "html-to-text"

Later in the code, you can replace `htmlToText.convert()` with just `convert()`.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
server/config.ts
(1 hunks)server/db/schema.ts
(1 hunks)server/integrations/google/gmail/index.ts
(1 hunks)server/integrations/google/index.ts
(3 hunks)server/package.json
(3 hunks)server/utils.ts
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- server/config.ts
- server/utils.ts
- server/integrations/google/gmail/index.ts
- server/package.json
- server/db/schema.ts
🧰 Additional context used
🪛 GitHub Actions: TypeScript Build Check
server/integrations/google/index.ts
[error] 833-833: error TS2339: Property 'onmessage' does not exist on type 'Worker | Worker'.
🔇 Additional comments (2)
server/integrations/google/index.ts (2)
113-115
: Added Node.js compatibility with worker_threads and dotenvThe changes add support for running in a Node.js environment by importing the Node.js worker implementation and adding dotenv configuration. This aligns with the PR objective to run server JS files in Node.
1717-1725
: Mirror the runtime detection for file-writing logic

The conditional creation of a `fs.createWriteStream` vs. `Bun.file` is consistent with how you differentiate worker usage.
server/integrations/google/index.ts
Outdated
let gmailWorker
if (typeof process !== "undefined" && !("Bun" in globalThis)) {
  gmailWorker = new NodeWorker(new URL("./gmail-worker.js", import.meta.url))
} else {
  gmailWorker = new Worker(new URL("gmail-worker.ts", import.meta.url).href)
}
💡 Verification agent
🧩 Analysis chain
Runtime detection for worker creation
Good approach for conditional worker creation based on the runtime environment. However, the NodeWorker URL path points to a JS file while the fallback uses a TS file. Make sure the compilation process generates both files correctly.
Verify that the compiled JS file exists in the right location:
🏁 Script executed:
#!/bin/bash
# Check if the JS worker file exists
if [ -f server/integrations/google/gmail-worker.js ]; then
echo "✓ JS worker file exists"
else
echo "✗ JS worker file doesn't exist"
fi
# Check if the TS worker file exists
if [ -f server/integrations/google/gmail-worker.ts ]; then
echo "✓ TS worker file exists"
else
echo "✗ TS worker file doesn't exist"
fi
Length of output: 242
JS Worker File Missing for Node Environment
- The runtime detection logic is correctly distinguishing between environments.
- However, the compiled JS version (
gmail-worker.js
) is missing in the expected location (server/integrations/google
), while the TS file exists. - Please review your build process to ensure that
gmail-worker.ts
is compiled to generategmail-worker.js
for non-Bun environments or adjust the instantiation accordingly.
Actionable comments posted: 0
🧹 Nitpick comments (4)
server/nodeServer.ts (4)
90-111
: Potential for XSRF concerns with cookie-based auth.
Redirect-based checks are straightforward, but consider implementing CSRF or state-based tokens (especially on form submissions) to further mitigate cross-site request forgery.
170-193
: Revisit the 2-month JWT expiry.
The token expiration is set to 60 days, which might be too long for higher-security contexts. Consider an environment-based or user-configurable token lifespan.Example shorter expiry (e.g. 24 hours):
- exp: Math.floor(Date.now() / 1000) + 60 * 60 * 24 * 60 + exp: Math.floor(Date.now() / 1000) + 60 * 60 * 24
310-335
: Potential route consolidation for static files.
The code individually registers routes for "/", "/chat", "/auth", etc., all serving the sameindex.html
. You could consolidate these definitions to reduce repetition if the requirement to redirect or apply distinct middleware is minimal.
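A minimal sketch of that consolidation, assuming the Hono app and `serveStatic` already used in this file; the exact path list and which routes stay unauthenticated are assumptions:

import { Hono } from "hono"
import type { MiddlewareHandler } from "hono"
import { serveStatic } from "@hono/node-server/serve-static"

const app = new Hono()
// Stand-in for the JWT middleware defined elsewhere in this file.
const AuthMiddleware: MiddlewareHandler = async (_c, next) => { await next() }

const serveIndex = serveStatic({ path: "./dist/index.html" })
const protectedSpaPaths = ["/", "/chat", "/integrations", "/oauth/success"]

for (const p of protectedSpaPaths) {
  app.get(p, AuthMiddleware, serveIndex)
}
// /auth stays unprotected so first-time sign-in remains reachable.
app.get("/auth", serveIndex)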
386-389
: Finalize graceful shutdown plan.
TheuncaughtException
handler logs errors but does not attempt a graceful shutdown. Consider adding a cleanup routine or SIGTERM listener.+ process.on("SIGTERM", async () => { + Logger.warn("Received SIGTERM, starting graceful shutdown...") + server.close(() => { + Logger.info("Server closed. Exiting...") + process.exit(0) + }) + })
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
server/integrations/slack/index.ts
(2 hunks)server/nodeServer.ts
(1 hunks)server/package.json
(4 hunks)
✅ Files skipped from review due to trivial changes (1)
- server/integrations/slack/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- server/package.json
🧰 Additional context used
🧬 Code Definitions (1)
server/nodeServer.ts (1)
server/utils.ts (1)
setCookieByEnv
(49-67)
🔇 Additional comments (5)
server/nodeServer.ts (5)
1-14
: Imports look good overall.
The module imports appear to be organized and relevant. No immediate issues or duplicates are detected.
67-74
: Ensure critical environment variables are defined.
The code relies onprocess.env.GOOGLE_CLIENT_ID
,process.env.GOOGLE_CLIENT_SECRET
, andprocess.env.JWT_SECRET
with non-null assertions, which will throw an error at runtime if the environment variables are missing.Please consider verifying these variables on startup or providing fallback/error messaging. For instance:
if (!process.env.GOOGLE_CLIENT_ID || !process.env.GOOGLE_CLIENT_SECRET || !process.env.JWT_SECRET) { throw new Error("Missing required environment variables.") }
80-83
: Cookie-based JWT approach looks consistent.
Usingcookie: CookieName
in thejwt
middleware effectively checks for tokens. Ensure secure cookie settings are enforced in production (e.g.,secure
,httpOnly
) at a global level if not already handled.
204-309
: Verify domain logic & user creation flow.
The code gracefully handles existing users and workspaces, then creates new ones if not found. This seems correct, but ensure you have test coverage for edge cases (e.g., invalid or non-corporate email domains, partial user data).To confirm correctness, you could run a specialized test script or unit/integration tests that simulate various login scenarios with different hosted domains and user states.
361-382
: Re-examine WebSocket connection logic.
When multiple clients share the same connector ID, thewsConnections.set(connectorId, ws)
call will overwrite existing WebSocket references. Confirm whether this is intentional or if you need to handle multiple simultaneous connections per connector.
Actionable comments posted: 0
🧹 Nitpick comments (4)
server/integrations/google/gmail-worker.ts (2)
`31-32`: Remove or justify the `//@ts-ignore` usage.

Currently, `//@ts-ignore` bypasses type checking for the `htmlToText` import. Consider installing or configuring type definitions so you can safely remove the ignore comment and ensure type-safety throughout the file.
88-91
: Confirm completeness of error handling for Node.js worker.You're logging the worker's errors, which is a good start. Consider whether these errors need to bubble up or trigger any specific fallback or cleanup actions. If not, this is fine as is.
server/integrations/google/index.ts (2)
`118-119`: Avoid `//@ts-ignore` if possible.

Similar to `gmail-worker.ts`, installing proper type definitions or refactoring the import for `"html-to-text"` would make the code more robust.

`1746-1753`: Cross-environment file writer logic is well-defined.

Switching to `fs.createWriteStream` in Node or `Bun.file(...).writer()` otherwise facilitates a flexible approach. Confirm large file writes and concurrency are tested in both environments to avoid partial writes or performance bottlenecks.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
server/integrations/google/gmail-worker.ts
(3 hunks)server/integrations/google/index.ts
(3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
server/integrations/google/index.ts (1)
server/logger/index.ts (1)
getLogger
(24-49)
🔇 Additional comments (6)
server/integrations/google/gmail-worker.ts (3)
`44-44`: Node.js Worker import looks good.

The import from `"worker_threads"` is necessary for Node.js execution. Just confirm that the rest of this file is not bundled for non-Node runtimes, to avoid runtime exceptions.

`69-87`: Validate `parentPort` usage in Node environment.

Using `parentPort` inside the `if (typeof process !== "undefined" && !("Bun" in globalThis))` check is correct for a Node-based worker. Ensure no messages are lost if `parentPort` is unexpectedly null or undefined. Additionally, confirm that the rest of the code handles concurrency or message queuing as needed for large-scale message processing.
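A small defensive sketch for that case; on the main thread `parentPort` is `null`, so guarding before attaching listeners avoids a silent no-op or a crash:

import { parentPort } from "node:worker_threads"

if (parentPort) {
  parentPort.on("message", (msg) => {
    // handle the incoming task here
  })
} else {
  // Loaded outside a worker thread; nothing will ever arrive on parentPort.
  console.warn("gmail-worker module loaded on the main thread")
}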
92-116
: Fallback toself.onmessage
in non-Node environments looks correct.The browser-like logic with
self.onmessage
andself.onerror
is consistent. Just ensure that no race conditions occur when switching between environment checks. If you ever expand to additional runtimes (e.g., Deno), revisit this guard.server/integrations/google/index.ts (3)
113-115
: New imports for Node worker and environment config appear correct.Using
Worker as NodeWorker
from"worker_threads"
and loading.env
is standard for Node environments. Ensure.env
usage is idempotent if this file is imported multiple times.
122-139
: Conditional worker logic for Node vs. Bun vs. browser is well-structured.This ensures that a
.js
worker is used in Node and a.ts
worker is used otherwise. Verify that the build process reliably generatesgmail-worker.js
in the same folder, preventing file-not-found issues in production.
839-876
: Worker event handling is consistent withgmail-worker.ts
.Centralizing the logic in
setupGmailWorkerHandler
is good design. Confirm concurrency or parallel usage is handled gracefully (e.g., if multiple trackers attach simultaneously). Otherwise, this looks healthy.
Actionable comments posted: 2
🧹 Nitpick comments (7)
server/nodeServer.ts (7)
1-69
: Imports look well-structured, but consider organizing them better.The imports are comprehensive and include all necessary dependencies for the Node server implementation. However, consider grouping related imports together (e.g., all Hono-related imports, all authentication-related imports) to improve readability.
68-69
: Consider moving dotenv configuration to the beginning of the file.For environment variables to be properly loaded before any code uses them, the dotenv configuration should ideally be at the very top of the file, before any imports that might use environment variables.
+import dotenv from "dotenv" +dotenv.config() import { type Context, Hono, type Next } from "hono" import { AnswerApi, AutocompleteApi, // ... other imports } -import dotenv from "dotenv" -dotenv.config()
210-210
: Consider reducing JWT token expiration time.The current token expiration is set to 60 days, which is quite long from a security perspective. Consider reducing this to a more reasonable timeframe (e.g., 1-7 days) and implementing a refresh token mechanism.
- exp: Math.floor(Date.now() / 1000) + 60 * 60 * 24 * 60, // Token expires in 2 months + exp: Math.floor(Date.now() / 1000) + 60 * 60 * 24 * 7, // Token expires in 7 days
`248-249`: Remove `@ts-ignore` and properly type the user object.

Using `@ts-ignore` suppresses TypeScript errors which might hide potential issues. Instead, properly type the user object to handle the `hd` property.

- // @ts-ignore
- let domain = user.hd
+ // Define a more complete type for the Google user object
+ let domain = (user as { hd?: string }).hd
357-362
: Handle initialization errors more gracefully.The current initialization immediately throws an error on failure, which would crash the application. Consider implementing a more graceful error handling approach.
export const init = async () => { await initQueue() } -init().catch((error) => { - throw new InitialisationError({ cause: error }) -}) +init().catch((error) => { + Logger.error( + error, + "Failed to initialize the application. Attempting to continue with limited functionality." + ) + // Consider implementing fallback behavior or graceful degradation +})
407-410
: Implement graceful shutdown for uncaught exceptions.The current error handler for uncaught exceptions only logs the error. Consider implementing a graceful shutdown mechanism to prevent the application from continuing in an inconsistent state.
process.on("uncaughtException", (error) => { Logger.error(error, "uncaughtException") - // shutdown server? + // Attempt to close resources gracefully + Logger.info("Initiating graceful shutdown due to uncaught exception") + + // Close the WebSocket server + wss.close(() => { + Logger.info("WebSocket server closed") + + // Close the HTTP server + server.close(() => { + Logger.info("HTTP server closed") + + // Exit the process with an error code + process.exit(1) + }) + }) + + // Force exit if graceful shutdown takes too long + setTimeout(() => { + Logger.error("Forced shutdown due to timeout") + process.exit(2) + }, 10000) // 10 seconds timeout })
1-411
: Consider breaking down this large file into smaller modules.This file handles multiple responsibilities including routing, authentication, WebSocket handling, and server initialization. Consider splitting it into smaller, more focused modules to improve maintainability.
Potential modules:
auth.ts
- Authentication middleware and logicroutes.ts
- API routes configurationwebsocket.ts
- WebSocket server setup and handlersserver.ts
- Main server initialization and configuration
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.gitignore
(1 hunks)server/nodeServer.ts
(1 hunks)server/package.json
(4 hunks)server/tsconfig.json
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- server/tsconfig.json
- server/package.json
🧰 Additional context used
🧬 Code Definitions (1)
server/nodeServer.ts (1)
server/utils.ts (1)
setCookieByEnv
(49-67)
🔇 Additional comments (1)
.gitignore (1)
`122-122`: Appropriate addition for server build output directory.

The addition of `distServer` to .gitignore is consistent with the project structure and aligns with the PR objective to run server JS files in Node. This directory will contain the compiled server-side TypeScript files.
Actionable comments posted: 1
🧹 Nitpick comments (4)
server/server.ts (4)
67-73
: Consider loading environment variables earlier.
Calling `dotenv.config()` near the top of the file is a common convention to ensure environment variables are available before usage.
345-345
: Handle missing file scenario.
If./dist/index.html
doesn’t exist at runtime, serving will fail silently. Consider adding a fallback or error handling for production robustness.
348-352
: Static serving repeated across routes.
You may reduce duplication by consolidating routes that serve the sameindex.html
file under a single pattern-based handler.
447-480
: Unify Bun vs Node WSS logic and consider authentication.
Both blocks duplicate WebSocket setup code. Optionally consolidate into a shared utility and add authentication checks for consistent security, matching the rest of the app’s Auth approach.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
server/server.ts
(3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
server/server.ts (1)
server/integrations/metricStream.ts (1)
wsConnections
(3-3)
🔇 Additional comments (9)
server/server.ts (9)
65-66
: No issues with importingStatusCode
.
It is used in the typed WebSocket route definitions below, so it’s valid to keep.
336-342
: Check support for top-level await.
This dynamic import approach is valid if your runtime supports top-level await. Otherwise, consider refactoring to an async initialization function or ensuring an ESM environment.
347-347
: Routing looks consistent.
No authentication is required on/auth
as expected.
356-356
: OK to serve index on chat route.
No evident issues.
361-362
: No concerns here.
Auth is applied consistently on/integrations
.
366-367
: Matches the existing pattern.
Nothing unusual.
368-368
: Serving/oauth/success
is fine.
This route looks correct.
371-371
: Asset serving is straightforward.
Ensure the./dist
folder is properly populated during build.
`380-403`: Verify usage of `WsApp` in Node environment.
`WsApp` is only assigned in the Bun branch. If this is intended, confirm that no references to `WsApp` exist for Node. Otherwise, consider a fallback assignment for Node.
Actionable comments posted: 2
♻️ Duplicate comments (1)
server/server.ts (1)
391-433
:⚠️ Potential issueWebSocket connections in Node lack authentication.
This open
/ws
endpoint may be a security concern if your application expects authenticated clients. Implement authentication (e.g., checking tokens) before upgrading to websockets.Consider adding authentication middleware similar to what's used for your HTTP routes:
server.on("upgrade", (req, socket, head) => { const { pathname } = parse(req.url, true) if (pathname === "/ws") { + // Extract and verify auth token from request + const cookies = req.headers.cookie?.split(';').reduce((acc, cookie) => { + const [key, value] = cookie.trim().split('='); + acc[key] = value; + return acc; + }, {}); + + const authToken = cookies?.[CookieName]; + + if (!authToken) { + socket.write('HTTP/1.1 401 Unauthorized\r\n\r\n'); + socket.destroy(); + return; + } + + try { + // Verify JWT token + const decoded = jwt.verify(authToken, jwtSecret); + // Store user info for later use + req.user = decoded; // If the request is for the /ws endpoint, upgrade the connection. wss.handleUpgrade(req, socket, head, (ws) => { wss.emit("connection", ws, req) }) + } catch (err) { + socket.write('HTTP/1.1 401 Unauthorized\r\n\r\n'); + socket.destroy(); + } } else { // Otherwise, destroy the socket. socket.destroy() } })
🧹 Nitpick comments (1)
server/server.ts (1)
367-389
: Simplify the WebSocket type definition.The type definition for
WsApp
is overly complex for its usage and contains duplicate types. Consider simplifying it for better maintainability.-export let WsApp: Hono< - { - Variables: Variables - }, - { - "/ws": { - $get: - | { - input: {} - output: {} - outputFormat: "ws" - status: StatusCode - } - | { - input: {} - output: {} - outputFormat: "ws" - status: StatusCode - } - } - }, - "/" -> +export let WsApp: Hono< + { + Variables: Variables + }, + { + "/ws": { + $get: { + input: {} + output: {} + outputFormat: "ws" + status: StatusCode + } + } + }, + "/" +>
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
server/server.ts
(3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
server/server.ts (1)
server/integrations/metricStream.ts (1)
wsConnections
(3-3)
🔇 Additional comments (3)
server/server.ts (3)
65-73
: Good job adding Node.js environment support.The imports for Node.js-specific packages and the dotenv configuration look good. This change effectively enables the server to run in a Node.js environment in addition to Bun.
336-345
: Well-structured conditional static file serving.The approach to conditionally load the appropriate static file middleware based on the runtime environment is good. The different path structures (
../
prefix in Node.js) properly account for the differences in how working directories are handled in the two environments.
338-339
:❓ Verification inconclusive
Verify Node.js static file paths.
Please confirm that the static file paths for Node.js are correct relative to the server's execution context. The paths include
../
which assumes the server runs from a subdirectory.
🏁 Script executed:
#!/bin/bash # Check the execution directory and relative paths # Print the current working directory echo "Current working directory: $(pwd)" # List directories to verify structure echo "Directory structure:" ls -la # Check if these paths exist echo "Checking for ../dist/index.html:" test -f "../dist/index.html" && echo "File exists" || echo "File does not exist" echo "Checking for ./dist/index.html:" test -f "./dist/index.html" && echo "File exists" || echo "File does not exist"Length of output: 1956
Action: Verify Node.js Static Asset Paths in Production
- The execution test was run from the repository root (/home/jailuser/git), which shows that neither
../dist/index.html
nor./dist/index.html
exists in that context.- If the server is intended to run from the
server
directory, the relative path../dist/index.html
would resolve from that subdirectory (assuming adist
folder exists at the repository root). Please confirm that the server’s working directory is correctly set when it is launched.- If the server runs from the repository root or another location, the provided paths will not locate the static asset. Adjust the paths accordingly or change the working directory at runtime so that
nodeServeStatic({ root: "../", path: "./dist/index.html" })
correctly resolves to the built static file.
WsApp = app.get(
  "/ws",
  upgradeWebSocket((c) => {
    let connectorId: string | undefined
    return {
      onOpen(event, ws) {
        connectorId = c.req.query("id")
        Logger.info(`Websocket connection with id ${connectorId}`)
        wsConnections.set(connectorId, ws)
      },
      onMessage(event, ws) {
        Logger.info(`Message from client: ${event.data}`)
        ws.send(JSON.stringify({ message: "Hello from server!" }))
      },
      onClose: (event, ws) => {
        Logger.info("Connection closed")
        if (connectorId) {
          wsConnections.delete(connectorId)
        }
      },
    }
  }),
)
🛠️ Refactor suggestion
Add error handling for Bun WebSocket connections.
Similar to the Node.js implementation, the Bun WebSocket implementation should also include error handling.
WsApp = app.get(
"/ws",
upgradeWebSocket((c) => {
let connectorId: string | undefined
return {
onOpen(event, ws) {
connectorId = c.req.query("id")
Logger.info(`Websocket connection with id ${connectorId}`)
wsConnections.set(connectorId, ws)
},
onMessage(event, ws) {
Logger.info(`Message from client: ${event.data}`)
ws.send(JSON.stringify({ message: "Hello from server!" }))
},
onClose: (event, ws) => {
Logger.info("Connection closed")
if (connectorId) {
wsConnections.delete(connectorId)
}
},
+ onError: (event, ws) => {
+ Logger.error(`WebSocket error: ${event.message}`)
+ if (connectorId) {
+ wsConnections.delete(connectorId)
+ }
+ },
}
}),
)
Actionable comments posted: 1
♻️ Duplicate comments (3)
server/server.ts (2)
391-433
:⚠️ Potential issueAdd error handling for WebSocket connections.
The Node.js WebSocket implementation doesn't include error handling for the socket connections. This could lead to unhandled exceptions and potential server crashes.
wss.on("connection", (ws, req) => { const { query } = parse(req.url!, true) const connectorId = query.id Logger.info(`WebSocket connection with id ${connectorId}`) if (connectorId) { wsConnections.set(connectorId, ws) } ws.on("message", (message) => { Logger.info(`Message from client: ${message}`) ws.send(JSON.stringify({ message: "Hello from server!" })) }) ws.on("close", () => { Logger.info("Connection closed") if (connectorId) { wsConnections.delete(connectorId) } }) + + ws.on("error", (error) => { + Logger.error(error, "WebSocket connection error") + if (connectorId) { + wsConnections.delete(connectorId) + } + }) })
444-466
:⚠️ Potential issueAdd error handling for Bun WebSocket connections.
Similar to the Node.js implementation, the Bun WebSocket implementation should also include error handling.
WsApp = app.get( "/ws", upgradeWebSocket((c) => { let connectorId: string | undefined return { onOpen(event, ws) { connectorId = c.req.query("id") Logger.info(`Websocket connection with id ${connectorId}`) wsConnections.set(connectorId, ws) }, onMessage(event, ws) { Logger.info(`Message from client: ${event.data}`) ws.send(JSON.stringify({ message: "Hello from server!" })) }, onClose: (event, ws) => { Logger.info("Connection closed") if (connectorId) { wsConnections.delete(connectorId) } }, + onError: (event, ws) => { + Logger.error(`WebSocket error: ${event.message}`) + if (connectorId) { + wsConnections.delete(connectorId) + } + }, } }), )server/integrations/google/index.ts (1)
`1749-1756`: Smart conditional file writing implementation.

The code now handles file writing operations in both Node.js and Bun environments, using `fs.createWriteStream` for Node.js and `Bun.file(filePath).writer()` for Bun.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- server/ai/provider/index.ts (1 hunks)
- server/config.ts (1 hunks)
- server/integrations/google/index.ts (3 hunks)
- server/package.json (4 hunks)
- server/server.ts (3 hunks)
- server/utils.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- server/config.ts
- server/utils.ts
- server/ai/provider/index.ts
- server/package.json
🧰 Additional context used
🧬 Code Definitions (2)
server/integrations/google/index.ts (1)
- server/logger/index.ts (1): getLogger (24-49)
server/server.ts (1)
- server/integrations/metricStream.ts (1): wsConnections (3-3)
🔇 Additional comments (5)
server/server.ts (3)
67-73: Good addition of Node.js server dependencies.
The PR adds the necessary imports to support running the server in a Node.js environment, including the Node server implementation, WebSocket support, and URL parsing utilities.
336-345: Good implementation of conditional static file serving.
The code intelligently selects the appropriate static file serving middleware based on the runtime environment. Note the path differences between the Node.js (../dist) and Bun (./dist) environments, which accommodate the different working directory contexts in each runtime.
367-389: Type declaration updated correctly for WsApp.
The change from const to let for WsApp makes sense since its value is conditionally assigned later in the code based on the runtime environment.
server/integrations/google/index.ts (2)
114-116: Good addition of Node.js worker and environment setup.
Added the necessary imports to support running the application in Node.js, using the native worker_threads module and configuring environment variables with dotenv.
842-879: Good implementation of cross-runtime worker message handling.
The function setupGmailWorkerHandler is well-implemented to handle worker messages in both Node.js and Bun environments, correctly using the appropriate event handling patterns for each runtime.
server/integrations/google/index.ts (Outdated)
let gmailWorker
if (typeof process !== "undefined" && !("Bun" in globalThis)) {
  gmailWorker = new NodeWorker(new URL("gmail-worker.js", import.meta.url))
  Logger.info("Gmail worker initialized")
  gmailWorker.on("error", (error) => {
    Logger.error(
      error,
      `[NODE] Error in main thread: worker: ${JSON.stringify(error)}`,
    )
  })
} else {
  gmailWorker = new Worker(new URL("gmail-worker.ts", import.meta.url).href)
  Logger.info("Gmail worker initialized")
  gmailWorker.onerror = (error: ErrorEvent) => {
    Logger.error(
      error,
      `Error in main thread: worker: ${JSON.stringify(error)}`,
    )
  }
}
💡 Verification agent
🧩 Analysis chain
Well-structured conditional worker initialization.
The code now correctly detects the runtime environment and initializes the Gmail worker accordingly, using the appropriate Worker implementation and file type (JS for Node.js, TS for Bun).
Run the following script to verify that the compiled JS worker file exists for the Node environment:
🏁 Script executed:
#!/bin/bash
# Check if the JS worker file exists
if [ -f server/integrations/google/gmail-worker.js ]; then
echo "✓ JS worker file exists"
else
echo "✗ JS worker file doesn't exist - make sure build process generates this file"
fi
# Check if the TS worker file exists
if [ -f server/integrations/google/gmail-worker.ts ]; then
echo "✓ TS worker file exists"
else
echo "✗ TS worker file doesn't exist"
fi
Length of output: 334
Attention: Missing Compiled JS Worker File for Node Environment
The worker initialization logic is well structured and correctly differentiates between the Node and Bun environments. However, our verification shows that the compiled JavaScript worker file (server/integrations/google/gmail-worker.js) is missing. This file is required for execution in the Node environment. Please ensure that your build process generates the gmail-worker.js file, or adjust the worker initialization to point to the correct file if the expected output is different.
- Action Required: Update the build process to compile and emit gmail-worker.js for Node.js, or confirm that the intended file is correctly referenced.
Actionable comments posted: 3
♻️ Duplicate comments (1)
server/server.ts (1)
445-468: Add onError handler to the Bun WebSocket.
This is a duplicate of a previous suggestion regarding error handling for the Bun WebSocket.
server/search/types.ts (1)
57-60
: Add a brief doc comment for the Env enum.
This helps other developers quickly understand the usage context of each environment type.docs/README.md (1)
docs/README.md (1)
7-21: Streamline instructions to address redundancy.
Replace "Inside of the `server` folder" with "Inside the `server` folder" to align with the style suggestion from the static analysis tool and remove redundant wording.
- - Inside of the `server` folder, instead of running `bun run dev`
+ - Inside the `server` folder, instead of running `bun run dev`
🧰 Tools
🪛 LanguageTool
[style] ~11-~11: This phrase is redundant. Consider using "Inside".
Context: ...cept the step of starting the server. - Inside of the `server` folder, instead of running... (OUTSIDE_OF)
server/server.ts (1)
67-73: Consider using the WHATWG URL API.
Node's url.parse is deprecated in favor of the URL constructor. Switching to new URL(req.url, ...) can provide better reliability.
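A small self-contained sketch of the suggested switch; the sample URL and the throwaway base origin below are illustrative only:

// The upgrade handler receives a bare path plus query string, for example:
const rawUrl = "/ws?id=connector-123"

// Deprecated style: const { pathname, query } = parse(rawUrl, true)

// WHATWG style: a base origin is required to parse a bare path and is otherwise ignored.
const url = new URL(rawUrl, "http://localhost")
const pathname = url.pathname                   // "/ws"
const connectorId = url.searchParams.get("id")  // "connector-123"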
server/integrations/google/gmail-worker.ts (1)
33-33: Remove ts-ignore if possible.
Consider installing type definitions for "html-to-text" or adjusting the import to avoid the //@ts-ignore directive and ensure type-safety.
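If no suitable type package is available, a local ambient declaration is one low-effort way to drop the //@ts-ignore; the file name and the single convert signature below are a sketch, not the library's full typings:

// html-to-text.d.ts -- minimal ambient declaration so the import type-checks.
declare module "html-to-text" {
  export function convert(html: string, options?: Record<string, unknown>): string
}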
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- docs/README.md (1 hunks)
- server/ai/provider/index.ts (1 hunks)
- server/integrations/google/gmail-worker.ts (4 hunks)
- server/integrations/google/index.ts (5 hunks)
- server/package.json (3 hunks)
- server/search/types.ts (1 hunks)
- server/server.ts (4 hunks)
- server/utils.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- server/ai/provider/index.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- server/package.json
- server/utils.ts
🧰 Additional context used
🧬 Code Graph Analysis (2)
server/server.ts (2)
- server/utils.ts (1): getEnvironment (12-18)
- server/integrations/metricStream.ts (1): wsConnections (3-3)
server/integrations/google/index.ts (2)
- server/utils.ts (1): getEnvironment (12-18)
- server/logger/index.ts (1): getLogger (24-49)
🪛 LanguageTool
docs/README.md
[style] ~11-~11: This phrase is redundant. Consider using “Inside”.
Context: ...cept the step of starting the server. - Inside of the `server` folder, instead of running... (OUTSIDE_OF)
🔇 Additional comments (18)
server/server.ts (11)
49-49: LGTM for new import.
No issues spotted. The reference to getEnvironment and setCookieByEnv clarifies the environment usage.
65-66: LGTM for new StatusCode import.
No concerns regarding the addition of StatusCode.
74-74: LGTM on environment detection.
This line clearly conveys the environment is determined at runtime.
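The helper itself (server/utils.ts, lines 12-18) is not shown in this diff; a plausible shape, inferred from the Bun-global check used elsewhere in the PR, would be:

import { Env } from "./search/types"

// Bun defines a global `Bun` object; a plain Node process does not.
export const getEnvironment = (): Env =>
  typeof process !== "undefined" && !("Bun" in globalThis) ? Env.Node : Env.Bun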
337-347: Verify the static file paths.
Ensuring that the relative paths ../ and ./dist remain correct after build is crucial. If your Node environment runs from a different working directory, these might fail.
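For context, a minimal sketch of how such runtime-dependent roots are typically wired up with Hono's two serveStatic middlewares; the route pattern and root values are illustrative, not the PR's exact code:

import { Hono } from "hono"
import { serveStatic as nodeServeStatic } from "@hono/node-server/serve-static"
import { serveStatic as bunServeStatic } from "hono/bun"

const app = new Hono()
const isNode = typeof process !== "undefined" && !("Bun" in globalThis)

// Node runs the compiled output, so the frontend build is resolved against a different
// working directory than when Bun runs straight from the source tree.
app.use(
  "/assets/*",
  isNode ? nodeServeStatic({ root: "../dist" }) : bunServeStatic({ root: "./dist" }),
)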
349-349: Confirm whether authentication is desired for the root path.
This makes the landing page accessible only to authorized users. If this is intentional, proceed; otherwise, consider removing the AuthRedirect.
351-351: LGTM on allowing open access to the auth route.
This approach ensures that the login page remains accessible without requiring prior authentication.
353-353: LGTM on protecting the /search route.
Users must be authenticated before searching, which is appropriate for restricting private data access.
355-355: LGTM on the parameterized chat route.
This ensures that any chat detail path remains protected via authentication.
356-356: LGTM on requiring authentication for admin integrations.
Restricting admin routes is important for security.
359-359: LGTM on unprotected static assets.
Serving static assets without authentication is standard practice.
368-390: Confirm the need for a mutable export.
Exporting as let is correct if you're reassigning WsApp in the Bun environment block; otherwise, prefer const.
server/integrations/google/gmail-worker.ts (4)
10-10: LGTM for importing Env.
No issues spotted.
31-31: LGTM for unified environment logic import.
Centralizing environment-related logic is a sound approach.
45-45: LGTM for the worker_threads import.
Enabling Node-specific features for the worker aligns with the environment detection logic.
70-119: LGTM on environment-based message handling.
Separate flows for Node.js (parentPort) and the default self.onmessage ensure broader portability of the Gmail worker.
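For context, the worker-side half of that split typically looks like the sketch below; handleMessage and the message shapes are placeholders rather than the file's real logic:

import { parentPort } from "node:worker_threads"

const isNode = typeof process !== "undefined" && !("Bun" in globalThis)

// Placeholder for the real per-message Gmail ingestion work.
const handleMessage = async (msg: { jobId: string }) => ({ jobId: msg.jobId, status: "done" })

const port = parentPort
if (isNode && port) {
  // Node: messages arrive as plain values on parentPort.
  port.on("message", async (msg) => {
    port.postMessage(await handleMessage(msg))
  })
} else {
  // Bun (web-worker style): messages arrive wrapped in a MessageEvent on self.
  self.onmessage = async (event: MessageEvent) => {
    postMessage(await handleMessage(event.data))
  }
}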
server/integrations/google/index.ts (3)
29-29: Environment detection imports and setup look good.
The imports for environment detection (getEnvironment, Env) and Node worker support (Worker as NodeWorker) have been properly added. The ES module syntax for the HTML-to-text library is also appropriate for better compatibility.
Also applies to: 72-72, 115-117, 120-121
750-757: Environment-aware file writer creation.
The conditional creation of a writer based on the runtime environment is consistent with how the worker is handled, using fs.createWriteStream for Node and Bun.file for Bun.
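A condensed sketch of that conditional writer, assuming a small wrapper interface; the helper name and file name are illustrative, and the real code in index.ts may be shaped differently:

import fs from "node:fs"

// Wrap Node's WriteStream and Bun's FileSink behind one tiny write/close interface.
const makeWriter = (filePath: string, isNode: boolean) => {
  if (isNode) {
    const stream = fs.createWriteStream(filePath)
    return { write: (chunk: string) => stream.write(chunk), close: () => stream.end() }
  }
  const sink = Bun.file(filePath).writer()
  return { write: (chunk: string) => sink.write(chunk), close: () => sink.end() }
}

// Usage (illustrative file name): const writer = makeWriter("output.ndjson", environment === Env.Node)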
844-879: Robust event handler setup for cross-runtime workers.
The worker message handling is properly set up for both environments, using the appropriate event binding methods (on("message") for Node vs onmessage for Bun), and correctly handles type casting.
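As a rough illustration of that main-thread side, a minimal cross-runtime handler sketch follows; the actual setupGmailWorkerHandler in index.ts is not reproduced here, and the message shape and function name are assumptions:

import { Worker as NodeWorker } from "node:worker_threads"

// Hypothetical message shape, used only for this illustration.
type WorkerMessage = { userEmail: string; statsUpdate?: Record<string, number> }

function attachWorkerHandler(
  worker: NodeWorker | Worker,
  onMessage: (msg: WorkerMessage) => void,
) {
  if (worker instanceof NodeWorker) {
    // Node: worker_threads delivers the posted value directly on the "message" event.
    worker.on("message", (data) => onMessage(data as WorkerMessage))
  } else {
    // Bun (web-worker style): the posted value arrives wrapped in a MessageEvent.
    worker.onmessage = (event: MessageEvent) => onMessage(event.data as WorkerMessage)
  }
}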
if (environment === Env.Node) {
  const server = serve({ fetch: app.fetch, port: config.port })

  const wss = new WebSocketServer({ noServer: true })

  server.on("upgrade", (req, socket, head) => {
    const { pathname } = parse(req.url, true)

    if (pathname === "/ws") {
      // If the request is for the /ws endpoint, upgrade the connection.
      wss.handleUpgrade(req, socket, head, (ws) => {
        wss.emit("connection", ws, req)
      })
    } else {
      // Otherwise, destroy the socket.
      socket.destroy()
    }
  })

  wss.on("connection", (ws, req) => {
    const { query } = parse(req.url!, true)
    const connectorId = query.id

    Logger.info(`WebSocket connection with id ${connectorId}`)

    if (connectorId) {
      wsConnections.set(connectorId, ws)
    }

    ws.on("message", (message) => {
      Logger.info(`Message from client: ${message}`)
      ws.send(JSON.stringify({ message: "Hello from server!" }))
    })

    ws.on("close", () => {
      Logger.info("Connection closed")
      if (connectorId) {
        wsConnections.delete(connectorId)
      }
    })
  })
  Logger.info(`[NODE] listening on port: ${config.port}`)
} else {
Add error handling for Node.js WebSocket connections.
The Node.js WebSocket implementation doesn't include an error event listener. This could lead to unhandled exceptions and potential server crashes during edge cases.
const { createBunWebSocket } = await import("hono/bun")
const { upgradeWebSocket, websocket } = createBunWebSocket<ServerWebSocket>()
const server = Bun.serve({
  fetch: app.fetch,
  port: config.port,
  websocket,
  idleTimeout: 180,
})
Logger.info(`[BUN] listening on port: ${config.port}`)
🛠️ Refactor suggestion
Consider adding an onError handler for the Bun WebSocket.
Similar to the Node environment, the Bun WebSocket might need explicit error handling for stable operation.
let gmailWorker
if (environment === Env.Node) {
  gmailWorker = new NodeWorker(new URL("gmail-worker.js", import.meta.url))
  Logger.info("Gmail worker initialized")
  gmailWorker.on("error", (error) => {
    Logger.error(
      error,
      `[NODE] Error in main thread: worker: ${JSON.stringify(error)}`,
    )
  })
} else {
  gmailWorker = new Worker(new URL("gmail-worker.ts", import.meta.url).href)
  Logger.info("Gmail worker initialized")
  gmailWorker.onerror = (error: ErrorEvent) => {
    Logger.error(
      error,
      `Error in main thread: worker: ${JSON.stringify(error)}`,
    )
  }
}
💡 Verification agent
🧩 Analysis chain
Well-structured conditional worker initialization.
The code now correctly detects the runtime environment and initializes the Gmail worker accordingly, using the appropriate Worker implementation and file type (JS for Node.js, TS for Bun).
Run the following script to verify that the compiled JS worker file exists for the Node environment:
🏁 Script executed:
#!/bin/bash
# Check if the JS worker file exists
if [ -f server/integrations/google/gmail-worker.js ]; then
echo "✓ JS worker file exists"
else
echo "✗ JS worker file doesn't exist - make sure build process generates this file"
fi
# Check if the TS worker file exists
if [ -f server/integrations/google/gmail-worker.ts ]; then
echo "✓ TS worker file exists"
else
echo "✗ TS worker file doesn't exist"
fi
Length of output: 334
Attention: Missing Compiled JS Worker File in Node Environment
- The conditional initialization of the Gmail worker is well-structured, distinguishing between the Node and non-Node environments.
- However, the verification script indicates that the required gmail-worker.js file (used in the Node environment via NodeWorker) does not exist in server/integrations/google/.
- The TypeScript worker file (gmail-worker.ts) is present, meaning the build process is not generating the corresponding compiled JS file for Node.
- Action Required: Please ensure that the build process generates the gmail-worker.js file, or adjust the Node worker initialization to load the appropriate file if this was an oversight.
Summary by CodeRabbit
New Features
Refactor
Documentation