feat(server): add support for running server in Node #365

Open · wants to merge 11 commits into base: main

Conversation

Contributor

@kalpadhwaryu commented Apr 1, 2025

Summary by CodeRabbit

  • New Features

    • Streamlined build and runtime processes now support flexible execution environments.
    • Enhanced worker messaging and error handling improve cross-environment reliability.
  • Refactor

    • Optimized WebSocket connections and static file serving for smoother performance.
    • Modernized module import practices for improved consistency.
  • Documentation

    • Updated guidelines now include instructions for running the server using Node.


coderabbitai bot commented Apr 1, 2025

Walkthrough

The changes update the server’s build and runtime environment by modifying dependencies, compiler options, and scripts. TypeScript configuration now outputs to a dedicated folder while certain options are disabled. Various modules have been refactored to adopt ES module syntax and environment-aware logic. New utility functions and an Env enum have been introduced to detect whether the code is running in Node or Bun. The WebSocket and file serving logic are adjusted based on the runtime, and the documentation is updated with instructions for running the server in Node.
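
As a rough illustration, the environment detection described above might look like the following sketch (the Env enum is added in server/search/types.ts and the helper in server/utils.ts per the change list; the exact detection logic shown here is an assumption, relying on Bun populating process.versions.bun while plain Node does not):

export enum Env {
  Bun = "bun",
  Node = "node",
}

export function getEnvironment(): Env {
  // Bun exposes process.versions.bun; plain Node does not.
  return typeof process !== "undefined" && process.versions?.bun ? Env.Bun : Env.Node;
}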

Changes

File(s) and change summary:

  • server/package.json, server/tsconfig.json: Reintroduced and added devDependencies (e.g., @faker-js/faker, tsc-alias), added new scripts (build:server, nodeServer), configured outDir as ./distServer, and commented out certain TypeScript compiler options.
  • server/ai/provider/index.ts: Inserted an empty line after import statements (formatting change only).
  • server/integrations/google/gmail/index.ts, server/integrations/google/gmail-worker.ts: Updated the html-to-text import from CommonJS to ES module (with // @ts-ignore), and, in the worker file, added environment-aware message handling using parentPort vs. self.onmessage with enhanced error logging.
  • server/integrations/google/index.ts: Introduced environment detection to conditionally instantiate a NodeWorker or a browser Worker and modified the downloadPDF function (parameter reordering and writer creation logic).
  • server/integrations/slack/index.ts: Removed the direct import of retryPolicies and instead imported the entire module to destructure retryPolicies.
  • server/utils.ts, server/search/types.ts: Added a new getEnvironment function to detect the runtime and introduced the Env enum with Bun and Node members; updated import statements accordingly.
  • .gitignore: Added a new entry (distServer) to ignore the dedicated build output directory.
  • server/server.ts: Modified WebSocket handling by conditionally using the ws package in Node versus Bun’s implementation, updated static file serving, and adjusted variable declarations (WsApp) accordingly.
  • docs/README.md: Added a new "Run server in Node" section with instructions for building and running the server using Node.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client Process
    participant EnvChecker as getEnvironment
    participant Worker as Google Worker

    Client->>Worker: Send task/message
    Worker->>EnvChecker: Check runtime environment
    alt Running in Node
        Worker->>parentPort: Handle message via parentPort.on("message")
    else Browser
        Worker->>self: Handle message via self.onmessage
    end
    Worker-->>Client: Return response
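
The branch shown above might look roughly like the following inside the Gmail worker (a sketch only; the handler name and body are illustrative, not the PR's actual code):

import { parentPort } from "node:worker_threads";

declare const self: any; // provided by the worker runtime (Bun / web workers)

function handleMessage(data: unknown) {
  // ...perform the ingestion work for the received task...
}

if (parentPort) {
  // Node worker_threads path
  parentPort.on("message", (data) => handleMessage(data));
} else {
  // Bun / web-worker style path
  self.onmessage = (event: MessageEvent) => handleMessage(event.data);
}
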
sequenceDiagram
    participant Browser as Client Browser
    participant EnvChecker as Environment Checker
    participant Server as WebSocket Server

    Browser->>Server: Request WebSocket connection (/ws)
    Server->>EnvChecker: Determine runtime environment
    alt Node Environment
        Server->>Server: Initialize WebSocketServer (using ws package)
    else Bun Environment
        Server->>Server: Use Bun static file serving & WebSocket implementation
    end
    Server-->>Browser: Establish WebSocket connection
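
A hedged sketch of the runtime branch in the second diagram; the exact wiring with Hono and the handler bodies are placeholders, not the PR's implementation:

import { getEnvironment } from "./utils";
import { Env } from "./search/types";

if (getEnvironment() === Env.Node) {
  // Node: use the ws package for the /ws endpoint.
  const { WebSocketServer } = await import("ws");
  const wss = new WebSocketServer({ noServer: true });
  wss.on("connection", (socket) => {
    socket.on("message", (msg) => socket.send(`echo: ${msg}`));
  });
} else {
  // Bun: rely on Bun.serve's built-in websocket handlers (Bun global assumed).
  Bun.serve({
    fetch(req, server) {
      if (server.upgrade(req)) return;
      return new Response("expected a WebSocket upgrade", { status: 400 });
    },
    websocket: {
      message(ws, msg) {
        ws.send(`echo: ${msg}`);
      },
    },
  });
}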

Poem

I'm a coding bunny, hopping through the bytes,
Skipping through dependencies on bright, starry nights.
With a twitch of my nose, I align Node and Bun,
Crafting logic so smooth, it's playful and fun.
Leaping through scripts and config with flair,
I celebrate these changes with a joyful hare!



Contributor

@gemini-code-assist bot left a comment


Hello @kalpadhwaryu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request introduces the ability to run server-side JavaScript files using Node.js. It includes the addition of necessary dependencies, updates to package configurations, and modifications to context construction and evaluation processes to support this functionality. The changes primarily focus on enabling server-side execution of AI-related tasks.

Highlights

  • Dependency Injection: Added @hono/node-server as a dependency to enable running server JS files in Node.
  • Context Construction: Modified context construction functions in server/distBE/ai/context.js to enhance information provided to the AI, including details about the user, workspace, and current time.
  • Evaluation Process: Updated the evaluation process in server/distBE/ai/eval.js to incorporate permission-aware evaluations and various RAG (Retrieval-Augmented Generation) strategies.
  • Model Mapping: Added new AI models and their corresponding cost details in server/distBE/ai/mappers.js, along with mappings to their respective providers.
  • Prompt Engineering: Introduced and modified prompts in server/distBE/ai/prompts.js to optimize AI responses, including query rewriting and temporal event classification.

Changelog

  • package-lock.json
    • Added @hono/node-server dependency and its related dependencies.
  • package.json
    • Added @hono/node-server dependency.
  • server/distBE/ai/contants.js
    • Added an empty export object.
  • server/distBE/ai/context.js
    • Modified context construction functions to include more relevant information.
    • Added colored context construction for better readability.
    • Implemented context cleaning functions to remove irrelevant characters and highlights.
  • server/distBE/ai/eval.js
    • Implemented permission-aware evaluations.
    • Added various RAG strategies, including basic, iterative, and time-expansion RAG.
    • Integrated query rewriting and temporal event classification.
    • Added search result simplification for better context retrieval.
  • server/distBE/ai/mappers.js
    • Added new AI models and their cost details.
    • Implemented mappings between models and providers.
  • server/distBE/ai/prompts.js
    • Introduced and modified prompts for optimized AI responses.
    • Added prompts for query rewriting, temporal event classification, and end-to-end integration.
  • server/distBE/ai/provider/base.js
    • Added base provider class with model parameter handling.
  • server/distBE/ai/provider/bedrock.js
    • Implemented Bedrock provider with streaming and non-streaming conversation capabilities.
  • server/distBE/ai/provider/fireworks.js
    • Implemented Fireworks provider with streaming and non-streaming conversation capabilities.
  • server/distBE/ai/provider/fireworksClient.js
    • Implemented Fireworks client for making API requests.
  • server/distBE/ai/provider/gemini.js
    • Implemented Gemini provider with streaming and non-streaming conversation capabilities.
  • server/distBE/ai/provider/index.js
    • Consolidated provider initialization and access logic.
    • Implemented various AI tasks, including question answering, query analysis, and query rewriting.
  • server/distBE/ai/provider/ollama.js
    • Implemented Ollama provider with streaming and non-streaming conversation capabilities.
  • server/distBE/ai/provider/openai.js
    • Implemented OpenAI provider with streaming and non-streaming conversation capabilities.
  • server/distBE/ai/provider/together.js
    • Implemented Together provider with streaming and non-streaming conversation capabilities.


In nodes where servers reside,
JavaScript's power cannot hide.
AI's tasks ignite,
With code taking flight,
A digital, clever guide.


Contributor

@gemini-code-assist bot left a comment


Code Review

The pull request adds the ability to run server-side JavaScript files using Node.js. This includes adding necessary dependencies and creating new files for AI-related functionalities. The changes seem well-structured, but there are a few areas that could benefit from closer examination.

Summary of Findings

  • Missing email and workspace ID: The myEmail and workspaceId variables in server/distBE/ai/eval.js are initialized as empty strings, which will cause issues if the code relies on these values. These should be populated with actual values or retrieved from a configuration.
  • Incomplete error handling: The converseStream function in server/distBE/ai/provider/bedrock.js has a try-catch block, but it only logs the error and rethrows it. Consider adding more robust error handling, such as providing a default response or gracefully terminating the stream.
  • Potential security risk: The apiKey variable in server/distBE/ai/provider/fireworksClient.js is hardcoded. Storing API keys directly in the code is a security risk. These should be stored securely, such as in environment variables or a dedicated secrets management system.
  • Missing cost calculation: The converse and converseStream functions in server/distBE/ai/provider/fireworks.js have a hardcoded cost of 0. The cost calculation logic should be implemented to accurately reflect the usage of the Fireworks AI service.

Merge Readiness

The pull request has several issues that need to be addressed before merging. The missing email and workspace ID, incomplete error handling, potential security risk, and missing cost calculation are all significant concerns that should be resolved to ensure the code's correctness, security, and maintainability. I am unable to approve this pull request, and recommend that it not be merged until these issues are addressed.

const Logger = getLogger(Subsystem.AI);
export class FireworksProvider extends BaseProvider {
constructor(client) {
super(client, AIProviders.Fireworks);

critical

The apiKey variable is hardcoded. Storing API keys directly in the code is a security risk. These should be stored securely, such as in environment variables or a dedicated secrets management system.

Suggested change
super(client, AIProviders.Fireworks);
apiKey = process.env.FIREWORKS_API_KEY || "YOUR_API_KEY"; // Retrieve from environment variable

Comment on lines 28 to 30
const myEmail = "";
// workspace external Id
const workspaceId = "";

high

These variables are initialized as empty strings. If the code relies on these values, it will cause issues. Consider populating these with actual values or retrieving them from a configuration.

Suggested change
const myEmail = "";
// workspace external Id
const workspaceId = "";
const myEmail = process.env.MY_EMAIL || "[email protected]";
const workspaceId = process.env.WORKSPACE_ID || "defaultWorkspaceId";

max_tokens: modelParams.maxTokens,
stream: false,
});
const cost = 0; // Explicitly setting 0 as cost

high

The cost is hardcoded to 0. Implement the cost calculation logic to accurately reflect the usage of the Fireworks AI service.

cost: calculateCost(inputTokens, outputTokens, modelDetailsMap[modelParams.modelId].cost.onDemand), // Implement cost calculation

})) {
yield {
text: chunk,
cost: 0,

high

The cost is hardcoded to 0. Implement the cost calculation logic to accurately reflect the usage of the Fireworks AI service.

cost: calculateCost(inputTokens, outputTokens, modelDetailsMap[modelParams.modelId].cost.onDemand), // Implement cost calculation

Comment on lines 123 to 125
catch (error) {
Logger.error(error, "Error in converseStream of Bedrock");
throw error;

medium

The converseStream function has a try-catch block, but it only logs the error and rethrows it. Consider adding more robust error handling, such as providing a default response or gracefully terminating the stream.

catch (error) {
    Logger.error(error, "Error in converseStream of Bedrock");
    yield { text: "An error occurred while processing the stream.", metadata: "error" }; // Provide a default response
    // throw error; // Rethrow the error to stop further processing
  }


@coderabbitai bot left a comment


Actionable comments posted: 54

🧹 Nitpick comments (154)
package.json (1)

1-5: Minimal package.json configuration.

This package.json file only includes a single dependency on @hono/node-server. Consider enhancing it with additional standard fields like name, version, description, scripts, and engines to better document the project and provide convenience commands.

 {
   "dependencies": {
     "@hono/node-server": "^1.14.0"
+  },
+  "name": "your-project-name",
+  "version": "0.1.0",
+  "description": "Description of your project",
+  "scripts": {
+    "start": "node server/distBE/main.js",
+    "dev": "node --watch server/distBE/main.js"
   }
 }
server/distBE/utils/index.js (2)

1-9: Add documentation and make date formatting configurable.

The getDateForAI function uses a hardcoded "en-GB" locale which may not be suitable for all deployments.

+/**
+ * Returns the current date formatted for AI interactions
+ * @param {string} locale - The locale to use for date formatting, defaults to "en-GB"
+ * @returns {string} Formatted date string
+ */
-export function getDateForAI() {
+export function getDateForAI(locale = "en-GB") {
    const today = new Date();
    const options = {
        day: "2-digit",
        month: "2-digit",
        year: "numeric",
    };
-    return today.toLocaleDateString("en-GB", options);
+    return today.toLocaleDateString(locale, options);
}

10-14: Add input validation and documentation to cost calculation.

The calculateCost function lacks input validation and documentation.

+/**
+ * Calculates the cost of AI operations based on input and output tokens
+ * @param {Object} params - The token parameters
+ * @param {number} params.inputTokens - Number of input tokens used
+ * @param {number} params.outputTokens - Number of output tokens used
+ * @param {Object} cost - The cost configuration
+ * @param {number} cost.pricePerThousandInputTokens - Price per 1000 input tokens
+ * @param {number} cost.pricePerThousandOutputTokens - Price per 1000 output tokens
+ * @returns {number} The calculated cost
+ */
-export const calculateCost = ({ inputTokens, outputTokens }, cost) => {
+export const calculateCost = ({ inputTokens = 0, outputTokens = 0 } = {}, cost) => {
+    // Validate inputs
+    if (!cost || typeof cost !== 'object') {
+        throw new Error('Cost configuration is required');
+    }
+    if (!cost.pricePerThousandInputTokens || !cost.pricePerThousandOutputTokens) {
+        throw new Error('Price per thousand tokens is required for both input and output');
+    }
    const inputCost = (inputTokens / 1000) * cost.pricePerThousandInputTokens;
    const outputCost = (outputTokens / 1000) * cost.pricePerThousandOutputTokens;
    return inputCost + outputCost;
};
server/distBE/vespa/replaceDIMS.js (2)

1-9: Validate negative or zero dimension
Currently, only non-numeric dimensions are rejected. Including a check for negative or zero values can prevent invalid replacements.
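
For example, the guard might be extended along these lines (a sketch; the argument name is an assumption):

const dims = Number(dimsArg);
if (!Number.isInteger(dims) || dims <= 0) {
  // Reject non-numeric, zero, and negative dimensions before doing any replacement.
  throw new Error(`Invalid dimension "${dimsArg}": expected a positive integer`);
}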


22-31: Consider more robust error handling
Catching errors individually is fine, but you might prefer a centralized approach or continue processing remaining files after logging the error.

server/distBE/doc.js (3)

1-13: Potential duplication with existing utility
cleanText appears similar to the function in server/distBE/chunks.js. Consider consolidating into a shared helper to avoid repetitive code.


32-47: Handle missing alt text
If both title and alt are absent, consider falling back to a default string for accessibility.


171-205: Potential formatting edge cases
Merging lines with spaces might break certain specialized formatting like code blocks. Testing with more complex text structures is recommended.

server/distBE/api/auth.js (1)

1-5: Improve imports and add proper error handling.

The file imports an empty object from schema.js, which is unnecessary, and it imports the config but destructures only a single property, whereas other files destructure multiple properties from the same config.

 import config from "../config.js"
 import { db } from "../db/client.js"
 import { getPublicUserAndWorkspaceByEmail } from "../db/user.js"
-import {} from "../db/schema.js"
 const { JwtPayloadKey } = config;
server/distBE/ai/provider/base.js (4)

1-1: Remove unused AWS SDK import

The AWS SDK Bedrock client is imported but not used anywhere in the code. Removing unused imports helps reduce bundle size and improves code clarity.

-import {} from "@aws-sdk/client-bedrock-runtime";

2-2: Uncomment or remove commented code

There's a commented import for AIProviders. Either uncomment it if it's needed or remove it completely. Commented code adds noise and can confuse future developers.

-// import { AIProviders } from "../../ai/types";

12-23: Parameter naming inconsistency in getModelParams method

The method accepts parameters with snake_case (e.g., max_new_tokens) but returns properties in camelCase (e.g., maxTokens). This inconsistency could lead to confusion when using the API.

Consider standardizing on a single naming convention, preferably camelCase for JavaScript:

getModelParams(params) {
    return {
-        maxTokens: params.max_new_tokens || 512,
-        topP: params.top_p || 0.9,
+        maxTokens: params.maxTokens || params.max_new_tokens || 512,
+        topP: params.topP || params.top_p || 0.9,
        temperature: params.temperature || 0.6,
        modelId: params.modelId || defaultFastModel,
        systemPrompt: params.systemPrompt || "You are a helpful assistant.",
        userCtx: params.userCtx,
        stream: params.stream,
        json: params.json || null,
    };
}

19-19: Add default value for userCtx parameter

Unlike other parameters, userCtx doesn't have a default value. Consider providing a default (even if it's null) to maintain consistency with other parameters.

-        userCtx: params.userCtx,
+        userCtx: params.userCtx || null,
server/distBE/db/client.js (1)

8-10: Consider adding a connection pool with appropriate timeout

Setting idle_timeout: 0 disables the connection timeout, which means connections will be kept open indefinitely. This could lead to database connection exhaustion if not properly managed. Consider using a connection pool with appropriate timeout settings.

const queryClient = postgres(url, {
-    idle_timeout: 0,
+    idle_timeout: 30, // Close idle connections after 30 seconds
+    max: 10, // Maximum number of connections in the pool
});
server/distBE/api/proxy.js (2)

5-5: Improve error message with specific details

The error message "Failed to fetch the image" is too generic and assumes the resource is an image, which might not always be the case. Provide more specific error information.

-        return c.text("Failed to fetch the image.", 502);
+        return c.text(`Failed to fetch resource: ${response.status} ${response.statusText}`, 502);

11-11: Consider validating or restricting content types

The proxy currently accepts any content type, which could be a security risk. Consider validating or restricting the content types that can be proxied.

+        // Validate content type
+        const contentType = response.headers.get("Content-Type") || "application/octet-stream";
+        const allowedContentTypes = ["image/jpeg", "image/png", "image/gif", "image/svg+xml", "application/pdf"];
+        if (!allowedContentTypes.includes(contentType.split(";")[0])) {
+            return c.text(`Content type not allowed: ${contentType}`, 415);
+        }
-            "Content-Type": response.headers.get("Content-Type") || "application/octet-stream",
+            "Content-Type": contentType,
server/distBE/utils/encryption.js (2)

1-14: Encryption implementation looks solid but consider adding try/catch blocks.

The class constructor properly validates the encryption key length (256-bit), which is critical for security. The error message is clear and helpful for debugging.

Consider adding more robust error handling around key validation:

constructor(key) {
    this.key = key;
-    if (key &&
-        Buffer.from(key, this.encoding).byteLength !==
-            this.encryptionKeyByteLength) {
-        throw new Error("Encryption key must be base64-encoded and 256-bit long.");
-    }
+    if (!key) {
+        throw new Error("Encryption key is required.");
+    }
+    try {
+        if (Buffer.from(key, this.encoding).byteLength !== this.encryptionKeyByteLength) {
+            throw new Error("Encryption key must be base64-encoded and 256-bit long.");
+        }
+    } catch (error) {
+        throw new Error(`Invalid encryption key format: ${error.message}`);
+    }
}

18-29: Encrypt method is well-implemented but lacks input validation.

The encryption method correctly implements AES-256-GCM with a random IV and authentication tag, which is a secure approach.

Add input validation to handle edge cases:

encrypt(str) {
+    if (typeof str !== 'string') {
+        throw new Error("Input must be a string.");
+    }
    const iv = crypto.randomBytes(12);
    const cipher = crypto.createCipheriv(this.algo, Buffer.from(this.key, this.encoding), iv);
    let enc = cipher.update(str, "utf8", this.encoding);
    enc += cipher.final(this.encoding);
    const authTag = cipher.getAuthTag().toString(this.encoding);
    return JSON.stringify({
        ciphertext: enc,
        iv: iv.toString(this.encoding),
        authTag: authTag,
    });
}
server/distBE/mime.js (1)

187-189: Consider enhancing the MIME type lookup function.

The function works correctly but could be improved with better error handling and documentation.

Add JSDoc and improve error handling:

/**
 * Returns the search keyword for a given MIME type
 * @param {string} mimeType - The MIME type to look up
 * @returns {string} The corresponding search keyword or empty string if not found
 */
-export const getSingleMimetypeKeyword = (mimeType) => {
-    return mimeTypeToSearchKeyword[mimeType] || "";
+export const getSingleMimetypeKeyword = (mimeType) => {
+    if (!mimeType || typeof mimeType !== 'string') {
+        return "";
+    }
+    return mimeTypeToSearchKeyword[mimeType] || "";
};
server/distBE/ai/mappers.js (1)

1-1: Consider simplifying the import path.

The import path could be simplified since you're already in the ai directory.

-import { AIProviders, Models } from "../ai/types.js"
+import { AIProviders, Models } from "../types.js"
server/distBE/db/syncJob.js (3)

34-36: Fix error message in updateSyncJob function

The error message refers to "connector" instead of "sync job", which is misleading and inconsistent with the function's purpose.

- throw new Error("Could not update the connector");
+ throw new Error("Could not update the sync job");

39-41: Fix error message in updateSyncJob validation

Similar to the previous issue, this error message also incorrectly refers to "connector" instead of "sync job".

- throw new Error(`zod error: Invalid connector: ${parsedRes.error.toString()}`);
+ throw new Error(`zod error: Invalid sync job: ${parsedRes.error.toString()}`);

37-38: Rename variable to match entity type

For consistency and clarity, rename the variable to reflect that it's a sync job, not a connector.

- const [connectorVal] = updatedSyncJobs;
- const parsedRes = selectSyncJobSchema.safeParse(connectorVal);
+ const [syncJobVal] = updatedSyncJobs;
+ const parsedRes = selectSyncJobSchema.safeParse(syncJobVal);
server/distBE/config.js (2)

4-21: Consider using environment variables for all configuration values

While most configuration values are properly derived from environment variables, some values like port numbers are hardcoded. Consider making these configurable through environment variables as well.

- let port = 3000;
+ let port = process.env.PORT || 3000;

- let host = "http://localhost:3000";
+ let host = process.env.HOST || "http://localhost:3000";

if (process.env.NODE_ENV === "production") {
    postgresBaseHost = process.env.DATABASE_HOST;
    vespaBaseHost = process.env.VESPA_HOST;
-    port = 80;
+    port = process.env.PROD_PORT || 80;
    host = process.env.HOST;
    redirectUri = process.env.GOOGLE_PROD_REDIRECT_URI;
}

97-136: Consider organizing configuration into logical sections

The configuration object contains many properties. Consider organizing these into logical sections using nested objects for better readability and maintenance.

export default {
+   // Database configuration
+   database: {
    postgresBaseHost,
    vespaBaseHost,
+   },
+   // Server configuration
+   server: {
    port,
    host,
    redirectUri,
    postOauthRedirect,
+   },
+   // AI model configuration
+   models: {
    AwsAccessKey,
    AwsSecretKey,
    OpenAIKey,
    OllamaModel,
    TogetherAIModel,
    TogetherApiKey,
    FireworksAIModel,
    FireworksApiKey,
    GeminiAIModel,
    GeminiApiKey,
    defaultBestModel,
    defaultFastModel,
    isReasoning,
    fastModelReasoning,
    StartThinkingToken,
    EndThinkingToken,
+   },
    // ... continue with other logical groupings
};
server/distBE/eval/qrelsToTrecFormat.js (1)

9-10: Ensure imports and logger usage are consistent
You are calling getLogger(Subsystem.Eval) at line 9, but getLogger and Subsystem are only imported at lines 25-26. Consider moving the imports so they appear before usage, and use either require or ES import statements consistently throughout the file.

Also applies to: 25-26

server/distBE/db/workspace.js (2)

25-36: Use transaction object or remove it from the signature
You’re passing trx into functions like getWorkspaceById, but calling db.select() instead of trx.select(). For transactional consistency, make sure to use the transaction object or remove it from the function arguments if not needed.


38-50: Fix misleading error message
In getWorkspaceByCreatedBy, the error thrown is "Could not find workspaces by domain" instead of a message indicating the missing workspace for the createdBy field. Update this message to correctly reflect the lookup failure.

server/distBE/ai/provider/fireworks.js (1)

1-1: Remove unused import
The import statement import {} from "@aws-sdk/client-bedrock-runtime"; is unused. Consider removing it to keep the file clean.

server/distBE/db/customType.js (1)

7-35: Consider handling encryption errors and special cases.

If encryption.encrypt or encryption.decrypt throws an error or returns unexpected values, it could break the ORM logic. It might be useful to wrap these calls in try/catch or validate the input/output to ensure robust error handling for malicious or corrupted data.
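
A sketch of what that guarding could look like if the encrypted column is defined with Drizzle's customType (the column definition and the encryption helper are assumptions based on the surrounding code):

import { customType } from "drizzle-orm/pg-core";

// `encryption` is the module-level Encryption instance assumed to exist in customType.js.
const encryptedText = customType<{ data: string }>({
  dataType() {
    return "text";
  },
  toDriver(value) {
    try {
      return encryption.encrypt(value);
    } catch (err) {
      throw new Error(`Failed to encrypt column value: ${(err as Error).message}`);
    }
  },
  fromDriver(value) {
    try {
      return encryption.decrypt(value as string);
    } catch (err) {
      throw new Error(`Failed to decrypt column value: ${(err as Error).message}`);
    }
  },
});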

server/distBE/eval/evaluate.js (2)

6-14: Provide more robust CLI argument parsing.

The current checks ensure basic presence of arguments, but do not handle potential mismatched or partial arguments (e.g., missing parameter after --output). Consider using a dedicated argument parser or safer indexing to avoid out-of-bounds array access.


36-36: Recommend parameterizing concurrency and queue usage.

Currently, concurrency is set to 10. For large or small datasets, dynamically adjusting concurrency (e.g., via an environment variable) can optimize performance. Also, confirm the queue completes successfully before writing results (line 65 is correct usage, but remain aware of any pending asynchronous tasks).

Also applies to: 64-65

server/distBE/ai/provider/ollama.js (2)

14-16: Clarify error handling and response logic in converse.

A single generic error message ("Failed to get response from Ollama") may make debugging difficult. Consider logging the exact error content or message for diagnosis while preserving security best practices.

Also applies to: 31-39


64-70: Ensure meaningful streaming structures for converseStream.

The chunked stream approach is valid, but if partial errors occur mid-stream, the logic might terminate unexpectedly. Consider more granular error recovery or a fallback message for partial responses to improve resilience.

Also applies to: 72-75

server/distBE/logger/index.js (1)

1-2: Remove unused imports.

The levels import from "pino" and the imported Subsystem from "../types.js" are not referenced, which is dead code that can be removed to keep the module clean.

-import { levels, pino } from "pino";
+import { pino } from "pino";
-import { Subsystem } from "../types.js"
server/distBE/db/seed.js (3)

15-25: Warning: Repeated usage of @ts-ignore.

You’re silencing TypeScript errors when inserting data. Consider extending or refining your schema definitions in workspaces so that manual ignores become unnecessary and type safety is preserved.


39-39: Fix string interpolation for log messages.

Currently, there's a stray quote after "Inserted User:". This can create confusing logs. Consider passing the user object as a separate parameter or using a cleaner string format:

- Logger.info(`Inserted User:', ${user}`);
+ Logger.info(`Inserted User: ${JSON.stringify(user)}`);

48-49: Avoid forcibly exiting after the async call completes.

process.exit(0) might terminate the process prematurely if there are still pending tasks or open connections. Ensure all resources (e.g., DB connections) are safely closed before ending the process, or let Node exit naturally.

Would you like me to propose a solution that gracefully closes pending connections and exits once all tasks are complete?

server/distBE/search/mappers.js (3)

36-37: Remove redundant property assignments.

Lines such as child.fields.chunks_summary = child.fields.chunks_summary; have no effect. Consider removing these no-op lines to keep the code more concise and readable.

Also applies to: 50-51, 60-61


25-78: Reduce repeated logic for setting type and relevance.

Assigning child.fields.type and child.fields.relevance in each conditional block is repetitive. A small helper function could streamline this code by setting shared properties before applying the specific schema.


79-138: Consider unifying search and autocomplete logic.

VespaSearchResponseToSearchResult and VespaAutocompleteResponseToResult share similar patterns of type assignment and parsing. Extracting common patterns into a shared utility could reduce duplication and ease maintenance.

server/distBE/api/oauth.js (4)

20-23: Consider returning more detailed error information.

Throwing an HTTP 500 response on missing state may not be very descriptive. It can be useful to log or return a clear message indicating the missing query parameters.


31-32: Improve error handling to distinguish invalid or expired cookie.

Currently, the code throws a 500 status if random does not match stateInCookie. You might consider returning a 403 or a 401 to reflect an invalid session or a potential CSRF scenario.


37-39: Handle non-GoogleDrive apps without code verifier more gracefully.

If the code verifier is missing, the logic only checks for app === Apps.GoogleDrive. Consider anticipating future OAuth expansions by handling this scenario elegantly for all providers.


67-73: Evaluate transactional consistency.

If the queue job is created but the connector update fails (or vice versa), you risk inconsistent states. Wrapping these calls inside a transactional flow (if supported) could help maintain data integrity.

server/distBE/db/oauthProvider.js (3)

8-23: Wrap insertion in a transaction, if necessary.

While you are catching insertion errors, consider ensuring a more comprehensive transactional approach if other related database operations occur in tandem. This ensures consistency if one operation fails.


24-36: Unify error messages.

"Could not get the connector" suggests a missing resource, but the function name is getOAuthProvider. For clarity, you might want to unify error messages to reflect the entity type (“provider” vs “connector”).


37-49: Return a single provider object or handle array consistently.

getOAuthProviderByConnectorId returns an array in res, but only res[0] is relevant. Consider returning a single object for consistency, or clarify any scenario requiring multiple providers.

server/distBE/eval/processDataForVespa.js (4)

6-8: Validate arguments more robustly.

You only check for the overall length of the arguments array. Consider verifying each argument for correctness (e.g., ensure the next token isn’t missing) in case users provide malformed flags or no values.


27-28: Import statements can be consolidated.

You are importing fs and chunkDocument separately after runtime argument checks. Consider grouping your imports at the top to streamline the code structure.


60-62: Consider concurrency or streaming approach for large data.

Writing large batches to one file works, but you might want to parallelize or pipeline the process if input size grows significantly. This can reduce memory usage on very large files.


78-82: Monitor partial writes for reliability or duplication.

If an unexpected error occurs after some writes, it may be unclear which documents were successfully committed. You could store a checkpoint or track progress to avoid partially processed data.

server/distBE/chunks.js (1)

16-17: Consider replacing illegal characters with a placeholder.

Currently, illegal code points are removed, which may unintentionally merge words. Replacing them with a placeholder (e.g. a space) could improve readability.
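
For instance (a sketch; the exact range of code points that cleanText strips is assumed):

// Replace control characters with a space instead of deleting them,
// so that adjacent words are not merged together.
const cleaned = text.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, " ");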

server/distBE/db/message.js (1)

45-48: Refine the error message to accurately reflect the missing entity.

Throwing "Chat not found" here might be misleading. Consider changing it to "Message not found" for clarity.

- throw new Error("Chat not found");
+ throw new Error("Message not found");
server/distBE/ai/provider/bedrock.js (2)

1-10: Consider verifying AWS configurations and error handling for Bedrock client.
The code imports the Bedrock Runtime client but does not appear to validate region or credential configuration. To prevent runtime errors, ensure that your environment or config file consistently provides valid AWS settings.


48-129: Consider splitting out chunk-handling logic in converseStream.
The streaming logic is wrapped into a single large method, which may become harder to maintain. Extracting chunk processing into a helper function can reduce complexity and improve readability.

server/distBE/server.js (3)

81-95: Refine AuthRedirect error handling.
The AuthRedirect middleware warns and redirects when no auth token is found or the token is invalid. Consider distinguishing these scenarios in the UI or logs for clearer debugging and user experience.


100-125: Check route definitions for naming consistency.
API endpoints such as /chat, /chat/bookmark, etc. are logically named, though you might consider a consistent pluralization or hierarchical pattern for clarity (e.g., /chats/:id/bookmark).


149-167: Evaluate stricter validation for JWT payloads.
When generating tokens in generateToken, consider adding audience and issuer claims or verifying scopes to enhance security. This can include input validation on email, role, and workspaceId.

server/distBE/integrations/google/gmail/quote-parser.js (1)

67-155: Consider performance testing parseEmailBody.
Parsing and reversing each line can become expensive for very large emails. If performance becomes a concern, you might streamline or chunk the processing to avoid memory overhead.

server/distBE/integrations/google/pdf-utils.js (3)

10-27: Consider handling the case when the file does not exist.
If the file is already missing (e.g., ENOENT), the current logic will throw an error. You could optionally check for err.code === "ENOENT" and log a warning instead of throwing, if that case is non-critical.

} catch (err) {
+  if (err.code === "ENOENT") {
+    Logger.warn(`File not found at ${filePath}, no action needed`);
+    return;
+  }
   Logger.error(
     err,

28-49: Consider cleaning up partially written files on errors.
In the event of a write error, you might end up with a partially written file. Deleting the partial file helps ensure data consistency and reduces disk clutter.


50-68: Enhance password-protected PDF handling.
Currently, password-protected PDFs are skipped altogether. This may be acceptable, but if you need partial data extraction or user prompts for passwords, consider implementing a fallback method or a prompt-based approach.

server/distBE/ai/provider/openai.js (2)

11-41: Explicit error handling for network or API failures is missing.
Consider wrapping the this.client.chat.completions.create call in a try...catch block to handle potential API or network errors.

async converse(messages, params) {
  const modelParams = this.getModelParams(params);
- const chatCompletion = await this.client.chat.completions.create({
+ let chatCompletion;
+ try {
+   chatCompletion = await this.client.chat.completions.create({
      // ...
+   });
+ } catch (apiError) {
+   // Graceful error logging or rethrow
+   throw apiError;
+ }
  // ...
}

42-62: Ensure documentation for streaming parameters.
It would be beneficial to document how stream_options (line 58) impacts the response structure, especially if other maintainers need to customize streaming behavior in the future.

server/distBE/utils/retry-async.js (2)

1-10: Default retry limit may cause infinite loops.
With retry set to Number.POSITIVE_INFINITY, failures can loop indefinitely. Consider using a finite default or an exponential backoff strategy to avoid potential starvation.


11-21: Refactor to support jitter or exponential backoff for robust retry.
Adding a random or exponential delay helps mitigate load spikes and improves reliability under frequent failures.
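
A minimal sketch of such a helper with a finite retry count, exponential backoff, and jitter (names and defaults are illustrative, not the module's actual API):

async function retryWithBackoff<T>(fn: () => Promise<T>, maxRetries = 5, baseDelayMs = 500): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      // Exponential backoff plus random jitter to avoid synchronized retries.
      const delay = baseDelayMs * 2 ** attempt + Math.random() * baseDelayMs;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}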

server/distBE/ai/provider/together.js (1)

38-39: Preserve original error details for debugging.

Re-throwing a generic error hides the original message and stack trace. Consider appending the original error to help diagnose issues:

} catch (error) {
-  throw new Error("Failed to get response from Together");
+  throw new Error(`Failed to get response from Together: ${error.message}`);
}
server/distBE/integrations/google/config.js (1)

17-21: Consider making concurrency values configurable.

Storing concurrency values (2, 3, etc.) in environment variables or a separate config file could make them easier to adjust without code changes. This helps with scaling and fine-tuning performance in different environments.
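
For example, the values could be read from the environment with the current numbers as fallbacks (the variable and environment-variable names are assumptions):

const ServiceAccountConcurrency = Number(process.env.SERVICE_ACCOUNT_CONCURRENCY ?? 2);
const GmailConcurrency = Number(process.env.GMAIL_CONCURRENCY ?? 3);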

server/distBE/api/admin.js (3)

18-28: Validate error conditions more thoroughly.

In GetConnectors, you throw NoUserFound when no user is retrieved, which is correct. However, consider including more contextual data (e.g., the email associated with the request) in the error message to aid in debugging.


59-78: Ensure correct query parameter validation.

Line 68 uses @ts-ignore when reading app from c.req.valid("query"). This could mask potential runtime issues if app is missing or typed incorrectly. Consider refining the typing or providing fallback logic for missing or invalid query parameters.


38-41: Include original error details when re-throwing.

Line 39 re-throws a new NoUserFound error with empty curly braces. Adding the original error context (e.g., reason or stack trace) could be beneficial for debugging.

server/distBE/integrations/google/sync.js (5)

66-80: Clarify spreadsheet fallback logic.
When getDocumentOrNull for docId returns null, this function checks only docId_0 as a fallback. If spreadsheets can have indexes beyond _0, it might miss additional sections.


356-670: Function is too large: break into smaller processes.
handleGoogleOAuthChanges processes Drive, Contacts, and Calendar updates in one go. Splitting these responsibilities into separate functions (e.g., syncDriveChanges, syncContacts, syncCalendarEvents) would enhance maintainability.


671-717: Ensure partial failures are handled.
insertEventIntoVespa logs errors and continues, which could cause silent data corruption if some fields fail to insert. Consider returning error information to the caller or storing partial failures for later diagnosis.


718-864: Check large event merges.
handleGoogleCalendarEventsChanges merges numerous events in a loop. If many events are updated simultaneously, performance might degrade. Consider batching or compressing updates in larger sets to reduce overhead.


998-1034: Detect duplicate contact records.
syncContacts inserts or deletes contacts. If multiple sync runs overlap or a contact reappears, duplicates may occur. Adding a deduplication check (e.g., by resourceName) is recommended.
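
A sketch of one possible deduplication pass before insertion (the contact shape is an assumption):

const seen = new Set<string>();
const uniqueContacts = contacts.filter((contact) => {
  const key = contact.resourceName;
  // Skip contacts without a resourceName and any resourceName already processed.
  if (!key || seen.has(key)) return false;
  seen.add(key);
  return true;
});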

server/distBE/integrations/google/gmail-worker.js (2)

73-146: Ingestion approach is clear but watch out for forwarded duplicates.
handleGmailIngestion processes all messages in multiple pages. Consider skipping duplicates if a message reappears. For large mailboxes, also confirm that memory usage from parallel operations is acceptable.


147-170: Regex approach for parsing email addresses.
extractEmailAddresses uses a simple regex for angle-bracketed addresses. This works in most cases, but some email formats break it (e.g., quotes or unusual formatting). Comprehensive email parsing libraries can reduce edge-case risk.

server/distBE/integrations/google/gmail/index.js (2)

85-108: Edge cases in email extraction.
extractEmailAddresses must handle commas inside quotes or nested angle brackets. If advanced email address formats are encountered (e.g., “Doe, John [email protected]”), the function might produce partial or invalid addresses.


233-262: Consider partial HTML to text conversions.
getBody attempts to convert entire HTML parts with htmlToText. If an email body is extremely large or malformed HTML, partial or incremental conversion might reduce parse overhead.

server/distBE/ai/provider/fireworksClient.js (2)

46-70: Consider adding jitter to your retry logic.

In complete, you're using exponential backoff with Math.pow(2, retries) * 1000. Adding a small randomized jitter (similar to the approach in your other code) would help reduce the risk of synchronized retries across multiple instances.

- await new Promise((resolve) => setTimeout(resolve, Math.pow(2, retries) * 1000));
+ const baseDelay = Math.pow(2, retries) * 1000;
+ const jitter = Math.floor(Math.random() * 500);
+ await new Promise((resolve) => setTimeout(resolve, baseDelay + jitter));

127-153: Remove or mask sensitive keys in examples.

In the example() function, ensure you do not accidentally commit real API keys or credentials. Replacing "your-api-key" with environment variables or secure placeholders prevents accidental credential leaks.

server/distBE/utils.js (1)

29-40: Consider supporting more progress statuses.

progress_callback currently ignores any status besides "progress". If future progress types or statuses are introduced (e.g., "queued", "done"), consider handling them explicitly or logging them for debugging.
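
For example (the non-"progress" status names, updateProgress, and the Logger usage are assumptions):

const progress_callback = (report: { status: string; progress?: number }) => {
  switch (report.status) {
    case "progress":
      updateProgress(report.progress ?? 0);
      break;
    case "queued":
    case "done":
      Logger.info(`worker status: ${report.status}`);
      break;
    default:
      Logger.debug(`unhandled worker status: ${report.status}`);
  }
};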

server/distBE/ai/context.js (2)

64-90: Event context includes robust fields.

constructEventContext thoroughly enumerates event fields and times. Minor note: a fallback for invalid or missing dates might be warranted. If startTime or endTime is not a valid Date, the .toUTCString() call could produce Invalid Date.


138-163: Colored context repetition.

constructFileColoredContext, constructUserColoredContext, and constructMailColoredContext largely mirror the non-colored versions. If you anticipate significant changes, consider an abstraction that just colorizes the relevant fields to reduce duplication.

server/distBE/types.js (1)

183-185: Naming mismatch in OperationStatus enum.
Using "Pendings" as the enum key but assigning it the value "Pending" may confuse future maintainers.

- OperationStatus["Pendings"] = "Pending";
+ OperationStatus["Pending"] = "Pending";
server/distBE/db/connector.js (3)

21-24: Avoid variable shadowing for config.
The function parameter config conflicts with the imported config at line 1. Rename one of them to prevent confusion.

-export const insertConnector = async (trx, workspaceId, userId, workspaceExternalId, name, type, 
-  authType, app, config, credentials, subject, oauthCredentials, status) => {
+export const insertConnector = async (trx, workspaceId, userId, workspaceExternalId, name, type, 
+  authType, app, connectorConfig, credentials, subject, oauthCredentials, status) => {
     ...
}

93-103: Handle missing or malformed accessTokenExpiresAt.
IsTokenExpired assumes token.accessTokenExpiresAt is valid. If it's absent or invalid, this function may yield false positives or crash.

+  if (!tokens.accessTokenExpiresAt) {
+    Logger.warn("Missing 'accessTokenExpiresAt' field, defaulting to expired token logic");
+    return true;
+  }

125-153: Potential concurrency concern in token refresh.
Multiple parallel requests could simultaneously detect an expired token and attempt refresh. Consider implementing a lock or short-lived in-memory marker to prevent repeated refresh attempts.
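
One way to serialize refreshes is an in-memory map of in-flight promises, sketched below (connector IDs and the refresh callback are assumptions):

// Share a single in-flight refresh per connector so parallel requests
// that see an expired token do not all call the provider at once.
const pendingRefreshes = new Map<number, Promise<void>>();

async function refreshTokenOnce(connectorId: number, doRefresh: () => Promise<void>): Promise<void> {
  let pending = pendingRefreshes.get(connectorId);
  if (!pending) {
    pending = doRefresh().finally(() => pendingRefreshes.delete(connectorId));
    pendingRefreshes.set(connectorId, pending);
  }
  return pending;
}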

server/distBE/queue/index.js (1)

71-73: Duplicate scheduling call may lead to confusion.
setupServiceAccountCronjobs() is also called on line 73, meaning some schedules might overlap. If intentional, consider clarifying the approach.

server/distBE/db/user.js (5)

1-2: Remove unused import [db].
It appears db is imported but never utilized within this file, which adds unnecessary overhead.

- import { db } from "./client.js"

32-48: Consider consistent schema validation.
Unlike getPublicUserAndWorkspaceByEmail, this function does not parse the returned data with schemas. If schema validation is also needed here, consider applying the same approach for consistency.


49-59: Handle absent user/workspace scenario.
The function returns a result array but does not throw an error if it's empty. For consistency with other methods, consider throwing a 404 or returning null if no record is found.


61-67: Add not-found handling and schema validation.
Currently, it returns an array that might be empty. Consider throwing an error or returning a specific response if no user is found, and optionally parse the data with the schema if needed.


68-88: Validate inserted user.
After inserting the user, the code returns the raw database result. For consistency and error handling, consider parsing the returned row with selectUserSchema and possibly verifying uniqueness constraints or handling collisions.

server/distBE/db/chat.js (6)

21-28: Check pagination or performance for large chat lists.
getWorkspaceChats retrieves all chats in descending order by updatedAt. For large workspaces, consider adding pagination or limiting results to optimize performance.
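
A sketch of adding limit/offset pagination with Drizzle's query builder (the table and column names are assumptions):

import { desc, eq } from "drizzle-orm";

// `chats` is the Drizzle table assumed to be exported from the schema module.
export const getWorkspaceChats = async (trx: any, workspaceId: number, pageSize = 20, offset = 0) => {
  return trx
    .select()
    .from(chats)
    .where(eq(chats.workspaceId, workspaceId))
    .orderBy(desc(chats.updatedAt))
    .limit(pageSize)
    .offset(offset);
};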


36-45: Code style consistency.
This function is nearly identical to getChatById. Both throw “Chat not found.” Just ensure it’s intentional to keep them separate or consider DRY approaches if they share logic.


46-57: Include schema validation after update.
After updating, consider validating the returned row with selectChatSchema to ensure data integrity, similar to how new or fetched records are validated.


58-67: Soft-delete vs. permanent deletion.
Here you permanently remove a record. If you use soft-deletes elsewhere (e.g., deletedAt), consider a consistent approach.


68-77: Better error message for missing messages.
Throwing “Messages not found” might be ambiguous if some but not all messages exist. If partial deletion is possible, consider clarifying or returning how many records were actually deleted.


90-99: Limit or paginate public chats.
getPublicChats offers pagination arguments (pageSize, offset), which is great. Verify correctness of those parameters, including potential edge cases if pageSize or offset are undefined or zero.

server/distBE/ai/types.js (7)

1-3: Unused import from AWS SDK.
No references to @aws-sdk/client-bedrock-runtime appear below. If you’re planning to integrate with Bedrock in future code, that’s fine; otherwise, consider removing it to keep dependencies minimal.


13-35: Potential version confusion in model names.
Model identifiers like "meta.llama3-1-70b-instruct-v1:0" might lead to confusion if updated frequently. Document changes or handle versioning in an external config if these evolve often.


36-42: QueryCategory usage.
You have categories like Self, InternalPerson, etc. If logic differs significantly between categories, consider adding clarifying docs or typed structures for each.


44-49: Expanded query types.
Future expansions (e.g., “RetrieveMetadata”) are commented out. Confirm if these will be reintroduced or if they should be removed for clarity.


60-65: SearchAnswerResponse with partial data.
Your code allows answer and citations to be nullable. Ensure your UI or consumer code handles these nulls gracefully.


73-80: Default handling for count.
You preprocess count with a default of 5. Verify that count cannot become negative or an excessively large number that might degrade performance.


98-101: Context rank expansions.
QueryContextRank enforces a boolean and an array. If more fields become necessary (like confidence scores), keep the structure flexible or documented for expansions.

server/distBE/api/search.js (5)

1-45: Well-organized imports, but consider reorganizing for better clarity.

The import section is quite large and mixes different types of imports (utilities, database operations, AI functions, etc.). Consider organizing imports by category or functionality for better readability.

// AI-related imports
import {
  analyzeQueryForNamesAndEmails,
  analyzeQueryMetadata,
  askQuestion,
} from "../ai/provider/index.js"
import {
  answerContextMap,
  answerMetadataContextMap,
  cleanContext,
  userContext,
} from "../ai/context.js"
import { QueryCategory } from "../ai/types.js"

// Tokenizers
import llama3Tokenizer from "llama3-tokenizer-js"
import { encode } from "gpt-tokenizer"

// Search operations
import {
  autocomplete,
  deduplicateAutocomplete,
  groupVespaSearch,
  searchVespa,
  searchUsersByNamesAndEmails,
  getTimestamp,
  insert,
  GetDocument,
  UpdateDocument,
  updateUserQueryHistory,
} from "../search/vespa.js"
import {
  VespaAutocompleteResponseToResult,
  VespaSearchResponseToSearchResult,
} from "../search/mappers.js"
import { userQuerySchema, userSchema } from "../search/types.js"

// Database operations
import { getPublicUserAndWorkspaceByEmail } from "../db/user.js"
import { db } from "../db/client.js"

// Utilities and configuration
import { z } from "zod"
import config from "../config.js"
import { HTTPException } from "hono/http-exception"
import { AnswerSSEvents } from "../shared/types.js"
import { streamSSE } from "hono/streaming"
import { getLogger } from "../logger/index.js"
import { Subsystem } from "../types.js"
import { getErrorMessage } from "../utils.js"

92-95: Consider handling empty results more gracefully.

When autocomplete returns no results, you're returning an object with an empty children array. It would be better to return a consistent structure that matches what VespaAutocompleteResponseToResult would return.

    let results = await autocomplete(query, email, 5)
    if (!results) {
-      return c.json({ children: [] })
+      return c.json({ results: [] })
    }

99-105: Good error handling, but could be more specific.

The error handling is good, but it returns a generic error message regardless of the type of error. Consider providing more specific error messages based on the error type.

  } catch (error) {
    const errMsg = getErrorMessage(error)
    Logger.error(error, `Autocomplete Error: ${errMsg} ${error.stack}`)
    throw new HTTPException(500, {
-      message: "Could not fetch autocomplete results",
+      message: error instanceof HTTPException 
+        ? errMsg 
+        : "Could not fetch autocomplete results",
    })
  }

158-158: Address the TODO comment in the code.

The TODO comment indicates a need to deduplicate results for Google admin and contacts. Consider implementing this functionality or creating a ticket to track this work.

I can help implement the deduplication logic for Google admin and contacts. Would you like me to create a detailed implementation or open a new issue to track this work?


166-167: Remove TypeScript ignore comments and fix type issues properly.

Similar to the previous case, the @ts-ignore comment is used to bypass TypeScript type checking. Fix this properly with appropriate type annotations.

-  // @ts-ignore
-  const { query, app, entity } = c.req.valid("query")
+  const { query, app, entity } = c.req.valid("query") as { 
+    query: string, 
+    app?: string, 
+    entity?: string 
+  }
server/distBE/integrations/google/tracking.js (7)

2-9: Consider using TypeScript enums instead of JavaScript IIFE for better type safety.

You're using an immediately-invoked function expression (IIFE) to define an enum-like object. Modern TypeScript has better enum syntax that provides additional type safety.

-export var StatType;
-(function (StatType) {
-    StatType["Gmail"] = "gmailCount";
-    StatType["Drive"] = "driveCount";
-    StatType["Contacts"] = "contactsCount";
-    StatType["Events"] = "eventsCount";
-    StatType["Mail_Attachments"] = "mailAttachmentCount";
-})(StatType || (StatType = {}));
+export enum StatType {
+    Gmail = "gmailCount",
+    Drive = "driveCount",
+    Contacts = "contactsCount",
+    Events = "eventsCount",
+    Mail_Attachments = "mailAttachmentCount"
+}

11-19: Consider adding TypeScript interfaces for tracker objects.

The tracker objects would benefit from proper TypeScript interfaces to define their structure and ensure type safety throughout the codebase.

+interface UserStats {
+  gmailCount: number;
+  driveCount: number;
+  contactsCount: number;
+  eventsCount: number;
+  mailAttachmentCount: number;
+  done: boolean;
+  startedAt: number;
+  doneAt: number;
+  type: AuthType;
+  totalMail: number;
+  totalDrive: number;
+}
+
+interface ServiceAccountTracker {
+  totalUsers: number;
+  completedUsers: number;
+  userStats: Record<string, UserStats>;
+}
+
+interface OAuthTracker {
+  user: string;
+  userStats: Record<string, UserStats>;
+}

// Global tracker object
-export const serviceAccountTracker = {
+export const serviceAccountTracker: ServiceAccountTracker = {
  totalUsers: 0,
  completedUsers: 0,
  userStats: {},
};
-export const oAuthTracker = {
+export const oAuthTracker: OAuthTracker = {
  user: "",
  userStats: {},
};

21-52: Add email parameter type and consider handling existing stats differently.

The function doesn't specify the type for the email parameter, and it doesn't handle the case where stats exist but may need to be reset.

-const initializeUserStats = (email) => {
+/**
+ * Initialize user statistics for both service account and OAuth trackers
+ * @param {string} email The user's email address
+ */
+const initializeUserStats = (email: string) => {
  if (!serviceAccountTracker.userStats[email]) {
    serviceAccountTracker.userStats[email] = {
      gmailCount: 0,
      driveCount: 0,
      contactsCount: 0,
      eventsCount: 0,
      mailAttachmentCount: 0,
      done: false,
      startedAt: new Date().getTime(),
      doneAt: 0,
      type: AuthType.ServiceAccount,
      totalMail: 0,
      totalDrive: 0,
    };
  }
  if (!oAuthTracker.userStats[email]) {
    oAuthTracker.userStats[email] = {
      gmailCount: 0,
      driveCount: 0,
      contactsCount: 0,
      eventsCount: 0,
      mailAttachmentCount: 0,
      done: false,
      startedAt: new Date().getTime(),
      doneAt: 0,
      type: AuthType.OAuth,
      totalMail: 0,
      totalDrive: 0,
    };
  }
};

53-57: Add type annotations and validate inputs.

The updateUserStats function should have type annotations for its parameters and should validate inputs before using them.

-export const updateUserStats = (email, type, count) => {
+/**
+ * Update user statistics for a specific statistic type
+ * @param {string} email The user's email address
+ * @param {StatType} type The statistic type to update
+ * @param {number} count The count to add to the statistic
+ */
+export const updateUserStats = (email: string, type: StatType, count: number) => {
+  if (!email || typeof email !== 'string') {
+    throw new Error('Email is required and must be a string');
+  }
+  if (!Object.values(StatType).includes(type)) {
+    throw new Error(`Invalid stat type: ${type}`);
+  }
+  if (typeof count !== 'number' || isNaN(count)) {
+    throw new Error('Count must be a valid number');
+  }
  initializeUserStats(email);
  serviceAccountTracker.userStats[email][type] += count;
  oAuthTracker.userStats[email][type] += count;
};

58-64: Add type annotations and use constants for magic values.

The updateTotal function should have type annotations and should use constants for the totalMail and totalDrive fields.

+/**
+ * Property names for the total counts
+ */
+export enum TotalCountType {
+  Mail = "totalMail",
+  Drive = "totalDrive"
+}

-export const updateTotal = (email, totalMail, totalDrive) => {
+/**
+ * Update total mail and drive counts for a user
+ * @param {string} email The user's email address
+ * @param {number} totalMail The total mail count
+ * @param {number} totalDrive The total drive count
+ */
+export const updateTotal = (email: string, totalMail: number, totalDrive: number) => {
+  if (!email || typeof email !== 'string') {
+    throw new Error('Email is required and must be a string');
+  }
+  if (typeof totalMail !== 'number' || isNaN(totalMail)) {
+    throw new Error('Total mail must be a valid number');
+  }
+  if (typeof totalDrive !== 'number' || isNaN(totalDrive)) {
+    throw new Error('Total drive must be a valid number');
+  }
  initializeUserStats(email);
-  serviceAccountTracker.userStats[email].totalMail = totalMail;
-  serviceAccountTracker.userStats[email].totalDrive = totalDrive;
-  oAuthTracker.userStats[email].totalMail = totalMail;
-  oAuthTracker.userStats[email].totalDrive = totalDrive;
+  serviceAccountTracker.userStats[email][TotalCountType.Mail] = totalMail;
+  serviceAccountTracker.userStats[email][TotalCountType.Drive] = totalDrive;
+  oAuthTracker.userStats[email][TotalCountType.Mail] = totalMail;
+  oAuthTracker.userStats[email][TotalCountType.Drive] = totalDrive;
};

65-71: Add validation and add completion timestamp handling.

The markUserComplete function should check if the user stats exist before trying to mark them as complete, and should consider edge cases like re-marking a user as complete.

-export const markUserComplete = (email) => {
+/**
+ * Mark a user as complete in the service account tracker
+ * @param {string} email The user's email address
+ */
+export const markUserComplete = (email: string) => {
+  if (!email || typeof email !== 'string') {
+    throw new Error('Email is required and must be a string');
+  }
+  
+  // Initialize if needed
+  initializeUserStats(email);
+  
  if (!serviceAccountTracker.userStats[email].done) {
    serviceAccountTracker.userStats[email].done = true;
    serviceAccountTracker.userStats[email].doneAt = new Date().getTime();
    serviceAccountTracker.completedUsers++;
  }
};

79-81: Add a type annotation for the mail parameter.

The setOAuthUser function should have a type annotation for its mail parameter (the email being tracked) and validate the input.

-export const setOAuthUser = (mail) => {
+/**
+ * Set the email of the OAuth user being tracked
+ * @param {string} mail The user's email address
+ */
+export const setOAuthUser = (mail: string) => {
+  if (!mail || typeof mail !== 'string') {
+    throw new Error('Email is required and must be a string');
+  }
  oAuthTracker.user = mail;
};
server/distBE/ai/eval.js (4)

608-609: Uncomment or remove the commented return statement.

There's a commented line // return answer which suggests that the function should return a value but currently doesn't. Either uncomment this line or remove it if not needed.

-  // return answer
+  return answer || "I don't know";

619-623: Remove commented code blocks.

There's a commented-out code block that appears to be unused. Either implement the functionality or remove the commented code to keep the codebase clean.

-  // const formattedMessages = messages ?
-  // messages.map((msg, index) => ({
-  //   role: "user",
-  //   content: [{ text: (msg as Data).input }]
-  // })) : [];

631-631: Implement proper cost tracking instead of hardcoded values.

The function uses a hardcoded cost value of 0.001, which doesn't accurately reflect the actual cost of the operation. Consider implementing proper cost tracking.

-                costArr: [0.001],
+                costArr: [0.001], // TODO: Implement proper cost tracking based on actual API usage
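
A hedged sketch of what cost tracking could look like, assuming the calculateCost helper in utils/index.js and modelDetailsMap in ai/mappers.js can be combined this way and that the provider response reports token counts (every field and parameter name below is an assumption, not the actual signature):

// Hedged sketch — derive the cost from reported token usage instead of a constant
const usage = response.usage ?? { inputTokens: 0, outputTokens: 0 }; // response shape is assumed
const pricing = modelDetailsMap[modelId]?.cost;                      // pricing shape is assumed
const cost = pricing ? calculateCost(usage, pricing) : 0;            // calculateCost signature is assumed
costArr.push(cost);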

636-636: Uncomment or remove the commented function call.

There's a commented function call // endToEndFactual() at the end of the file. Either uncomment it if it's meant to be executed or remove it if it's not needed.

-// endToEndFactual()
+// Uncomment to run the evaluation
+// endToEndFactual()
+
+// Export the evaluation functions for external use
+export { 
+  basicJsonRag, 
+  basicRag, 
+  iterativeJsonRag, 
+  iterativeTimeExpansionJsonRag,
+  iterativeWithTimeFilterAndQueryRewrite,
+  pointEventQueryTimeExpansion,
+  endToEndFactual
+};
server/distBE/integrations/google/utils.js (2)

14-15: Consider removing unused import MAX_ATTACHMENT_PDF_SIZE.

MAX_ATTACHMENT_PDF_SIZE is imported but never used in this file. Removing unused imports can help keep the code clean.

-  MAX_ATTACHMENT_PDF_SIZE,
   MAX_GD_PDF_SIZE,

75-97: Re-examine the decision to return null instead of throwing or propagating errors.

While returning null ensures the sync engine continues, it can make debugging more difficult if callers disallow nulls or expect an error. Consider throwing a custom error or returning a defined error object so that upstream processes can differentiate between genuine "no file" scenarios and transient failures.
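
A hedged sketch of the “defined error object” option, using a small discriminated union so callers can tell a genuine “no file” apart from a transient failure (getFileSafe is a hypothetical wrapper; retryWithBackoff is the helper already present in utils.js, with its signature assumed):

import { drive_v3 } from "googleapis";
// retryWithBackoff is assumed to come from ../../utils.js

type GetFileResult =
  | { ok: true; file: drive_v3.Schema$File }
  | { ok: false; reason: "not_found" | "transient"; error?: unknown };

// Hedged sketch — tagged result instead of a bare null
const getFileSafe = async (drive: drive_v3.Drive, fileId: string): Promise<GetFileResult> => {
  try {
    const res = await retryWithBackoff(() => drive.files.get({ fileId }));
    return { ok: true, file: res.data };
  } catch (error) {
    const status = (error as { code?: number })?.code;
    return status === 404
      ? { ok: false, reason: "not_found", error }
      : { ok: false, reason: "transient", error };
  }
};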

server/distBE/api/chat.js (4)

103-124: Refine error messages for debugging clarity.

GetChatApi returns a generic "Could not fetch chat and messages" message in case of failure. Providing additional context or debug logs (e.g., chatId) can help with troubleshooting.


192-241: Ensure robust error handling for unexpected entity types.

searchToCitation throws if it encounters an unknown sddocname. If the system can encounter new or unrecognized doc types, consider logging or default handling instead of throwing.


693-775: Maintain consistent error handling in MessageApi.

The function gracefully handles several error scenarios, but a thrown synchronous HTTPException might bypass the SSE streaming flow. Ensure that all potential errors are handled uniformly (e.g., streaming an SSE "Error" event).


952-1001: Consider logging newly inserted messages for auditing.

When an answer is successfully created and stored, logging the msg.externalId or a truncated portion of the message can help trace user interactions.

server/distBE/shared/types.js (2)

60-66: Extend OpenAIError handling for other scenarios.

Currently, only RateLimitError and InvalidAPIKey are defined. If the system can handle other error types from OpenAI (e.g., network issues), consider expanding the enum or implementing a catch-all.


203-213: Confirm usage of ChatSSEvents.Error vs. other error channels.

Multiple modules define error or SSE events. Consistency in naming and usage across ChatSSEvents.Error or newly added events for different error subcategories helps maintain clarity.

server/distBE/db/schema.js (1)

110-110: Unresolved TODO
A comment indicates additional fields (such as token expiration fields) are planned. Ensure these are tracked and implemented if needed for OAuth connectors.

Would you like help adding these fields and preparing a migration strategy?

server/distBE/search/vespaClient.js (1)

87-105: Consolidate duplicate insert logic
The insertDocument and insert methods perform essentially the same operation. Consider merging or refactoring them into a single, configurable method to reduce code duplication and simplify maintenance.
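
A hedged sketch of a single insert path that both methods could delegate to (the function name, docId field, and client fields are illustrative; the URL follows Vespa's /document/v1 path format):

// Hedged sketch — one configurable insert used by both insertDocument() and insert()
const insertIntoVespa = async (endpoint, document, { namespace, schema }) => {
  const url = `${endpoint}/document/v1/${namespace}/${schema}/docid/${encodeURIComponent(document.docId)}`;
  const response = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ fields: document }),
  });
  if (!response.ok) {
    throw new Error(`Vespa insert failed: ${response.status} ${await response.text()}`);
  }
  return response.json();
};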

server/distBE/search/vespa.js (2)

23-23: Hardcoded NAMESPACE
The placeholder const NAMESPACE = "namespace"; // Replace with your actual namespace may cause confusion in production if it is forgotten. Consider loading the namespace from config or environment variables to avoid accidentally shipping the placeholder.
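
For example, the value could be sourced from the environment with a loud warning when only the placeholder is available (the variable name is illustrative; Logger stands for the module's existing logger):

// Hedged sketch — read the namespace from the environment instead of hardcoding it
const NAMESPACE = process.env.VESPA_NAMESPACE ?? "namespace";
if (NAMESPACE === "namespace") {
  Logger.warn("VESPA_NAMESPACE is not set; falling back to the placeholder namespace");
}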


540-540: Typographical error in function name
vespa.getUsersByNamesAndEmaisl has a misspelling ("Emaisl" vs. "Emails"). Consider correcting for consistency.

server/distBE/ai/provider/index.js (4)

37-37: Typographical error in variable name
It looks like "togetherProvidder" has a double "d" in its name. This might lead to confusion or inadvertent errors.

- let togetherProvidder = null;
+ let togetherProvider = null;

40-88: Consider concurrency safety and repeated initialization
The initializeProviders() function sets a flag (providersInitialized) to prevent multiple initializations, but if it is called concurrently before the flag is set, multiple providers could still be created. Consider guarding the initialization so that only one call runs it even when callers overlap.
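
A common way to make this safe is to memoize a single initialization promise so concurrent callers share one in-flight call (a sketch of the pattern, not the existing implementation):

// Hedged sketch — all callers await the same in-flight initialization
let initPromise = null;

const initializeProviders = () => {
  if (!initPromise) {
    initPromise = (async () => {
      // ...construct the bedrock/openai/together/... providers here, as today
    })().catch((err) => {
      initPromise = null; // allow another attempt after a failed initialization
      throw err;
    });
  }
  return initPromise;
};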


146-169: Lack of fallback or retry mechanism
The askQuestion function attempts to stream a conversation and throws on any error without retries or fallback. In certain error scenarios (e.g., transient network failures), a simple retry could improve resilience. Consider adding a retry strategy or fallback for robust error handling.
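
A hedged sketch of a thin wrapper that reuses the retryWithBackoff helper already present in utils.js (its exact signature, and whether it accepts options, are assumptions):

// Hedged sketch — retry transient failures around the call
export const askQuestionWithRetry = (question, options) =>
  retryWithBackoff(() => askQuestion(question, options));

Note that once tokens have started streaming to the client, a retry would duplicate output, so any retry should be limited to failures that occur before streaming begins.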


597-626: Check for consistent naming or usage
In the baselineRAGJson function, the response property is called "output" whereas other similarly structured functions (e.g., askQuestionWithCitations) name it differently. For consistency and maintainability, consider standardizing the JSON response structure across these streaming or analysis functions.

server/distBE/ai/prompts.js (1)

180-194: Ambiguity with rewriting personal queries
The instructions in rewriteQuerySystemPrompt mention replacing first-person pronouns with the user’s name or email. Verify that it is desired behavior to potentially expose an internal user name or email in logs or queries for all contexts. This might inadvertently reveal personal information.

server/distBE/errors/index.js (1)

2-5: Inconsistent naming for enum entries
DbOp["Create"] uses PascalCase for the enum string, while DbOp["READ"] uses uppercase. This inconsistency could create confusion or bugs. Consider standardizing the enum value naming (e.g., Create, Read, Update, Delete).

- DbOp["Create"] = "Create";
- DbOp["READ"] = "Read";
+ DbOp["Create"] = "Create";
+ DbOp["Read"] = "Read";
server/distBE/integrations/google/index.js (6)

112-140: Handle errors for partial retrievals more seamlessly.

Within listUsersByEmails(), an error during user retrieval logs a warning and continues iterating, which is great for graceful handling. However, consider an optional aggregated error report for skipped users. This helps in post-retrieval analysis of any recurring issues (e.g. misconfiguration, invalid emails).

 for (const email of emails) {
   try {
     const res = await retryWithBackoff(() =>
       admin.users.get({ userKey: email }),
-    )
+    );
     users.push(res.data);
   } catch (error) {
     Logger.warn(`User ${email} not found: ${error}`);
+    // Optional: keep track of failed lookups
+    // skippedEmails.push({ email, errorMsg: error.toString() });
   }
 }

173-257: Validate concurrency or error handling for multiple sync jobs.

The syncGoogleWorkspace() method assumes there's exactly one sync job in the database. If your usage grows, or if a second job is triggered accidentally, it throws an error and fails the job. Ensure that the logic for concurrency and job scheduling is well-documented or well-tested to manage potential collisions or overlapping sync tasks as adoption scales.


522-658: Ensure OAuth token refresh logic is robust.

handleGoogleOAuthIngestion() sets the credentials once, but if the job runs longer than the token’s lifetime, the token may expire. Consider proactively refreshing or verifying the token’s validity mid-flow to avoid partial ingestions and retries.


714-898: Improve partitioning logic for large user sets.

handleGoogleServiceAccountIngestion() concurrently processes each user in an entire workspace. If a workspace has thousands of users, it might still be heavy for upstream resources. Consider partitioning the users or using dynamic concurrency limits based on resource usage to optimize throughput and avoid potential throttling by Google APIs.
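
A hedged sketch of partitioned processing with p-limit, which this module already uses elsewhere (the chunk size, concurrency constant, and ingestUser helper are illustrative):

import pLimit from "p-limit";

// Hedged sketch — process users in fixed-size partitions with bounded concurrency
const CHUNK_SIZE = 200;
const limit = pLimit(ServiceAccountUserConcurrency); // constant from integrations/google/config.js

for (let i = 0; i < users.length; i += CHUNK_SIZE) {
  const chunk = users.slice(i, i + CHUNK_SIZE);
  // ingestUser is hypothetical — it stands for the per-user ingestion already done here
  await Promise.all(chunk.map((user) => limit(() => ingestUser(user))));
}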


1453-1528: Investigate possible memory overhead on PDF ingestion.

googlePDFsVespa() downloads and processes PDFs concurrently with p-limit. Large or numerous PDFs may still stress memory if processed in parallel. Consider logging memory usage or restricting concurrency further if ephemeral memory spikes are observed in production.


1749-1780: Refine or parameterize file query filters.

listFiles() uses a simple q: "trashed = false" to list non-trashed files. Consider parameterizing the search query to exclude large or irrelevant files (e.g., AI Studio folders, archived items, app profiles) if known up front. This can save bandwidth and processing time.
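
One hedged option is to compose the q string from optional filters (the extra clauses are examples of Drive query syntax, not requirements of this codebase):

// Hedged sketch — build the Drive query from optional filters
const buildDriveQuery = ({ excludeFolders = false, modifiedAfter } = {}) => {
  const clauses = ["trashed = false"];
  if (excludeFolders) {
    clauses.push("mimeType != 'application/vnd.google-apps.folder'");
  }
  if (modifiedAfter) {
    clauses.push(`modifiedTime > '${modifiedAfter}'`); // RFC 3339 timestamp
  }
  return clauses.join(" and ");
};

// e.g. drive.files.list({ q: buildDriveQuery({ excludeFolders: true }), ... })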

server/distBE/search/types.js (3)

208-213: Correct the typo in “EventAtatchment.”

The schema is named EventAtatchment, a misspelling of “attachment.” Consider renaming it for consistency:

-const EventAtatchment = z.object({
+const EventAttachment = z.object({
   fileId: z.string(),
   title: z.string(),
   fileUrl: z.string(),
   mimeType: z.string(),
});

2-5: Clarify placeholder schema usage.

The exports fileSchema, userSchema, etc. are string placeholders. If these strings do not map to actual validation schemas, consider inlining Zod definitions or adding clarifying comments to prevent confusion in future maintenance.


38-38: Evaluate the completeness of enumerations.

The set of Apps is still partial (e.g., GoogleWorkspace, Notion, GoogleCalendar, etc.). If your application supports additional integrations or sub-apps, ensure Apps remains updated or adopt a more dynamic architecture if product expansions are anticipated.

server/distBE/integrations/google/ws.js (1)

1-2: Exporting WebSocket Connections Map Correctly

The new constant wsConnections is clearly exported and correctly initialized as a new Map. This design choice is well-suited for managing WebSocket connections in a key-value structure. Consider adding a brief JSDoc comment to explicitly document what keys and values are expected, which will help future contributors and maintainers understand its intended usage.
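
For instance, a JSDoc annotation along these lines would make the intent explicit (the key and value descriptions are assumptions about how the map is used, and should be adjusted to match the actual call sites):

/**
 * Active WebSocket connections for Google integration progress updates.
 * Key: an identifier for the connected client/connector (assumed).
 * Value: the WebSocket context registered for that key (assumed).
 * @type {Map<string, unknown>}
 */
export const wsConnections = new Map();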

server/distBE/shared/config.js (1)

1-7: Environment Configuration Setup is Appropriate

This file properly loads environment variables via dotenv.config() and exports a default configuration object containing serverUrl. The use of a fallback value to "http://localhost:3000" ensures the application has a default behavior when the environment variable is missing. For improved clarity and future maintainability, you might consider adding inline documentation or JSDoc annotations describing the configuration parameters—especially if more configurations are expected to be added later.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8e2472d and 4d461a8.

⛔ Files ignored due to path filters (2)
  • package-lock.json is excluded by !**/package-lock.json
  • server/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (75)
  • package.json (1 hunks)
  • server/distBE/ai/contants.js (1 hunks)
  • server/distBE/ai/context.js (1 hunks)
  • server/distBE/ai/eval.js (1 hunks)
  • server/distBE/ai/mappers.js (1 hunks)
  • server/distBE/ai/prompts.js (1 hunks)
  • server/distBE/ai/provider/base.js (1 hunks)
  • server/distBE/ai/provider/bedrock.js (1 hunks)
  • server/distBE/ai/provider/fireworks.js (1 hunks)
  • server/distBE/ai/provider/fireworksClient.js (1 hunks)
  • server/distBE/ai/provider/gemini.js (1 hunks)
  • server/distBE/ai/provider/index.js (1 hunks)
  • server/distBE/ai/provider/ollama.js (1 hunks)
  • server/distBE/ai/provider/openai.js (1 hunks)
  • server/distBE/ai/provider/together.js (1 hunks)
  • server/distBE/ai/types.js (1 hunks)
  • server/distBE/api/admin.js (1 hunks)
  • server/distBE/api/auth.js (1 hunks)
  • server/distBE/api/chat.js (1 hunks)
  • server/distBE/api/oauth.js (1 hunks)
  • server/distBE/api/proxy.js (1 hunks)
  • server/distBE/api/search.js (1 hunks)
  • server/distBE/chunks.js (1 hunks)
  • server/distBE/cleanup.js (1 hunks)
  • server/distBE/config.js (1 hunks)
  • server/distBE/db/chat.js (1 hunks)
  • server/distBE/db/client.js (1 hunks)
  • server/distBE/db/connector.js (1 hunks)
  • server/distBE/db/customType.js (1 hunks)
  • server/distBE/db/message.js (1 hunks)
  • server/distBE/db/oauthProvider.js (1 hunks)
  • server/distBE/db/schema.js (1 hunks)
  • server/distBE/db/seed.js (1 hunks)
  • server/distBE/db/syncHistory.js (1 hunks)
  • server/distBE/db/syncJob.js (1 hunks)
  • server/distBE/db/user.js (1 hunks)
  • server/distBE/db/workspace.js (1 hunks)
  • server/distBE/doc.js (1 hunks)
  • server/distBE/drizzle.config.js (1 hunks)
  • server/distBE/errors/index.js (1 hunks)
  • server/distBE/eval/evaluate.js (1 hunks)
  • server/distBE/eval/processDataForVespa.js (1 hunks)
  • server/distBE/eval/qrelsToTrecFormat.js (1 hunks)
  • server/distBE/integrations/google/config.js (1 hunks)
  • server/distBE/integrations/google/gmail-worker.js (1 hunks)
  • server/distBE/integrations/google/gmail/index.js (1 hunks)
  • server/distBE/integrations/google/gmail/quote-parser.js (1 hunks)
  • server/distBE/integrations/google/index.js (1 hunks)
  • server/distBE/integrations/google/pdf-utils.js (1 hunks)
  • server/distBE/integrations/google/sync.js (1 hunks)
  • server/distBE/integrations/google/tracking.js (1 hunks)
  • server/distBE/integrations/google/utils.js (1 hunks)
  • server/distBE/integrations/google/worker-utils.js (1 hunks)
  • server/distBE/integrations/google/ws.js (1 hunks)
  • server/distBE/integrations/types/gmail.js (1 hunks)
  • server/distBE/logger/index.js (1 hunks)
  • server/distBE/mime.js (1 hunks)
  • server/distBE/pipeline.js (1 hunks)
  • server/distBE/queue/index.js (1 hunks)
  • server/distBE/search/mappers.js (1 hunks)
  • server/distBE/search/types.js (1 hunks)
  • server/distBE/search/vespa.js (1 hunks)
  • server/distBE/search/vespaClient.js (1 hunks)
  • server/distBE/server.js (1 hunks)
  • server/distBE/shared/config.js (1 hunks)
  • server/distBE/shared/types.js (1 hunks)
  • server/distBE/types.js (1 hunks)
  • server/distBE/utils.js (1 hunks)
  • server/distBE/utils/encryption.js (1 hunks)
  • server/distBE/utils/index.js (1 hunks)
  • server/distBE/utils/retry-async.js (1 hunks)
  • server/distBE/vespa/replaceDIMS.js (1 hunks)
  • server/package.json (3 hunks)
  • server/tests/fetchRetry.test.ts (1 hunks)
  • server/tsconfig.json (1 hunks)
🧰 Additional context used
🧬 Code Definitions (46)
server/distBE/search/mappers.js (2)
server/distBE/search/types.js (14)
  • fileSchema (2-2)
  • fileSchema (2-2)
  • userSchema (3-3)
  • userSchema (3-3)
  • mailSchema (4-4)
  • mailSchema (4-4)
  • eventSchema (5-5)
  • eventSchema (5-5)
  • mailAttachmentSchema (7-7)
  • mailAttachmentSchema (7-7)
  • MailAttachmentResponseSchema (489-504)
  • MailAttachmentResponseSchema (489-504)
  • userQuerySchema (6-6)
  • userQuerySchema (6-6)
server/distBE/shared/types.js (18)
  • FileResponseSchema (134-153)
  • FileResponseSchema (134-153)
  • UserResponseSchema (169-180)
  • UserResponseSchema (169-180)
  • EventResponseSchema (154-168)
  • EventResponseSchema (154-168)
  • AutocompleteFileSchema (61-69)
  • AutocompleteFileSchema (61-69)
  • AutocompleteUserSchema (70-81)
  • AutocompleteUserSchema (70-81)
  • AutocompleteMailSchema (90-101)
  • AutocompleteMailSchema (90-101)
  • AutocompleteEventSchema (112-121)
  • AutocompleteEventSchema (112-121)
  • AutocompleteUserQueryHSchema (82-89)
  • AutocompleteUserQueryHSchema (82-89)
  • AutocompleteMailAttachmentSchema (102-111)
  • AutocompleteMailAttachmentSchema (102-111)
server/distBE/db/client.js (2)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/queue/index.js (1)
  • url (19-19)
server/distBE/ai/provider/fireworks.js (5)
server/distBE/ai/provider/bedrock.js (1)
  • Logger (8-8)
server/distBE/ai/provider/gemini.js (1)
  • Logger (6-6)
server/distBE/ai/provider/ollama.js (1)
  • Logger (6-6)
server/distBE/ai/provider/together.js (1)
  • Logger (6-6)
server/distBE/ai/provider/fireworksClient.js (2)
  • messages (135-135)
  • response (142-142)
server/distBE/api/admin.js (7)
server/distBE/db/connector.js (3)
  • Logger (20-20)
  • insertConnector (21-55)
  • insertConnector (21-55)
server/distBE/db/oauthProvider.js (5)
  • Logger (7-7)
  • getOAuthProvider (24-36)
  • getOAuthProvider (24-36)
  • createOAuthProvider (8-23)
  • createOAuthProvider (8-23)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/db/user.js (2)
  • getUserByEmail (61-67)
  • getUserByEmail (61-67)
server/distBE/utils.js (3)
  • setCookieByEnv (43-57)
  • setCookieByEnv (43-57)
  • data (16-16)
server/distBE/types.js (2)
  • createOAuthProvider (52-57)
  • createOAuthProvider (52-57)
server/distBE/config.js (1)
  • serviceAccountWhitelistedEmails (91-91)
server/distBE/integrations/google/worker-utils.js (3)
server/distBE/integrations/google/pdf-utils.js (4)
  • error (57-57)
  • safeLoadPDF (51-68)
  • deleteDocument (10-27)
  • deleteDocument (10-27)
server/distBE/integrations/google/config.js (2)
  • MAX_ATTACHMENT_PDF_SIZE (21-21)
  • MAX_ATTACHMENT_PDF_SIZE (21-21)
server/distBE/utils.js (2)
  • retryWithBackoff (92-132)
  • retryWithBackoff (92-132)
server/distBE/integrations/google/gmail/index.js (4)
server/distBE/integrations/google/config.js (2)
  • GmailConcurrency (19-19)
  • GmailConcurrency (19-19)
server/distBE/search/types.js (4)
  • mailSchema (4-4)
  • mailSchema (4-4)
  • mailAttachmentSchema (7-7)
  • mailAttachmentSchema (7-7)
server/distBE/integrations/google/worker-utils.js (7)
  • attachments (72-72)
  • filenames (73-73)
  • parseAttachments (71-92)
  • parseAttachments (71-92)
  • attachmentChunks (36-36)
  • getGmailAttachmentChunks (34-69)
  • getGmailAttachmentChunks (34-69)
server/distBE/integrations/google/gmail/quote-parser.js (2)
  • parseEmailBody (67-155)
  • parseEmailBody (67-155)
server/distBE/api/oauth.js (6)
server/distBE/db/connector.js (4)
  • Logger (20-20)
  • google (138-138)
  • updateConnector (173-185)
  • updateConnector (173-185)
server/distBE/db/oauthProvider.js (3)
  • Logger (7-7)
  • getOAuthProvider (24-36)
  • getOAuthProvider (24-36)
server/distBE/integrations/google/index.js (8)
  • email (1658-1658)
  • app (1667-1667)
  • connector (189-189)
  • connector (528-531)
  • connector (719-719)
  • handleGoogleOAuthIngestion (522-658)
  • handleGoogleOAuthIngestion (522-658)
  • error (1441-1441)
server/distBE/db/user.js (2)
  • getUserByEmail (61-67)
  • getUserByEmail (61-67)
server/distBE/errors/index.js (2)
  • NoUserFound (247-252)
  • OAuthCallbackError (240-245)
server/distBE/queue/index.js (2)
  • boss (20-23)
  • boss (20-23)
server/distBE/logger/index.js (3)
server/distBE/eval/evaluate.js (1)
  • start (30-30)
server/distBE/utils/retry-async.js (1)
  • start (5-5)
server/distBE/api/chat.js (4)
  • c (128-128)
  • c (143-143)
  • c (161-161)
  • c (164-164)
server/distBE/api/auth.js (5)
server/distBE/api/admin.js (8)
  • config (10-10)
  • c (19-19)
  • c (66-66)
  • c (68-68)
  • c (80-80)
  • c (117-117)
  • email (81-81)
  • email (118-118)
server/distBE/api/oauth.js (4)
  • config (16-16)
  • c (20-20)
  • c (22-22)
  • email (21-21)
server/distBE/api/search.js (9)
  • config (44-45)
  • c (87-87)
  • c (108-108)
  • c (164-164)
  • c (167-167)
  • email (88-88)
  • email (109-109)
  • email (165-165)
  • userAndWorkspace (169-172)
server/distBE/api/chat.js (13)
  • config (58-58)
  • c (128-128)
  • c (143-143)
  • c (161-161)
  • c (164-164)
  • c (698-698)
  • email (162-162)
  • email (699-699)
  • email (1102-1102)
  • userAndWorkspace (709-709)
  • userAndWorkspace (710-710)
  • userAndWorkspace (1122-1122)
  • userAndWorkspace (1123-1123)
server/distBE/db/user.js (6)
  • userAndWorkspace (14-23)
  • userAndWorkspace (27-27)
  • userAndWorkspace (33-42)
  • userAndWorkspace (46-46)
  • getPublicUserAndWorkspaceByEmail (13-31)
  • getPublicUserAndWorkspaceByEmail (13-31)
server/distBE/ai/provider/bedrock.js (13)
server/distBE/ai/provider/fireworks.js (1)
  • Logger (6-6)
server/distBE/ai/provider/gemini.js (1)
  • Logger (6-6)
server/distBE/ai/provider/ollama.js (1)
  • Logger (6-6)
server/distBE/ai/provider/together.js (1)
  • Logger (6-6)
server/distBE/api/admin.js (2)
  • Logger (17-17)
  • config (10-10)
server/distBE/api/oauth.js (2)
  • Logger (17-17)
  • config (16-16)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/ai/provider/base.js (1)
  • config (4-4)
server/distBE/api/chat.js (2)
  • config (58-58)
  • cost (723-723)
server/distBE/api/proxy.js (1)
  • response (3-3)
server/distBE/utils/index.js (2)
  • calculateCost (10-14)
  • calculateCost (10-14)
server/distBE/ai/mappers.js (2)
  • modelDetailsMap (2-234)
  • modelDetailsMap (2-234)
server/distBE/config.js (2)
  • EndThinkingToken (83-83)
  • StartThinkingToken (82-82)
server/distBE/server.js (7)
server/distBE/api/admin.js (18)
  • clientId (89-89)
  • clientSecret (90-90)
  • config (10-10)
  • Logger (17-17)
  • app (92-92)
  • app (128-128)
  • c (19-19)
  • c (66-66)
  • c (68-68)
  • c (80-80)
  • c (117-117)
  • CreateOAuthProvider (79-114)
  • CreateOAuthProvider (79-114)
  • email (81-81)
  • email (118-118)
  • user (25-25)
  • user (86-86)
  • user (123-123)
server/distBE/api/oauth.js (9)
  • config (16-16)
  • Logger (17-17)
  • c (20-20)
  • c (22-22)
  • OAuthCallback (18-81)
  • OAuthCallback (18-81)
  • email (21-21)
  • connectorId (51-51)
  • JSON (26-26)
server/distBE/api/search.js (11)
  • config (44-45)
  • Logger (43-43)
  • c (87-87)
  • c (108-108)
  • c (164-164)
  • c (167-167)
  • SearchApi (107-162)
  • SearchApi (107-162)
  • email (88-88)
  • email (109-109)
  • email (165-165)
server/distBE/api/chat.js (15)
  • config (58-58)
  • c (128-128)
  • c (143-143)
  • c (161-161)
  • c (164-164)
  • GetChatApi (103-124)
  • GetChatApi (103-124)
  • email (162-162)
  • email (699-699)
  • email (1102-1102)
  • token (349-349)
  • token (443-443)
  • token (597-597)
  • message (285-285)
  • message (526-526)
server/distBE/config.js (1)
  • postOauthRedirect (9-9)
server/distBE/db/workspace.js (5)
  • name (54-54)
  • getWorkspaceByDomain (18-24)
  • getWorkspaceByDomain (18-24)
  • createWorkspace (51-65)
  • createWorkspace (51-65)
server/distBE/db/user.js (4)
  • getUserByEmail (61-67)
  • getUserByEmail (61-67)
  • createUser (68-88)
  • createUser (68-88)
server/distBE/eval/evaluate.js (4)
server/distBE/eval/qrelsToTrecFormat.js (2)
  • Logger (9-9)
  • processedResultsData (29-29)
server/distBE/eval/processDataForVespa.js (2)
  • Logger (20-20)
  • user (34-34)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/search/vespa.js (2)
  • searchVespa (320-351)
  • searchVespa (320-351)
server/distBE/eval/processDataForVespa.js (19)
server/distBE/eval/evaluate.js (5)
  • expectedArgsLen (3-3)
  • requiredArgs (4-4)
  • argMap (9-9)
  • Logger (29-29)
  • user (33-33)
server/distBE/eval/qrelsToTrecFormat.js (5)
  • expectedArgsLen (3-3)
  • requiredArgs (4-4)
  • argMap (10-10)
  • Logger (9-9)
  • filePath (5-5)
server/distBE/server.js (3)
  • Logger (72-72)
  • user (180-180)
  • user (222-230)
server/distBE/api/admin.js (6)
  • Logger (17-17)
  • path (60-60)
  • user (25-25)
  • user (86-86)
  • user (123-123)
  • data (126-126)
server/distBE/api/oauth.js (2)
  • Logger (17-17)
  • JSON (26-26)
server/distBE/db/client.js (1)
  • Logger (6-6)
server/distBE/db/connector.js (1)
  • Logger (20-20)
server/distBE/api/search.js (1)
  • Logger (43-43)
server/distBE/db/oauthProvider.js (1)
  • Logger (7-7)
server/distBE/db/seed.js (1)
  • Logger (7-7)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/integrations/google/sync.js (5)
  • user (366-366)
  • user (526-526)
  • user (1045-1045)
  • data (358-358)
  • data (1037-1037)
server/distBE/integrations/google/gmail/index.js (2)
  • data (260-260)
  • chunks (152-152)
server/distBE/integrations/google/index.js (6)
  • data (188-188)
  • chunks (931-931)
  • chunks (1227-1227)
  • chunks (1297-1297)
  • filePath (1408-1408)
  • error (1441-1441)
server/distBE/utils.js (1)
  • data (16-16)
server/distBE/chunks.js (4)
  • chunks (23-23)
  • chunks (97-97)
  • chunkDocument (95-105)
  • chunkDocument (95-105)
server/distBE/integrations/google/utils.js (3)
  • chunks (113-113)
  • chunks (163-163)
  • filePath (299-299)
server/distBE/integrations/google/pdf-utils.js (2)
  • filePath (29-29)
  • error (57-57)
server/distBE/api/chat.js (1)
  • item (265-265)
server/distBE/db/oauthProvider.js (2)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/db/schema.js (2)
  • oauthProviders (123-149)
  • oauthProviders (123-149)
server/distBE/ai/provider/ollama.js (6)
server/distBE/ai/provider/bedrock.js (1)
  • Logger (8-8)
server/distBE/ai/provider/fireworks.js (1)
  • Logger (6-6)
server/distBE/ai/provider/gemini.js (1)
  • Logger (6-6)
server/distBE/ai/provider/together.js (1)
  • Logger (6-6)
server/distBE/ai/provider/index.js (1)
  • Logger (26-26)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/eval/qrelsToTrecFormat.js (3)
server/distBE/eval/evaluate.js (10)
  • args (2-2)
  • expectedArgsLen (3-3)
  • requiredArgs (4-4)
  • Logger (29-29)
  • argMap (9-9)
  • readline (28-28)
  • processedResultsData (31-31)
  • fileStream (56-56)
  • rl (57-60)
  • columns (62-62)
server/distBE/eval/processDataForVespa.js (7)
  • args (2-2)
  • expectedArgsLen (3-3)
  • requiredArgs (4-4)
  • Logger (20-20)
  • argMap (9-9)
  • fileStream (53-53)
  • rl (54-57)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/ai/eval.js (3)
server/distBE/search/vespa.js (7)
  • data (460-460)
  • timestamp (477-477)
  • config (568-568)
  • response (450-455)
  • email (93-93)
  • searchVespa (320-351)
  • searchVespa (320-351)
server/distBE/ai/context.js (2)
  • userContext (284-295)
  • userContext (284-295)
server/distBE/db/user.js (6)
  • userAndWorkspace (14-23)
  • userAndWorkspace (27-27)
  • userAndWorkspace (33-42)
  • userAndWorkspace (46-46)
  • getUserAndWorkspaceByEmail (32-48)
  • getUserAndWorkspaceByEmail (32-48)
server/distBE/ai/context.js (1)
server/distBE/search/types.js (10)
  • fileSchema (2-2)
  • fileSchema (2-2)
  • userSchema (3-3)
  • userSchema (3-3)
  • mailSchema (4-4)
  • mailSchema (4-4)
  • mailAttachmentSchema (7-7)
  • mailAttachmentSchema (7-7)
  • eventSchema (5-5)
  • eventSchema (5-5)
server/distBE/ai/provider/together.js (6)
server/distBE/ai/provider/bedrock.js (1)
  • Logger (8-8)
server/distBE/ai/provider/fireworks.js (1)
  • Logger (6-6)
server/distBE/ai/provider/gemini.js (1)
  • Logger (6-6)
server/distBE/ai/provider/ollama.js (1)
  • Logger (6-6)
server/distBE/ai/provider/index.js (1)
  • Logger (26-26)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/chunks.js (3)
server/distBE/eval/processDataForVespa.js (1)
  • chunks (36-36)
server/distBE/integrations/google/gmail/index.js (4)
  • chunks (152-152)
  • body (151-151)
  • body (177-177)
  • body (233-233)
server/distBE/integrations/google/utils.js (2)
  • chunks (113-113)
  • chunks (163-163)
server/distBE/ai/provider/openai.js (2)
server/distBE/utils/index.js (2)
  • calculateCost (10-14)
  • calculateCost (10-14)
server/distBE/ai/mappers.js (2)
  • modelDetailsMap (2-234)
  • modelDetailsMap (2-234)
server/distBE/ai/provider/gemini.js (7)
server/distBE/ai/provider/bedrock.js (1)
  • Logger (8-8)
server/distBE/ai/provider/fireworks.js (1)
  • Logger (6-6)
server/distBE/ai/provider/ollama.js (1)
  • Logger (6-6)
server/distBE/ai/provider/together.js (1)
  • Logger (6-6)
server/distBE/ai/provider/index.js (1)
  • Logger (26-26)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/api/chat.js (1)
  • cost (723-723)
server/distBE/config.js (1)
server/distBE/server.js (1)
  • postOauthRedirect (69-69)
server/distBE/integrations/google/config.js (1)
server/distBE/api/admin.js (1)
  • scopes (91-91)
server/distBE/integrations/google/pdf-utils.js (5)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/integrations/google/index.js (15)
  • downloadDir (1398-1398)
  • downloadDir (1398-1398)
  • filePath (1408-1408)
  • file (1871-1871)
  • res (117-124)
  • res (146-156)
  • res (404-418)
  • res (1410-1420)
  • res (1753-1774)
  • res (1928-1940)
  • error (1441-1441)
  • message (678-678)
  • message (680-680)
  • message (687-687)
  • name (1657-1657)
server/distBE/integrations/google/utils.js (2)
  • filePath (299-299)
  • file (79-82)
server/distBE/errors/index.js (1)
  • DeleteDocumentError (277-286)
server/distBE/integrations/google/worker-utils.js (1)
  • fileName (44-44)
server/distBE/integrations/google/sync.js (6)
server/distBE/db/connector.js (8)
  • Logger (20-20)
  • google (138-138)
  • getOAuthConnectorWithCredentials (106-156)
  • getOAuthConnectorWithCredentials (106-156)
  • res (58-68)
  • res (74-78)
  • res (107-111)
  • res (158-162)
server/distBE/integrations/google/index.js (47)
  • sheets (1362-1362)
  • getSpreadsheet (1195-1219)
  • getSpreadsheet (1195-1219)
  • error (1441-1441)
  • email (1658-1658)
  • file (1871-1871)
  • data (188-188)
  • response (1156-1167)
  • response (1590-1603)
  • response (1613-1626)
  • syncJobs (175-179)
  • syncJob (187-187)
  • syncJob (233-233)
  • connector (189-189)
  • connector (528-531)
  • connector (719-719)
  • gmail (1879-1879)
  • getJoiningLink (311-328)
  • getJoiningLink (311-328)
  • getJoiningLink (433-433)
  • getAttendeesOfEvent (329-351)
  • getAttendeesOfEvent (329-351)
  • getAttendeesOfEvent (434-435)
  • getAttachments (352-368)
  • getAttachments (352-368)
  • getAttachments (436-438)
  • getEventStartTime (372-386)
  • getEventStartTime (372-386)
  • getEventStartTime (439-439)
  • getTextFromEventDescription (258-260)
  • getTextFromEventDescription (258-260)
  • attendeesInfo (333-333)
  • attendeesNames (334-334)
  • attachmentsInfo (356-356)
  • getUniqueEmails (369-371)
  • getUniqueEmails (369-371)
  • attendeesEmails (335-335)
  • userEmail (532-532)
  • res (117-124)
  • res (146-156)
  • res (404-418)
  • res (1410-1420)
  • res (1753-1774)
  • res (1928-1940)
  • message (678-678)
  • message (680-680)
  • message (687-687)
server/distBE/integrations/google/utils.js (8)
  • sheets (211-211)
  • permissions (260-260)
  • file (79-82)
  • getFile (75-99)
  • getFile (75-99)
  • getFileContent (100-143)
  • getFileContent (100-143)
  • entity (221-221)
server/distBE/integrations/google/gmail/index.js (6)
  • permissions (147-149)
  • data (260-260)
  • gmail (30-34)
  • parseMail (110-231)
  • parseMail (110-231)
  • labels (114-114)
server/distBE/db/syncJob.js (4)
  • getAppSyncJobs (21-27)
  • getAppSyncJobs (21-27)
  • updateSyncJob (28-43)
  • updateSyncJob (28-43)
server/distBE/db/user.js (2)
  • getUserById (89-99)
  • getUserById (89-99)
server/distBE/db/user.js (1)
server/distBE/db/schema.js (10)
  • users (51-75)
  • users (51-75)
  • workspaces (33-48)
  • workspaces (33-48)
  • userPublicSchema (309-316)
  • userPublicSchema (309-316)
  • workspacePublicSchema (317-322)
  • workspacePublicSchema (317-322)
  • selectUserSchema (307-307)
  • selectUserSchema (307-307)
server/distBE/utils.js (4)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/search/vespa.js (1)
  • data (460-460)
server/distBE/integrations/google/sync.js (3)
  • data (358-358)
  • data (1037-1037)
  • retryWithBackoff (1051-1051)
server/distBE/api/auth.js (1)
  • c (7-7)
server/distBE/integrations/google/tracking.js (8)
server/distBE/server.js (1)
  • email (181-181)
server/distBE/api/auth.js (1)
  • email (8-8)
server/distBE/api/admin.js (2)
  • email (81-81)
  • email (118-118)
server/distBE/ai/eval.js (4)
  • email (189-189)
  • email (266-266)
  • email (389-389)
  • email (541-541)
server/distBE/api/oauth.js (1)
  • email (21-21)
server/distBE/api/search.js (3)
  • email (88-88)
  • email (109-109)
  • email (165-165)
server/distBE/api/chat.js (3)
  • email (162-162)
  • email (699-699)
  • email (1102-1102)
server/distBE/integrations/google/index.js (1)
  • email (1658-1658)
server/distBE/db/chat.js (1)
server/distBE/db/schema.js (8)
  • chats (214-239)
  • chats (214-239)
  • selectChatSchema (326-326)
  • selectChatSchema (326-326)
  • selectMessageSchema (331-331)
  • selectMessageSchema (331-331)
  • selectPublicChatSchema (338-341)
  • selectPublicChatSchema (338-341)
server/distBE/integrations/google/utils.js (3)
server/distBE/integrations/google/index.js (13)
  • drive (1459-1459)
  • drive (1750-1750)
  • drive (1924-1924)
  • file (1871-1871)
  • error (1441-1441)
  • chunks (931-931)
  • chunks (1227-1227)
  • chunks (1297-1297)
  • downloadPDF (1407-1434)
  • downloadPDF (1407-1434)
  • downloadDir (1398-1398)
  • downloadDir (1398-1398)
  • filePath (1408-1408)
server/distBE/integrations/google/pdf-utils.js (7)
  • file (30-30)
  • error (57-57)
  • downloadPDF (28-49)
  • downloadPDF (28-49)
  • downloadDir (9-9)
  • downloadDir (9-9)
  • filePath (29-29)
server/distBE/integrations/google/config.js (2)
  • MAX_GD_PDF_SIZE (13-13)
  • MAX_GD_PDF_SIZE (13-13)
server/distBE/doc.js (2)
server/distBE/chunks.js (1)
  • cleanText (15-18)
server/distBE/integrations/google/utils.js (2)
  • footnotes (110-110)
  • headerFooter (111-111)
server/distBE/api/search.js (6)
server/distBE/ai/provider/index.js (2)
  • Logger (26-26)
  • config (5-5)
server/distBE/search/vespa.js (10)
  • config (568-568)
  • email (93-93)
  • autocomplete (112-166)
  • autocomplete (112-166)
  • getTimestamp (549-564)
  • getTimestamp (549-564)
  • groupVespaSearch (301-319)
  • groupVespaSearch (301-319)
  • searchVespa (320-351)
  • searchVespa (320-351)
server/distBE/search/mappers.js (2)
  • VespaAutocompleteResponseToResult (79-138)
  • VespaAutocompleteResponseToResult (79-138)
server/distBE/db/user.js (6)
  • userAndWorkspace (14-23)
  • userAndWorkspace (27-27)
  • userAndWorkspace (33-42)
  • userAndWorkspace (46-46)
  • getPublicUserAndWorkspaceByEmail (13-31)
  • getPublicUserAndWorkspaceByEmail (13-31)
server/distBE/ai/context.js (2)
  • userContext (284-295)
  • userContext (284-295)
server/distBE/search/types.js (2)
  • userSchema (3-3)
  • userSchema (3-3)
server/distBE/queue/index.js (3)
server/distBE/integrations/google/sync.js (15)
  • config (369-369)
  • config (498-498)
  • config (529-529)
  • config (573-573)
  • config (604-604)
  • config (648-648)
  • config (1049-1049)
  • handleGoogleOAuthChanges (356-670)
  • handleGoogleOAuthChanges (356-670)
  • errorMessage (496-496)
  • errorMessage (571-571)
  • errorMessage (646-646)
  • errorMessage (1192-1192)
  • errorMessage (1265-1265)
  • errorMessage (1338-1338)
server/distBE/integrations/google/index.js (5)
  • error (1441-1441)
  • errorMessage (224-224)
  • errorMessage (1103-1103)
  • syncGoogleWorkspace (173-257)
  • syncGoogleWorkspace (173-257)
server/distBE/integrations/google/utils.js (2)
  • checkDownloadsFolder (287-308)
  • checkDownloadsFolder (287-308)
server/distBE/api/chat.js (13)
server/distBE/ai/provider/bedrock.js (2)
  • config (10-10)
  • Logger (8-8)
server/distBE/ai/provider/index.js (17)
  • config (5-5)
  • Logger (26-26)
  • messages (299-308)
  • messages (392-394)
  • messages (414-416)
  • messages (432-434)
  • messages (445-455)
  • messages (483-485)
  • messages (535-537)
  • messages (563-565)
  • messages (583-585)
  • messages (612-614)
  • messages (658-660)
  • messages (678-680)
  • messages (697-699)
  • messages (726-728)
  • messages (762-764)
server/distBE/ai/provider/fireworks.js (1)
  • Logger (6-6)
server/distBE/ai/provider/gemini.js (1)
  • Logger (6-6)
server/distBE/ai/provider/ollama.js (1)
  • Logger (6-6)
server/distBE/ai/provider/together.js (1)
  • Logger (6-6)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/ai/provider/fireworksClient.js (1)
  • messages (135-135)
server/distBE/db/chat.js (10)
  • getChatByExternalId (36-45)
  • getChatByExternalId (36-45)
  • updateChatByExternalId (46-57)
  • updateChatByExternalId (46-57)
  • deleteMessagesByChatId (68-77)
  • deleteMessagesByChatId (68-77)
  • deleteChatByExternalId (58-67)
  • deleteChatByExternalId (58-67)
  • getPublicChats (90-99)
  • getPublicChats (90-99)
server/distBE/db/message.js (8)
  • getChatMessages (24-31)
  • getChatMessages (24-31)
  • getMessageByExternalId (40-49)
  • getMessageByExternalId (40-49)
  • getChatMessagesBefore (32-39)
  • getChatMessagesBefore (32-39)
  • updateMessage (50-55)
  • updateMessage (50-55)
server/distBE/db/schema.js (4)
  • selectPublicChatSchema (338-341)
  • selectPublicChatSchema (338-341)
  • selectPublicMessagesSchema (337-337)
  • selectPublicMessagesSchema (337-337)
server/distBE/utils.js (2)
  • getErrorMessage (60-64)
  • getErrorMessage (60-64)
server/distBE/ai/context.js (2)
  • answerContextMap (211-230)
  • answerContextMap (211-230)
server/distBE/db/connector.js (3)
server/distBE/db/oauthProvider.js (5)
  • externalId (9-9)
  • res (25-29)
  • res (38-42)
  • getOAuthProviderByConnectorId (37-49)
  • getOAuthProviderByConnectorId (37-49)
server/distBE/db/schema.js (4)
  • connectors (87-120)
  • connectors (87-120)
  • selectConnectorSchema (284-287)
  • selectConnectorSchema (284-287)
server/distBE/errors/index.js (6)
  • ConnectionInsertionError (233-238)
  • NoConnectorsFound (226-231)
  • NoOauthConnectorFound (219-224)
  • MissingOauthConnectorCredentialsError (194-203)
  • FetchProviderFailed (212-217)
  • UpdateConnectorFailed (205-210)
server/distBE/types.js (4)
server/distBE/api/auth.js (1)
  • config (5-5)
server/distBE/api/oauth.js (1)
  • config (16-16)
server/distBE/api/search.js (1)
  • config (44-45)
server/distBE/db/oauthProvider.js (2)
  • createOAuthProvider (8-23)
  • createOAuthProvider (8-23)
server/distBE/shared/types.js (1)
server/distBE/search/types.js (24)
  • fileSchema (2-2)
  • fileSchema (2-2)
  • entitySchema (84-91)
  • entitySchema (84-91)
  • userSchema (3-3)
  • userSchema (3-3)
  • userQuerySchema (6-6)
  • userQuerySchema (6-6)
  • mailSchema (4-4)
  • mailSchema (4-4)
  • mailAttachmentSchema (7-7)
  • mailAttachmentSchema (7-7)
  • eventSchema (5-5)
  • eventSchema (5-5)
  • VespaFileSchema (103-118)
  • VespaFileSchema (103-118)
  • VespaEventSchema (214-239)
  • VespaEventSchema (214-239)
  • VespaUserSchema (129-162)
  • VespaUserSchema (129-162)
  • MailResponseSchema (473-488)
  • MailResponseSchema (473-488)
  • MailAttachmentResponseSchema (489-504)
  • MailAttachmentResponseSchema (489-504)
server/distBE/db/syncJob.js (1)
server/distBE/db/schema.js (6)
  • insertSyncJob (288-294)
  • insertSyncJob (288-294)
  • syncJobs (152-181)
  • syncJobs (152-181)
  • selectSyncJobSchema (295-297)
  • selectSyncJobSchema (295-297)
server/distBE/db/schema.js (3)
server/distBE/utils/encryption.js (1)
  • Encryption (2-41)
server/distBE/db/customType.js (2)
  • encryptedText (7-35)
  • encryptedText (7-35)
server/distBE/types.js (2)
  • SyncConfigSchema (123-123)
  • SyncConfigSchema (123-123)
server/distBE/search/vespa.js (1)
server/distBE/search/types.js (12)
  • fileSchema (2-2)
  • fileSchema (2-2)
  • userSchema (3-3)
  • userSchema (3-3)
  • mailSchema (4-4)
  • mailSchema (4-4)
  • eventSchema (5-5)
  • eventSchema (5-5)
  • mailAttachmentSchema (7-7)
  • mailAttachmentSchema (7-7)
  • userQuerySchema (6-6)
  • userQuerySchema (6-6)
server/distBE/search/vespaClient.js (3)
server/distBE/logger/index.js (2)
  • getLogger (18-42)
  • getLogger (18-42)
server/distBE/utils.js (2)
  • getErrorMessage (60-64)
  • getErrorMessage (60-64)
server/distBE/search/mappers.js (1)
  • handleVespaGroupResponse (139-167)
server/distBE/ai/provider/index.js (4)
server/distBE/config.js (14)
  • AwsAccessKey (24-24)
  • AwsSecretKey (25-25)
  • OpenAIKey (26-26)
  • OllamaModel (27-27)
  • TogetherAIModel (28-28)
  • TogetherApiKey (31-31)
  • FireworksAIModel (29-29)
  • FireworksApiKey (32-32)
  • GeminiAIModel (30-30)
  • GeminiApiKey (33-33)
  • defaultBestModel (23-23)
  • defaultFastModel (22-22)
  • isReasoning (34-34)
  • EndThinkingToken (83-83)
server/distBE/ai/mappers.js (2)
  • ModelToProviderMap (235-255)
  • ModelToProviderMap (235-255)
server/distBE/ai/types.js (6)
  • QueryContextRank (98-101)
  • QueryContextRank (98-101)
  • QueryAnalysisSchema (50-54)
  • QueryAnalysisSchema (50-54)
  • QueryRouterResponseSchema (81-97)
  • QueryRouterResponseSchema (81-97)
server/distBE/utils.js (2)
  • getErrorMessage (60-64)
  • getErrorMessage (60-64)
server/distBE/ai/prompts.js (2)
server/distBE/utils/index.js (1)
  • getDateForAI (1-9)
server/distBE/ai/context.js (2)
  • userContext (284-295)
  • userContext (284-295)
server/distBE/integrations/google/index.js (6)
server/distBE/db/connector.js (8)
  • Logger (20-20)
  • res (58-68)
  • res (74-78)
  • res (107-111)
  • res (158-162)
  • getConnector (73-92)
  • getConnector (73-92)
  • google (138-138)
server/distBE/integrations/google/gmail/index.js (10)
  • nextPageToken (36-36)
  • data (260-260)
  • subject (133-133)
  • historyId (39-39)
  • handleGmailIngestion (27-84)
  • handleGmailIngestion (27-84)
  • limit (37-37)
  • batchSize (28-28)
  • gmail (30-34)
  • profile (38-38)
server/distBE/db/syncJob.js (6)
  • getAppSyncJobs (21-27)
  • getAppSyncJobs (21-27)
  • updateSyncJob (28-43)
  • updateSyncJob (28-43)
  • insertSyncJob (5-20)
  • insertSyncJob (5-20)
server/distBE/integrations/google/utils.js (4)
  • createJwtClient (279-286)
  • createJwtClient (279-286)
  • entity (221-221)
  • file (79-82)
server/distBE/integrations/google/ws.js (2)
  • wsConnections (1-1)
  • wsConnections (1-1)
server/distBE/integrations/google/config.js (14)
  • ServiceAccountUserConcurrency (17-17)
  • ServiceAccountUserConcurrency (17-17)
  • MAX_GD_SLIDES_TEXT_LEN (16-16)
  • MAX_GD_SLIDES_TEXT_LEN (16-16)
  • MAX_GD_SHEET_TEXT_LEN (15-15)
  • MAX_GD_SHEET_TEXT_LEN (15-15)
  • MAX_GD_SHEET_ROWS (14-14)
  • MAX_GD_SHEET_ROWS (14-14)
  • PDFProcessingConcurrency (20-20)
  • PDFProcessingConcurrency (20-20)
  • MAX_GD_PDF_SIZE (13-13)
  • MAX_GD_PDF_SIZE (13-13)
  • GoogleDocsConcurrency (18-18)
  • GoogleDocsConcurrency (18-18)
server/distBE/search/types.js (1)
server/distBE/integrations/google/utils.js (1)
  • entity (221-221)
🔇 Additional comments (85)
server/distBE/cleanup.js (2)

1-33: Check list of elements for thoroughness
These elements are comprehensively listed to strip potentially harmful or undesired DOM nodes. Confirm that this covers all relevant embedded content types needed for your use case (e.g., custom elements or newer HTML5 elements).


45-56:

❓ Verification inconclusive

Potential performance overhead
Looping over every element in the document and removing them on-the-fly could be expensive for large or deeply nested DOMs. Verify that this cleanup is intentionally scoped and tested for performance.


🏁 Script executed:

#!/bin/bash
# Checking usage references of cleanup function
rg -A 5 "cleanup("

Length of output: 87


Action: Verify Cleanup Function Usage and Performance Impact

It appears that the original reference search for the cleanup function was affected by a regex error (an unescaped ( in the pattern). Please re-run the above script (or verify manually) to confirm whether there are any usage references for the cleanup function. Additionally, since the function iterates over the entire DOM to remove elements and attributes on-the-fly, double-check that this cleanup is intentionally scoped and that performance has been appropriately tested for large or deeply nested DOMs.


🏁 Script executed:

#!/bin/bash
# Re-check cleanup function usage with corrected regex pattern (escaped parenthesis)
rg -A 5 "cleanup\(" .

Length of output: 570


Review Comment Revision: Performance Overhead in Cleanup Function

The cleanup function in both server/cleanup.ts and its transpiled version server/distBE/cleanup.js iterates over every DOM element using document.querySelectorAll("*") and performs removal operations. This could introduce performance overhead, especially for large or deeply nested DOMs.

  • Confirm that this function is only applied in scenarios with manageable DOM sizes or include performance tests to validate its impact.
  • Consider documenting the expected scope and limitations of this cleanup process to ensure maintainability.
server/distBE/vespa/replaceDIMS.js (1)

10-21:

✅ Verification successful

Regex approach might be too broad
This regex replaces any v[...] occurrences with the given dimension. Verify that no unrelated placeholders or texts will be unintentionally changed.


🏁 Script executed:

#!/bin/bash
# Searching for usage of v[...] patterns across .sd files to ensure they are exclusively for dimension placeholders
fd -e sd --exec rg -A 2 "v\[\w+\]"

Length of output: 1662


Regex Pattern Verified – No Unintended Replacements Detected

The regex /v\[(?:DIMS|\d+)\]/g in server/distBE/vespa/replaceDIMS.js (lines 10–21) was inspected against our Vespa schema files. The search revealed that the only matches are the expected dimension placeholders (e.g., v[DIMS]), with no evidence of unrelated or accidental matches. This confirms that the regex is scoped appropriately for its intended purpose.

  • File Reviewed: server/distBE/vespa/replaceDIMS.js
  • Verified Occurrences: The pattern appears only in valid dimension placeholder contexts in our .sd files.

There is no further action required regarding this regex.

server/distBE/doc.js (5)

14-20: Minimal error handling
Accessing element.textRun?.content is mostly safe, but you could add a minor check or fallback for clarity. Overall, the implementation is fine.


85-89: Markdown heading logic
Generating headings via # symbols based on HEADING_x styles is straightforward and correct.


207-217: Footnote references mismatch repeated
As noted, footnoteNumber references may not align with footnoteId in extractFootnotes. A consistent scheme is needed.


219-236: Headers and footers extraction
Logic is consistent with footnotes. Works well for capturing header/footer content in a straightforward manner.


110-151:

❓ Verification inconclusive

Verify performance on large documents
This recursive extraction can be heavy for deeply nested or lengthy documents. Suggest performance testing or chunk processing.


🏁 Script executed:

#!/bin/bash
# Searching for usage of extractText to understand possible large doc scenarios
rg -A 5 "extractText("

Length of output: 99


Below is a revised review comment reflecting the need for further verification with an updated search query. The original rg command failed due to an unescaped parenthesis; re-run the search with proper escaping to verify usage patterns of extractText for performance considerations, and check the output manually.
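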


Action Required: Verify Large Document Performance

The recursive extraction in extractText may be performance intensive for deeply nested or long documents. The initial search for its usage produced a regex error due to an unescaped parenthesis. Please re-run the following command to search for its usage across the codebase and manually verify if there are scenarios where chunk processing or other optimizations might be required:

Ensure the output is checked for instances of heavy usage in large document contexts.


🏁 Script executed:

#!/bin/bash
# Re-run the search for extractText usage with properly escaped parenthesis.
rg -A 5 "extractText\("

Length of output: 4878


Action Required: Assess Recursive Extraction for Performance

Our verification indicates that the recursive function extractText is widely used across the codebase (e.g., in server/doc.ts, server/distBE/doc.js, and within Google integration modules). This reinforces the original concern: in cases of deeply nested or large documents, the recursion may become a performance bottleneck. Although one instance already employs chunk processing (via chunkDocument), it remains essential to profile and test these paths to ensure acceptable performance.

  • Action Item: Run performance tests on large and heavily nested documents to validate that extractText scales efficiently.
  • Consideration: Investigate whether additional chunking or iterative approaches might further optimize extraction where needed.
server/distBE/drizzle.config.js (1)

5-5: Verify TypeScript schema file compatibility

The schema path points to a TypeScript file (./db/schema.ts), but this is a JavaScript config file. Make sure the TypeScript file is properly compiled and available at runtime to avoid potential issues with the ORM configuration.

server/distBE/mime.js (1)

1-186: Comprehensive MIME type mapping looks good.

The mapping is extensive and covers a wide range of file types with appropriate keywords for search functionality. The comments for ambiguous mappings provide helpful context.

server/package.json (2)

42-42: Addition of dotenv for environment variable management.

Adding dotenv is a good practice for managing environment variables in Node.js applications.


8-8:

✅ Verification successful

Verify that added dependencies are compatible with the Node environment.

The addition of @faker-js/faker, p-queue, and tsc-alias seems aligned with the PR objective to run server JS files in Node.

Also applies to: 12-13


🏁 Script executed:

#!/bin/bash
# Check if the added dependencies are compatible with Node.js

echo "Checking Node.js compatibility for added dependencies:"
npm info @faker-js/faker engines
npm info p-queue engines
npm info tsc-alias engines

Length of output: 285


Dependency Compatibility Verified

The shell script confirmed that:

  • @faker-js/faker requires Node >=18.0.0
  • p-queue requires Node >=18
  • tsc-alias requires Node >=16.20.2

Given our server targets Node (with an expected version of 18 or later), these dependencies are compatible as used in server/package.json (Lines 8, and 12–13). No further modifications are needed.

server/distBE/ai/mappers.js (2)

235-255: Model to provider mapping is clear and consistent.

The mapping between models and their providers is well-structured and aligns with the models defined in the modelDetailsMap.


2-234:

❓ Verification inconclusive

Comprehensive model details map with pricing information.

The modelDetailsMap provides detailed pricing information for various AI models, which is essential for cost tracking and billing. The structure is consistent for most models.

Some models have inconsistent pricing structures. Verify if this is intentional:


🏁 Script executed:

#!/bin/bash
# Check for models that only have onDemand pricing but not batch pricing
grep -A 10 "onDemand" server/distBE/ai/mappers.js | grep -B 10 -A 2 "}"

Length of output: 6157


Pricing Structure Consistency – Verification Needed

After running the verification script, it’s clear that several models in server/distBE/ai/mappers.js use only onDemand pricing without a matching batch pricing structure. Specifically, please verify if the following models are intentionally configured this way:

  • [Models.CohereCmdRPlus]
  • [Models.CohereCmdR]
  • [Models.DeepSeek_R1]
  • [Models.Mistral_Large]

If these omissions are by design (e.g., due to the model or provider not supporting batch pricing), consider adding an inline comment to clarify this intentional difference for future maintainers.

server/distBE/db/syncJob.js (2)

1-20: LGTM: Well-structured implementation of insertSyncJob function

The function properly generates a unique external ID, inserts the job into the database, and validates the result against the schema. Error handling is appropriate and the function follows best practices for database operations.


21-27: LGTM: getAppSyncJobs implementation looks good

The function correctly filters jobs by app and authentication type and validates the results with Zod. Good use of the array schema for validating multiple results.

server/tsconfig.json (2)

17-19: Enabling JavaScript output for Node.js execution

The commenting out of these TypeScript-only options (allowImportingTsExtensions, verbatimModuleSyntax, and noEmit) allows the compiler to generate JavaScript files instead of just type-checking. This aligns with the PR goal of running server JS files in Node.


28-29: Changed output directory to support Node.js execution

Changing the output directory from the commented-out ./distBE to the active ./distFinal designates where the compiled JavaScript files will be placed. This is essential for the "Run Server JS files in Node" objective.

server/distBE/config.js (2)

36-38: Plan to address the TODO comment in a future update

The TODO comment indicates a planned refactoring of model environment variables to simplify configuration. Make sure to track this for a future update.

This comment indicates upcoming changes to environment variable handling for models. Consider creating a tracking issue to ensure this refactoring is completed.


40-81: LGTM: Well-structured model priority logic

The priority logic for AI models (AWS > OpenAI > Ollama > Together > Fireworks > Gemini) is well implemented with clear conditional blocks. The code properly sets default models based on available credentials.

server/distBE/db/syncHistory.js (1)

1-18: LGTM: Well-implemented insertSyncHistory function

The function follows the same pattern as other database operations, generating a unique ID, inserting the record, and validating the result. The error handling is appropriate and the function returns the validated data.

server/distBE/db/customType.js (1)

2-2:

✅ Verification successful

Validate relative import path for Encryption module.

Ensure "../utils/encryption.js" is correct and won't cause import resolution issues, especially if the file structure changes.


🏁 Script executed:

#!/bin/bash
# Description: Verify that encryption.js exists in the expected location.
fd encryption.js --exec ls -l {}

Length of output: 121


Relative Import Path Verified

The relative import in server/distBE/db/customType.js at line 2 correctly points to ../utils/encryption.js since the encryption.js file is located at server/distBE/utils/encryption.js. No changes are needed at this time. Please remember to adjust the import if the project’s file structure is refactored in the future.

server/distBE/eval/evaluate.js (1)

37-55: Handle JSON parse failures in input lines.

If a line in the queries file isn't valid JSON, JSON.parse will throw an error. Consider wrapping the parse call in a try/catch to gracefully handle malformed inputs instead of crashing.

Would you like me to generate a fix so the script can skip or report invalid JSON lines?
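A minimal sketch of tolerant JSONL parsing, assuming the file has already been split into lines (helper name is illustrative):

const parseQueryLines = (lines: string[]): unknown[] => {
  const parsed: unknown[] = []
  for (const [i, line] of lines.entries()) {
    if (!line.trim()) continue // skip blank lines
    try {
      parsed.push(JSON.parse(line))
    } catch (err) {
      // report and skip malformed lines instead of crashing the run
      console.error(`Skipping invalid JSON on line ${i + 1}`, err)
    }
  }
  return parsed
}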

server/distBE/logger/index.js (1)

63-82: Consider clarifying request ID handling.

Currently, the middleware overwrites requestId if already present in c.req. Ensure the fallback logic doesn't unintentionally replace a valid pre-existing ID. If user code sets c.req.requestId for some reason, confirm you want to overwrite it with a new UUID or keep the one already set.
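A minimal sketch of the preserve-if-present behaviour, assuming a Hono-style middleware signature (the requestId property is illustrative):

import { randomUUID } from "crypto"
import type { Context, Next } from "hono"

export const requestIdMiddleware = async (c: Context, next: Next) => {
  // only generate a new ID when none was attached upstream
  const existing = (c.req as any).requestId
  ;(c.req as any).requestId = existing ?? randomUUID()
  await next()
}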

server/distBE/db/seed.js (1)

7-7: Validate logging usage of the Subsystem.Db.

Make sure your other code references and uses the correct subsystem name. If this subsystem key is inconsistent elsewhere, update it for consistency.

server/distBE/search/mappers.js (1)

139-167: Verify grouping assumptions for multiple groups.

handleVespaGroupResponse only handles the first group:root and first grouplist:app. If multiple root or app groups appear in the response, they will be ignored. Confirm that this is intentional. Otherwise, consider iterating through all group entries.

server/distBE/api/oauth.js (1)

40-44: Confirm user existence checks align with your business logic.

You are actively throwing a NoUserFound error upon an empty user array. Ensure the flow to create new users (if applicable) or handle guest access is in place, as this block terminates the callback otherwise.

server/distBE/ai/provider/gemini.js (2)

19-26: Confirm the intended role mapping for chat messages.

In converse, the role remains as-is. In converseStream, "assistant" is remapped to "model". Verify this difference is intentional to avoid inconsistent conversation flows.

Also applies to: 56-59


39-41: Check whether the text access is synchronous.

response.response.text() may be asynchronous or a promise-based method, depending on the underlying SDK. Confirm that calling it directly returns a string; otherwise, it should be handled with await.

server/distBE/ai/provider/bedrock.js (1)

11-14: Validate the client object in the constructor.
Currently, the constructor accepts the client parameter without checks. Consider validating or defaulting this object to avoid null references or misconfiguration.

server/distBE/server.js (1)

61-63: Verify environment variable configurations.
You are loading environment variables via dotenv.config(). Confirm that essential variables like GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET, and JWT_SECRET are set for all deployment environments to avoid runtime failures.

server/distBE/integrations/google/gmail/quote-parser.js (3)

1-30: Check for internationalized email cases.
The regexes defined for headers and signatures may fail to detect email patterns in other locales. Consider adding or testing against non-English header and signature lines if this parser is used globally.


31-59: Fragment class implementation looks good.
The approach for reversing lines and storing them in content is straightforward and efficient. This class is well-defined for capturing email segments.


162-174: Validate URL handling errors in replaceLinks.
Although you handle parsing errors with a try/catch, confirm that edge cases (e.g., unusual domain formats or partial matches) are correctly surfaced or sanitized.

server/distBE/integrations/google/pdf-utils.js (1)

1-5: Excellent setup of logging and error imports.
The import structure is well-organized, and leveraging Subsystem.Integrations in the logger context provides clear debug context.

server/distBE/ai/provider/openai.js (2)

1-10: Robust import structure and class definition.
The setup for OpenAIProvider properly extends the BaseProvider and references modelDetailsMap. No concerns found here.


63-97: Handle potentially undefined usage within chunked responses.
If chunk.usage is unexpectedly absent, referencing chunk.usage.prompt_tokens will cause errors. Include a safety check or default.

if (chunk.usage && !costYielded) {
  costYielded = true;
  yield {
    text: "",
    metadata: "",
-   cost: calculateCost({
-     inputTokens: chunk.usage.prompt_tokens,
-     outputTokens: chunk.usage.completion_tokens,
-   }, modelDetailsMap[modelParams.modelId].cost.onDemand),
+   cost: chunk.usage
+     ? calculateCost({
+         inputTokens: chunk.usage.prompt_tokens || 0,
+         outputTokens: chunk.usage.completion_tokens || 0,
+       }, modelDetailsMap[modelParams.modelId].cost.onDemand)
+     : 0,
  };
}
server/distBE/ai/provider/together.js (2)

23-23: Handle empty v.content arrays.

Here, we assume v.content always has an element at index 0. If v.content is an empty array, attempting to access v.content[0].text will throw an error. Consider adding a fallback or checking for an empty array:

...messages.map((v) => ({
-  content: v.content ? v.content[0].text : "",
+  content: v.content && v.content[0] ? v.content[0].text : "",
   role: v.role,
}))

34-36: Check for empty choices in the response.

Accessing response.choices[0] might throw if choices is an empty array or missing. Consider a defensive check:

- text: response.choices[0].message?.content || "",
+ text: response.choices?.[0]?.message?.content ?? "",
server/distBE/api/admin.js (1)

126-128: Validate form fields for service connections.

Accessing form["service-key"].text() and form.email can fail if these fields are missing or not properly formatted. Consider validating these properties before use to avoid runtime errors.
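A hedged sketch of that validation, assuming a parsed multipart form object as above (helper name and error messages are illustrative):

// validateServiceForm is a hypothetical helper, not part of the PR
const validateServiceForm = async (form: Record<string, any>) => {
  const serviceKeyField = form["service-key"]
  if (!serviceKeyField || typeof serviceKeyField.text !== "function") {
    throw new Error("Missing or invalid 'service-key' file in form data")
  }
  if (!form.email || typeof form.email !== "string") {
    throw new Error("Missing 'email' field in form data")
  }
  return { serviceKey: await serviceKeyField.text(), email: form.email }
}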

server/distBE/integrations/google/sync.js (6)

81-126: Validate concurrency handling for sheet deletions.
deleteUpdateStatsForGoogleSheets iterates over potential missing sheets and deletes them. If multiple sync jobs run concurrently, race conditions may arise when deleting or adding sheets. Consider adding locks or additional checks to handle concurrent sync tasks safely.


127-169: Confirm permission assumptions for spreadsheet deletion.
This approach deletes all sheets if a single user with sole permission has lost access. Verify that the user’s email always matches permissions[0]—otherwise, we throw an error. If any mismatch can happen in practice, handle partial removal or user mismatch more gracefully.


333-353: Graceful handling of null return.
getDriveChanges can return null if an error occurs. Ensure the calling code thoroughly checks for null to avoid potential runtime errors or missed updates.


877-997: Verify handling of label updates.
When labels are added/removed, code merges or filters them. Ensure concurrency does not cause the same label changes to be processed twice.


1367-1374: Initialization function looks good.
newStats() is a straightforward helper for counters, keeps code DRY.


1375-1381: Stats merging is efficient.
mergeStats is concise and meets typical aggregator needs.

server/distBE/integrations/google/gmail-worker.js (2)

49-69: Worker event logic is inactive if lines 1–23 remain commented out.
Ensure the necessary imports (e.g., Logger, MessageTypes) are uncommented or provided if self.onmessage is to function fully. Otherwise, the code may reference undefined objects like Logger.


288-317: Recursive body extraction is good.
getBody gracefully recurses through nested parts. This is a robust approach for multi-part emails. Keep an eye on extremely deep nesting to avoid potential stack issues, but it’s likely acceptable for typical email structures.

server/distBE/integrations/google/gmail/index.js (1)

110-231: Robust error class usage.
EmailParsingError is thrown if From is missing. Great approach. Also consider adding additional metadata for debugging (like the raw header contents).

server/distBE/ai/provider/fireworksClient.js (2)

12-45: Good use of AbortController with a timeout.

The _makeRequest method correctly uses an AbortController to handle timeouts, ensuring the request is canceled to prevent resource leaks or indefinite hangs. This is an excellent pattern for robust HTTP requests.


72-124: Verify partial JSON chunk handling in streaming.

The streamComplete generator reads and buffers data chunks until encountering a newline, then processes each line. Storing unused parts in buffer avoids splitting JSON mid-chunk. This is generally correct, but be aware that malformed JSON or large partial lines may still cause irregularities. Logging any parse failures is likewise helpful.

server/distBE/utils.js (4)

10-27: Great file existence check with fallback.

checkAndReadFile efficiently checks for file existence and returns null for missing files. This pattern gracefully handles optional configuration files. Implementation looks good.


43-57: Cookie options for non-production environments.

setCookieByEnv properly sets a less secure cookie for non-production. Confirm that no sensitive data is passed to front-ends in these environments.

Are there usage scenarios where you might want HTTP-only cookies in staging or test environments? Consider whether additional warnings or environment checks are needed.


133-159: Effective text cleanup utilities.

The text-processing functions (splitGroupedCitationsWithSpaces, removeStopwords) appropriately handle edge cases like empty or non-string inputs and invalid tokens. No issues spotted here.


92-132:

❓ Verification inconclusive

Recursive retry logic is effective; ensure googleOauth2Client is always defined.

The retryWithBackoff function implements exponential backoff nicely. Pay attention to the path where app is a Google app but googleOauth2Client is undefined; it will cause an error at line 119. Ensure upstream code always supplies the OAuth2 client for that flow.


🏁 Script executed:

#!/bin/bash
# Search for calls to retryWithBackoff where googleOauth2Client isn't passed
rg --no-heading 'retryWithBackoff\([^,]*,[^,]*,Apps\.GoogleDrive\)'

Length of output: 69


Action: Verify that a valid googleOauth2Client is always provided for Google app flows.

The recursive retry logic implementation in server/distBE/utils.js looks solid overall. However, the branch handling 401 errors for Google apps (line 119) depends on a defined googleOauth2Client to refresh the OAuth token. Although our automated search for calls using Apps.GoogleDrive didn’t produce output, this narrow query might miss other valid Google app invocations or different naming patterns. Please verify manually that every call to retryWithBackoff for a Google app passes a properly defined OAuth2 client.

  • Review the upstream invocations where IsGoogleApp(app) evaluates to true and confirm that googleOauth2Client is always non‑undefined.
  • Ensure that missing this parameter is caught early to avoid errors when the code reaches line 119.
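One way to catch this early is a guard at the top of the retry helper; a minimal sketch, reusing the function's existing app and googleOauth2Client parameters and the IsGoogleApp check mentioned above:

// fail fast with a clear message instead of a null dereference during a 401 retry
if (IsGoogleApp(app) && !googleOauth2Client) {
  throw new Error(
    "retryWithBackoff: googleOauth2Client must be provided for Google app retries",
  )
}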
server/distBE/ai/context.js (3)

1-35: Consistent approach to constructing context for file and user entities.

Both constructFileContext and constructUserContext share a consistent style (listing key fields, using getRelativeTime for timestamps). This is clear and maintainable. Good job.


36-51: Dynamic summary chunk slicing.

constructMailContext uses fields.chunks_summary.slice(0, maxSummaryChunks), with a fallback for maxSummaryChunks. This logic is straightforward. Ensure that your upstream code handles situations where chunks_summary might be undefined or empty.


212-230: Validate additional searchResult.fields.sddocname types.

answerContextMap handles several known schemas. Confirm if you need a fallback for newly introduced schema types or if throwing an error is the desired behavior. This will help avoid unexpected crashes when new entity types are added.

server/distBE/types.js (2)

10-15: Consider more robust type transformations for boolean fields.
While using x === "true" works, it may silently fail for unexpected input (e.g., uppercase "TRUE"). Consider a stricter check or using .refine() to ensure only valid boolean strings are accepted.

Would you like me to verify the usage of groupCount transformations elsewhere to confirm correctness?
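A stricter variant as a minimal sketch with Zod (schema name is illustrative):

import { z } from "zod"

// accepts only "true"/"false" (case-insensitive) and rejects anything else,
// instead of silently mapping unexpected strings to false
const booleanString = z
  .string()
  .transform((s) => s.trim().toLowerCase())
  .refine((s) => s === "true" || s === "false", {
    message: "expected 'true' or 'false'",
  })
  .transform((s) => s === "true")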


32-35:

❓ Verification inconclusive

Verify partial usage of app and entity fields.
The .refine() requirement ensures app and entity must appear together. Double-check that no callers pass only one of them unintentionally.


🏁 Script executed:

#!/bin/bash
# Confirm that all calls either set both 'app' and 'entity' or neither
ast-grep --pattern $'searchSchema(parse($_))' 

Length of output: 46


Manual Verification Required for Partial app/entity Usage
The schema’s .refine() logic correctly mandates that both app and entity must be provided together. However, our initial automated search didn't yield any usage instances, which might be due to the search pattern not matching all call formats. Please manually verify that no caller inadvertently supplies only one of these fields.

  • Action Required:
    Manually review all usages of searchSchema (or its .parse/validation calls) to ensure that both app and entity are always provided together or omitted altogether.
server/distBE/db/user.js (2)

13-31: Confirm handling of soft-deleted records.
The function retrieves user and workspace records without filtering out any possible deletedAt values. If soft deletes exist, consider filtering them out if that aligns with your requirements.


89-99: Inconsistent not-found handling with other queries.
Here we throw an error if the user doesn’t exist, and parse with selectUserSchema. This is good. Consider aligning other getter methods (e.g., getUserByEmail) with this approach for consistent error handling and schema validation.

server/distBE/db/chat.js (3)

12-20: Validate inserted chat.
The code inserts a chat and then parses only the first returned row. This is fine for single inserts, but consider handling unexpected empty arrays with a more descriptive error or fallback.


29-35: Consistent error messages.
When the chat is not found, the error message is “Chat not found.” This is consistent, but ensure your consumer can differentiate between different resource types if more complex handling is needed.


78-88: Consistent updatedAt logic.
updateMessageByExternalId sets updatedAt to the current date. If other update functions do the same, confirm uniform usage and handling of these timestamps.

server/distBE/ai/types.js (5)

4-12: Naming and grouping providers.
Your AIProviders enum looks well-defined. If you plan to add new providers frequently, consider grouping them logically or establishing a pattern for expansions.


50-54: Add optional usage note for mention arrays.
mentionedNames and mentionedEmails might grow large. If you plan to store or process them further, consider performance or memory aspects.


55-59: Rewrite schema coverage.
rewrittenQueries is optional. Ensure that your logic gracefully handles cases where it’s absent, especially if subsequent code expects rewrites to exist.


67-72: Uniform time filtering approach.
In FiltersSchema, you allow optional startTime/endTime. Make sure the consumer code can handle when only one of these is provided or neither.


81-97: Document the necessity for discriminated union.
It’s clear you’re using “type” as the key. Just ensure your devs know how to handle newly added query types, as they’ll need to extend this union.

server/distBE/integrations/google/utils.js (1)

153-190: Consider validating or sanitizing PDF file names prior to use.

If pdfFile?.name can contain special characters or path manipulations, it might pose a risk. Sanitizing or validating the filename ensures safe file operations.

server/distBE/api/chat.js (2)

140-158: Validate that both messages and the chat record exist before deletion.

Currently, ChatDeleteApi proceeds with deletes even if the chat or messages do not exist. Consider verifying their existence or adjusting the approach if a missing chat is a valid scenario.


792-888: Guard against ambiguous partial JSON parsing.

When accumulating chunks in the SSE stream, partial JSON can trigger parse errors. This logic waits to parse until the buffer is complete, but confirm that large or chunked responses don't cause partial parse attempts.
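A minimal sketch of newline-delimited buffering that only parses complete lines (function and callback names are illustrative):

let buffer = ""

const handleChunk = (chunk: string, onEvent: (data: unknown) => void): void => {
  buffer += chunk
  const lines = buffer.split("\n")
  buffer = lines.pop() ?? "" // keep the incomplete tail for the next chunk
  for (const line of lines) {
    const trimmed = line.trim()
    if (!trimmed) continue
    try {
      onEvent(JSON.parse(trimmed))
    } catch (err) {
      // a malformed line should not kill the whole stream
      console.error("Failed to parse stream line, skipping", err)
    }
  }
}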

server/distBE/shared/types.js (1)

134-153: Perform deeper validation for file-related data.

FileResponseSchema picks and extends some essential fields. Consider validating optional fields (e.g., mimeType) to ensure they are valid content types, or handle them gracefully in the calling code.

server/distBE/db/schema.js (2)

38-38: Potential issue with unique constraint on createdBy
The createdBy field is marked as unique, which will restrict multiple workspaces from being created by the same user. This might not be the intended functionality if a single user can create multiple workspaces.

Do we expect only one workspace per creator? If not, consider removing or revisiting the unique constraint.


118-120: Validate uniqueness of combination
The uniqueConnector constraint enforces a unique combination of (workspaceId, userId, app, authType). Verify that this aligns with your requirement, especially if a user could need multiple connector instances for the same app and auth type.

server/distBE/search/vespaClient.js (1)

19-44: Confirm Node fetch availability
This file uses the global fetch API, which is available in Node v18+ by default. If you plan to support older Node versions, ensure a polyfill like node-fetch is installed and imported.

server/distBE/search/vespa.js (1)

481-481: Potential undefined access for docExist.fields
In updateUserQueryHistory, you access docExist.fields without null checks. If docExist.fields is unexpectedly undefined, this could throw an error.

Would you like a defensive check to skip the logic if docExist.fields is missing?

server/distBE/ai/provider/index.js (1)

231-291: Robustness concerns in JSON parsing
jsonParseLLMOutput() aggressively manipulates the returned text to isolate JSON. While this is a good approach to handle unpredictable responses, consider carefully validating the resulting structure for security and correctness. Large or malformed JSON could cause performance issues or unhandled parsing errors.

Do you have unit tests that ensure this function behaves as expected when the text has extreme or deliberately malformed content? If not, I can provide a test script for verification.

server/distBE/ai/prompts.js (1)

8-12: Assess privacy implications of user context
The prompt askQuestionUserPrompt includes the user context, which might contain personal data (e.g., name, email). Confirm that exporting or echoing sensitive user details in the system prompts won't pose any privacy or compliance issues.

server/distBE/errors/index.js (1)

34-36: IntegrationOp enum declared but never used
The IntegrationOp enum is declared but not referenced in subsequent classes or logic, which can lead to confusion. If it’s intended for future use, consider leaving a comment about its purpose or removing it until needed.

server/distBE/integrations/google/index.js (2)

1568-1646: Extend coverage for empty contact data scenarios.

In listAllContacts(), an empty or partial People API response might indicate account-level restrictions or missing permissions. Currently, the code loops until exhaustion. Consider adding a fallback or an alert mechanism if the user or domain typically has contacts but yields an unexpectedly empty response.

Do you want me to generate a script that queries logs or user IDs to confirm whether these partial responses occur frequently?


1719-1748: Validate contact mapping logic for special cases.

insertContactsToVespa() attempts to handle partial or missing IDs/emails gracefully. Ensure specialized scenarios—like distribution lists without a unique email—are documented or filtered out. This helps maintain consistent indexing without accidental collisions in doc IDs.

server/distBE/search/types.js (2)

104-118: Validate optional fields in VespaFileSchema.

Some fields (url, owner, ownerEmail, photoLink, mimeType) are marked nullable or optional. Ensure your indexing or searching logic can handle these null values gracefully (e.g., no unexpected string conversions, logs, or search indexing issues).


130-162: Enhance robust validation for user schemas.

email is mandatory, but name is optional. If a user has no name in certain contexts (e.g. external user), ensure your UI or queries handle the undefined name field properly. Consider a fallback or label if it's missing.

@@ -0,0 +1 @@
export {};

⚠️ Potential issue

Fix filename typo and add actual constants.

The filename appears to be misspelled - "contants.js" instead of "constants.js". Additionally, the file only exports an empty object with no actual constants defined.

Consider:

  1. Renaming the file to "constants.js"
  2. Adding the actual AI-related constants that will be used throughout the application
-export {};
+export const DEFAULT_MODEL = "gpt-3.5-turbo";
+export const DEFAULT_TEMPERATURE = 0.7;
+export const MAX_TOKENS = 4096;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export {};
export const DEFAULT_MODEL = "gpt-3.5-turbo";
export const DEFAULT_TEMPERATURE = 0.7;
export const MAX_TOKENS = 4096;

Comment on lines 34 to 44
const attributesToRemove = [
"style",
"src",
"alt",
"title",
"role",
"aria-",
"tabindex",
"on",
"data-",
];

🛠️ Refactor suggestion

Reevaluate removing accessibility attributes
Eliminating aria- attributes can break accessibility for screen readers and other assistive technologies. Consider retaining critical ARIA attributes or implementing a whitelist approach.
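A whitelist-style sketch (the kept ARIA attributes are illustrative, not a recommendation from the PR):

const keptAriaAttributes = new Set(["aria-label", "aria-labelledby", "aria-hidden"])

const shouldRemoveAttribute = (name: string): boolean => {
  if (keptAriaAttributes.has(name)) return false
  return (
    ["style", "src", "alt", "title", "role", "tabindex"].includes(name) ||
    name.startsWith("aria-") ||
    name.startsWith("on") ||
    name.startsWith("data-")
  )
}

for (const el of Array.from(document.querySelectorAll("*"))) {
  for (const attr of Array.from(el.attributes)) {
    if (shouldRemoveAttribute(attr.name)) el.removeAttribute(attr.name)
  }
}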

Comment on lines 32 to 39
const getSchemaDefinitionPaths = (directory) => {
return fs
.readdirSync(directory)
.filter((file) => path.extname(file) === ".sd")
.map((file) => path.join(directory, file));
};
const paths = getSchemaDefinitionPaths("./schemas");
replaceModelDIMS(paths, parseInt(args[0]));

💡 Verification agent

🧩 Analysis chain

Ensure directory existence
If ./schemas is missing or empty, the script might fail silently or produce no changes. Confirm the directory and file presence before proceeding.


🏁 Script executed:

#!/bin/bash
if [ -d "./schemas" ]; then
  ls -la ./schemas
else
  echo "No schemas directory found."
fi

Length of output: 87


Action Required: Add Directory Existence Check

After running the verification script, it’s confirmed that the expected "./schemas" directory does not exist. This can lead to a situation where the script fails silently or produces no changes when it attempts to read from a non-existent directory. To resolve this issue, please update the code in server/distBE/vespa/replaceDIMS.js (lines 32-39) to explicitly check for the directory’s existence (and optionally its non-emptiness) before proceeding. For example:

if (!fs.existsSync("./schemas")) {
  console.error("Error: The './schemas' directory does not exist.");
  process.exit(1);
}

Then proceed with:

const paths = getSchemaDefinitionPaths("./schemas");
replaceModelDIMS(paths, parseInt(args[0]));

This update will ensure that missing or empty directories are caught early and handled gracefully.

Comment on lines 390 to 521
nextPageToken = res.data.nextPageToken ?? ""
newSyncTokenCalendarEvents = res.data.nextSyncToken ?? ""
} while (nextPageToken)
if (events.length === 0) {
return { events: [], calendarEventsToken: newSyncTokenCalendarEvents }
}
const confirmedEvents = events.filter((e) => e.status === "confirmed")
// Handle cancelledEvents separately
const cancelledEvents = events.filter((e) => e.status === "cancelled")
// First insert only the confirmed events
for (const event of confirmedEvents) {
const { baseUrl, joiningUrl } = getJoiningLink(event)
const { attendeesInfo, attendeesEmails, attendeesNames } =
getAttendeesOfEvent(event.attendees ?? [])
const { attachmentsInfo, attachmentFilenames } = getAttachments(
event.attachments ?? [],
)
const { isDefaultStartTime, startTime } = getEventStartTime(event)
const eventToBeIngested = {
docId: event.id ?? "",
name: event.summary ?? "",
description: getTextFromEventDescription(event?.description ?? ""),
url: event.htmlLink ?? "", // eventLink, not joiningLink
status: event.status ?? "",
location: event.location ?? "",
createdAt: new Date(event.created).getTime(),
updatedAt: new Date(event.updated).getTime(),
app: Apps.GoogleCalendar,
entity: CalendarEntity.Event,
creator: {
email: event.creator?.email ?? "",
displayName: event.creator?.displayName ?? "",
},
organizer: {
email: event.organizer?.email ?? "",
displayName: event.organizer?.displayName ?? "",
},
attendees: attendeesInfo,
attendeesNames: attendeesNames,
startTime: startTime,
endTime: new Date(event.end?.dateTime).getTime(),
attachmentFilenames,
attachments: attachmentsInfo,
recurrence: event.recurrence ?? [], // Contains recurrence metadata of recurring events like RRULE, etc
baseUrl,
joiningLink: joiningUrl,
permissions: getUniqueEmails([
event.organizer?.email ?? "",
...attendeesEmails,
]),
cancelledInstances: [],
defaultStartTime: isDefaultStartTime,
}
await insert(eventToBeIngested, eventSchema)
updateUserStats(userEmail, StatType.Events, 1)
}
// Add the cancelled events into cancelledInstances array of their respective main event
for (const event of cancelledEvents) {
// add this instance to the cancelledInstances arr of main recurring event
// don't add it as seperate event
const instanceEventId = event.id ?? ""
const splittedId = instanceEventId?.split("_") ?? ""
const mainEventId = splittedId[0]
const instanceDateTime = splittedId[1]
try {
// Get the main event from Vespa
// Add the new instanceDateTime to its cancelledInstances
const eventFromVespa = await GetDocument(eventSchema, mainEventId)
const oldCancelledInstances =
eventFromVespa.fields.cancelledInstances ?? []
if (!oldCancelledInstances?.includes(instanceDateTime)) {
// Do this only if instanceDateTime not already inside oldCancelledInstances
const newCancelledInstances = [
...oldCancelledInstances,
instanceDateTime,
]
if (eventFromVespa) {
await UpdateEventCancelledInstances(
eventSchema,
mainEventId,
newCancelledInstances,
)
}
}
} catch (error) {
Logger.error(
error,
`Main Event ${mainEventId} not found in Vespa to update cancelled instance ${instanceDateTime} of ${instanceEventId}`,
)
}
}
if (!newSyncTokenCalendarEvents) {
throw new CalendarEventsListingError({
message: "Could not get sync tokens for Google Calendar Events",
integration: Apps.GoogleCalendar,
entity: CalendarEntity.Event,
})
}
return { events, calendarEventsToken: newSyncTokenCalendarEvents }
}

🛠️ Refactor suggestion

Consider chunking or streaming large event batches to reduce memory usage.

insertCalendarEvents() fetches and processes potentially thousands of events up to a year in advance. Consider partial insertion or streaming patterns—especially when events over multiple years or multiple user calendars are handled—to avoid large array concatenations in memory.

 const confirmedEvents = events.filter((e) => e.status === "confirmed");
-// current approach: for-of iteration on entire confirmedEvents array
+// suggested approach: process in batches to avoid storing large arrays in memory

Committable suggestion skipped: line range outside the PR's diff.
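A batched variant as a minimal sketch; buildEventToBeIngested is a hypothetical helper wrapping the mapping shown above:

const BATCH_SIZE = 100
for (let i = 0; i < confirmedEvents.length; i += BATCH_SIZE) {
  const batch = confirmedEvents.slice(i, i + BATCH_SIZE)
  // insert one bounded batch at a time instead of holding every event mapping in memory
  await Promise.all(
    batch.map(async (event) => {
      const eventToBeIngested = buildEventToBeIngested(event) // hypothetical mapper
      await insert(eventToBeIngested, eventSchema)
      updateUserStats(userEmail, StatType.Events, 1)
    }),
  )
}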

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (6)
server/shared/config.ts (1)

1-3: Good addition of dotenv for configuration management

The dotenv implementation allows the serverUrl to be configurable through environment variables, enhancing flexibility. This is consistent with the pattern used across the project.

Consider using a server-specific naming convention for environment variables instead of the React-specific REACT_APP_ prefix, as this is server-side code.

server/db/schema.ts (1)

27-28: Appropriate use of dotenv for critical environment variables

Adding dotenv here is especially important as this file handles database schema and encryption keys. The implementation follows a fail-fast approach by checking for required environment variables immediately after loading.

There's a slight redundancy in line 30 where you use the non-null assertion operator (!) and then explicitly check if the variable is not set in lines 31-33. Consider using a consistent approach, either with assertion or explicit checks, but not both.

server/nodeServer.ts (2)

89-110: Watch for potential redirect loops when token is invalid.

If AuthRedirect determines the token is invalid (lines 102-109), it delegates back to /auth, which might route control back here if not carefully handled. Ensure no cyclical redirection occurs for certain edge cases (e.g., partially expired tokens).


386-389: Evaluate graceful shutdown strategies for uncaught exceptions.

Currently, an uncaught exception logs the error but does not stop or restart the server. Consider gracefully shutting down or automatically restarting to avoid the system staying in an unknown state.

server/integrations/google/index.ts (2)

114-116: Bundle environment variable loading in a single place.

You import dotenv and call dotenv.config() again on lines 115-116, possibly duplicating usage from other files. Re-initializing environment variables multiple times typically has no side effects, but consider centralizing the logic to reduce confusion and possibility of misconfigurations.


123-128: Abstract environment detection logic for clarity.

You rely on typeof process !== "undefined" && !("Bun" in globalThis) to decide between NodeWorker and the default Worker. This can be brittle if other runtimes differ in their global definitions, or if additional JS environments (e.g., Deno) are introduced. Consider a helper function that detects the runtime environment, making the code more scalable and explicit.
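Per the PR's later changes, an Env enum (Bun, Node) and a getEnvironment() helper cover this; a minimal sketch along those lines (the string values are illustrative):

export enum Env {
  Bun = "bun",
  Node = "node",
}

export const getEnvironment = (): Env => {
  // Bun exposes a global `Bun` object; plain Node does not
  return "Bun" in globalThis ? Env.Bun : Env.Node
}

// usage (illustrative):
// const gmailWorker = getEnvironment() === Env.Node
//   ? new NodeWorker(new URL("./gmail-worker.js", import.meta.url))
//   : new Worker(new URL("gmail-worker.ts", import.meta.url).href)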

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d461a8 and 4d70c47.

📒 Files selected for processing (10)
  • server/ai/provider/index.ts (1 hunks)
  • server/config.ts (1 hunks)
  • server/db/schema.ts (1 hunks)
  • server/integrations/google/gmail-worker.ts (1 hunks)
  • server/integrations/google/gmail/index.ts (1 hunks)
  • server/integrations/google/index.ts (4 hunks)
  • server/nodeServer.ts (1 hunks)
  • server/package.json (3 hunks)
  • server/shared/config.ts (1 hunks)
  • server/utils.ts (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • server/integrations/google/gmail/index.ts
  • server/integrations/google/gmail-worker.ts
  • server/ai/provider/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • server/package.json
🧰 Additional context used
🧬 Code Definitions (2)
server/nodeServer.ts (1)
server/utils.ts (1)
  • setCookieByEnv (49-67)
server/integrations/google/index.ts (1)
server/logger/index.ts (1)
  • getLogger (24-49)
🔇 Additional comments (5)
server/utils.ts (1)

10-11: Good addition of dotenv for environment variable management

Adding dotenv to load environment variables from a .env file is a good practice for Node.js applications. This makes the codebase more configurable and aligns with the PR objective of making server JS files run in Node.

server/config.ts (1)

2-3: Well-implemented dotenv configuration

The addition of dotenv is particularly valuable in this configuration file, which heavily relies on environment variables for model configuration, host settings, and API credentials.

server/nodeServer.ts (3)

72-72: Consider validating environment variable presence before using ! operator.

Using ! (non-null assertion) on process.env.JWT_SECRET can cause runtime errors if the variable is missing. You may opt to throw an error or provide a fallback when process.env.JWT_SECRET is undefined.

Would you like a shell script to search for all similar non-null uses of environment variables and flag them?


80-83: Reevaluate JWT-based middleware usage for OAuth callback endpoints.

You are applying AuthMiddleware to routes like /oauth/callback; typically, OAuth callbacks do not require existing credentials. If the user has no token yet, this might cause unwanted redirects. Verify that this flow doesn’t block first-time sign-ins.


312-315: Ensure built frontend files exist at the specified paths.

Routes serving static files (e.g., "/chat", "/auth") reference ./dist/index.html. Double-check that these files exist in production builds and that environment-based paths or an SSR approach won't be needed.

Comment on lines 1644 to 1651
let writer;
if (typeof process !== "undefined" && !("Bun" in globalThis)) {
// Node.js environment
writer = fs.createWriteStream(filePath)
} else {
const file = Bun.file(filePath)
writer = file.writer()
}

🛠️ Refactor suggestion

Mirror the runtime detection for file-writing logic.

The conditional creation of a fs.createWriteStream vs. Bun.file is consistent with how you differentiate worker usage. However, ensure concurrency constraints (e.g., simultaneous writes) are handled correctly. Also consider whether fallback logic is needed for other runtimes that are neither Node nor Bun.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
server/integrations/google/index.ts (1)

118-119: Remove @ts-ignore and use proper typing for the html-to-text import

The @ts-ignore comment suggests there might be a type issue with the htmlToText import. Consider resolving the type issue properly instead of suppressing it.

-// @ts-ignore
-import * as htmlToText from "html-to-text"
+import { convert } from "html-to-text"

Later in the code, you can replace htmlToText.convert() with just convert().

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d70c47 and 2e256a5.

📒 Files selected for processing (6)
  • server/config.ts (1 hunks)
  • server/db/schema.ts (1 hunks)
  • server/integrations/google/gmail/index.ts (1 hunks)
  • server/integrations/google/index.ts (3 hunks)
  • server/package.json (3 hunks)
  • server/utils.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • server/config.ts
  • server/utils.ts
  • server/integrations/google/gmail/index.ts
  • server/package.json
  • server/db/schema.ts
🧰 Additional context used
🪛 GitHub Actions: TypeScript Build Check
server/integrations/google/index.ts

[error] 833-833: error TS2339: Property 'onmessage' does not exist on type 'Worker | Worker'.

🔇 Additional comments (2)
server/integrations/google/index.ts (2)

113-115: Added Node.js compatibility with worker_threads and dotenv

The changes add support for running in a Node.js environment by importing the Node.js worker implementation and adding dotenv configuration. This aligns with the PR objective to run server JS files in Node.


1717-1725: Mirror the runtime detection for file-writing logic

The conditional creation of a fs.createWriteStream vs. Bun.file is consistent with how you differentiate worker usage.

Comment on lines 122 to 127
let gmailWorker
if (typeof process !== "undefined" && !("Bun" in globalThis)) {
gmailWorker = new NodeWorker(new URL("./gmail-worker.js", import.meta.url))
} else {
gmailWorker = new Worker(new URL("gmail-worker.ts", import.meta.url).href)
}

💡 Verification agent

🧩 Analysis chain

Runtime detection for worker creation

Good approach for conditional worker creation based on the runtime environment. However, the NodeWorker URL path points to a JS file while the fallback uses a TS file. Make sure the compilation process generates both files correctly.

Verify that the compiled JS file exists in the right location:


🏁 Script executed:

#!/bin/bash
# Check if the JS worker file exists
if [ -f server/integrations/google/gmail-worker.js ]; then
  echo "✓ JS worker file exists"
else
  echo "✗ JS worker file doesn't exist"
fi

# Check if the TS worker file exists
if [ -f server/integrations/google/gmail-worker.ts ]; then
  echo "✓ TS worker file exists"
else
  echo "✗ TS worker file doesn't exist"
fi

Length of output: 242


JS Worker File Missing for Node Environment

  • The runtime detection logic is correctly distinguishing between environments.
  • However, the compiled JS version (gmail-worker.js) is missing in the expected location (server/integrations/google), while the TS file exists.
  • Please review your build process to ensure that gmail-worker.ts is compiled to generate gmail-worker.js for non-Bun environments or adjust the instantiation accordingly.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
server/nodeServer.ts (4)

90-111: Potential for XSRF concerns with cookie-based auth.
Redirect-based checks are straightforward, but consider implementing CSRF or state-based tokens (especially on form submissions) to further mitigate cross-site request forgery.


170-193: Revisit the 2-month JWT expiry.
The token expiration is set to 60 days, which might be too long for higher-security contexts. Consider an environment-based or user-configurable token lifespan.

Example shorter expiry (e.g. 24 hours):

- exp: Math.floor(Date.now() / 1000) + 60 * 60 * 24 * 60
+ exp: Math.floor(Date.now() / 1000) + 60 * 60 * 24

310-335: Potential route consolidation for static files.
The code individually registers routes for "/", "/chat", "/auth", etc., all serving the same index.html. You could consolidate these definitions to reduce repetition if the requirement to redirect or apply distinct middleware is minimal.


386-389: Finalize graceful shutdown plan.
The uncaughtException handler logs errors but does not attempt a graceful shutdown. Consider adding a cleanup routine or SIGTERM listener.

+ process.on("SIGTERM", async () => {
+   Logger.warn("Received SIGTERM, starting graceful shutdown...")
+   server.close(() => {
+     Logger.info("Server closed. Exiting...")
+     process.exit(0)
+   })
+ })
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2e256a5 and b42cbca.

📒 Files selected for processing (3)
  • server/integrations/slack/index.ts (2 hunks)
  • server/nodeServer.ts (1 hunks)
  • server/package.json (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • server/integrations/slack/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • server/package.json
🧰 Additional context used
🧬 Code Definitions (1)
server/nodeServer.ts (1)
server/utils.ts (1)
  • setCookieByEnv (49-67)
🔇 Additional comments (5)
server/nodeServer.ts (5)

1-14: Imports look good overall.
The module imports appear to be organized and relevant. No immediate issues or duplicates are detected.


67-74: Ensure critical environment variables are defined.
The code relies on process.env.GOOGLE_CLIENT_ID, process.env.GOOGLE_CLIENT_SECRET, and process.env.JWT_SECRET with non-null assertions, which will throw an error at runtime if the environment variables are missing.

Please consider verifying these variables on startup or providing fallback/error messaging. For instance:

if (!process.env.GOOGLE_CLIENT_ID || !process.env.GOOGLE_CLIENT_SECRET || !process.env.JWT_SECRET) {
  throw new Error("Missing required environment variables.")
}

80-83: Cookie-based JWT approach looks consistent.
Using cookie: CookieName in the jwt middleware effectively checks for tokens. Ensure secure cookie settings are enforced in production (e.g., secure, httpOnly) at a global level if not already handled.


204-309: Verify domain logic & user creation flow.
The code gracefully handles existing users and workspaces, then creates new ones if not found. This seems correct, but ensure you have test coverage for edge cases (e.g., invalid or non-corporate email domains, partial user data).

To confirm correctness, you could run a specialized test script or unit/integration tests that simulate various login scenarios with different hosted domains and user states.


361-382: Re-examine WebSocket connection logic.
When multiple clients share the same connector ID, the wsConnections.set(connectorId, ws) call will overwrite existing WebSocket references. Confirm whether this is intentional or if you need to handle multiple simultaneous connections per connector.
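If multiple sockets per connector are intended, a minimal sketch using a Set per connector (helper name is illustrative, assuming the ws package and its type definitions for the Node path):

import type { WebSocket } from "ws"

const wsConnections = new Map<string, Set<WebSocket>>()

export const addConnection = (connectorId: string, ws: WebSocket): void => {
  const sockets = wsConnections.get(connectorId) ?? new Set<WebSocket>()
  sockets.add(ws)
  wsConnections.set(connectorId, sockets)
  ws.on("close", () => {
    // drop the socket on close so the map does not leak stale references
    sockets.delete(ws)
    if (sockets.size === 0) wsConnections.delete(connectorId)
  })
}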

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
server/integrations/google/gmail-worker.ts (2)

31-32: Remove or justify the //@ts-ignore usage.

Currently, //@ts-ignore bypasses type checking for the htmlToText import. Consider installing or configuring type definitions so you can safely remove the ignore comment and ensure type-safety throughout the file.


88-91: Confirm completeness of error handling for Node.js worker.

You're logging the worker's errors, which is a good start. Consider whether these errors need to bubble up or trigger any specific fallback or cleanup actions. If not, this is fine as is.

server/integrations/google/index.ts (2)

118-119: Avoid //@ts-ignore if possible.

Similar to gmail-worker.ts, installing proper type definitions or refactoring the import for "html-to-text" would make the code more robust.


1746-1753: Cross-environment file writer logic is well-defined.

Switching to fs.createWriteStream in Node or Bun.file(...).writer() otherwise facilitates a flexible approach. Confirm large file writes and concurrency are tested in both environments to avoid partial writes or performance bottlenecks.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b42cbca and c24d7ed.

📒 Files selected for processing (2)
  • server/integrations/google/gmail-worker.ts (3 hunks)
  • server/integrations/google/index.ts (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
server/integrations/google/index.ts (1)
server/logger/index.ts (1)
  • getLogger (24-49)
🔇 Additional comments (6)
server/integrations/google/gmail-worker.ts (3)

44-44: Node.js Worker import looks good.

The import from "worker_threads" is necessary for Node.js execution. Just confirm that the rest of this file is not bundled for non-Node runtimes, to avoid runtime exceptions.


69-87: Validate parentPort usage in Node environment.

Using parentPort inside the if (typeof process !== "undefined" && !("Bun" in globalThis)) check is correct for a Node-based worker. Ensure no messages are lost if parentPort is unexpectedly null or undefined. Additionally, confirm that the rest of the code handles concurrency or message queuing as needed for large-scale message processing.
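A minimal sketch of the guard, assuming the Node worker entry point:

import { parentPort } from "worker_threads"

if (parentPort) {
  parentPort.on("message", (msg: unknown) => {
    // handle the incoming message here
  })
} else {
  // the file was loaded outside a worker context; fail loudly instead of dropping messages
  console.error("gmail-worker: no parentPort available, exiting")
  process.exit(1)
}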


92-116: Fallback to self.onmessage in non-Node environments looks correct.

The browser-like logic with self.onmessage and self.onerror is consistent. Just ensure that no race conditions occur when switching between environment checks. If you ever expand to additional runtimes (e.g., Deno), revisit this guard.

server/integrations/google/index.ts (3)

113-115: New imports for Node worker and environment config appear correct.

Using Worker as NodeWorker from "worker_threads" and loading .env is standard for Node environments. Ensure .env usage is idempotent if this file is imported multiple times.


122-139: Conditional worker logic for Node vs. Bun vs. browser is well-structured.

This ensures that a .js worker is used in Node and a .ts worker is used otherwise. Verify that the build process reliably generates gmail-worker.js in the same folder, preventing file-not-found issues in production.


839-876: Worker event handling is consistent with gmail-worker.ts.

Centralizing the logic in setupGmailWorkerHandler is good design. Confirm concurrency or parallel usage is handled gracefully (e.g., if multiple trackers attach simultaneously). Otherwise, this looks healthy.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (7)
server/nodeServer.ts (7)

1-69: Imports look well-structured, but consider organizing them better.

The imports are comprehensive and include all necessary dependencies for the Node server implementation. However, consider grouping related imports together (e.g., all Hono-related imports, all authentication-related imports) to improve readability.


68-69: Consider moving dotenv configuration to the beginning of the file.

For environment variables to be properly loaded before any code uses them, the dotenv configuration should ideally be at the very top of the file, before any imports that might use environment variables.

+import dotenv from "dotenv"
+dotenv.config()
 import { type Context, Hono, type Next } from "hono"
 import {
   AnswerApi,
   AutocompleteApi,
   // ... other imports
 }
-import dotenv from "dotenv"
-dotenv.config()

210-210: Consider reducing JWT token expiration time.

The current token expiration is set to 60 days, which is quite long from a security perspective. Consider reducing this to a more reasonable timeframe (e.g., 1-7 days) and implementing a refresh token mechanism.

-  exp: Math.floor(Date.now() / 1000) + 60 * 60 * 24 * 60, // Token expires in 2 months
+  exp: Math.floor(Date.now() / 1000) + 60 * 60 * 24 * 7, // Token expires in 7 days

248-249: Remove @ts-ignore and properly type the user object.

Using @ts-ignore suppresses TypeScript errors which might hide potential issues. Instead, properly type the user object to handle the hd property.

-    // @ts-ignore
-    let domain = user.hd
+    // Define a more complete type for the Google user object
+    let domain = (user as { hd?: string }).hd

357-362: Handle initialization errors more gracefully.

The current initialization immediately throws an error on failure, which would crash the application. Consider implementing a more graceful error handling approach.

 export const init = async () => {
   await initQueue()
 }
-init().catch((error) => {
-  throw new InitialisationError({ cause: error })
-})
+init().catch((error) => {
+  Logger.error(
+    error,
+    "Failed to initialize the application. Attempting to continue with limited functionality."
+  )
+  // Consider implementing fallback behavior or graceful degradation
+})

407-410: Implement graceful shutdown for uncaught exceptions.

The current error handler for uncaught exceptions only logs the error. Consider implementing a graceful shutdown mechanism to prevent the application from continuing in an inconsistent state.

 process.on("uncaughtException", (error) => {
   Logger.error(error, "uncaughtException")
-  // shutdown server?
+  // Attempt to close resources gracefully
+  Logger.info("Initiating graceful shutdown due to uncaught exception")
+  
+  // Close the WebSocket server
+  wss.close(() => {
+    Logger.info("WebSocket server closed")
+    
+    // Close the HTTP server
+    server.close(() => {
+      Logger.info("HTTP server closed")
+      
+      // Exit the process with an error code
+      process.exit(1)
+    })
+  })
+  
+  // Force exit if graceful shutdown takes too long
+  setTimeout(() => {
+    Logger.error("Forced shutdown due to timeout")
+    process.exit(2)
+  }, 10000) // 10 seconds timeout
 })

1-411: Consider breaking down this large file into smaller modules.

This file handles multiple responsibilities including routing, authentication, WebSocket handling, and server initialization. Consider splitting it into smaller, more focused modules to improve maintainability.

Potential modules (a wiring sketch follows the list):

  • auth.ts - Authentication middleware and logic
  • routes.ts - API routes configuration
  • websocket.ts - WebSocket server setup and handlers
  • server.ts - Main server initialization and configuration
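A rough sketch of how the entry point could look after such a split; the module names follow the list above, while their exports (authMiddleware, registerRoutes, attachWebSocket) are hypothetical and not part of the PR:

// Hypothetical wiring after the split; the imported helpers are illustrative only.
import { Hono } from "hono"
import { authMiddleware } from "./auth"
import { registerRoutes } from "./routes"
import { attachWebSocket } from "./websocket"

const app = new Hono()
app.use("*", authMiddleware)   // auth.ts: authentication middleware and logic
registerRoutes(app)            // routes.ts: API routes configuration
attachWebSocket(app)           // websocket.ts: WebSocket server setup and handlers

export default app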
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c24d7ed and 1ce3563.

📒 Files selected for processing (4)
  • .gitignore (1 hunks)
  • server/nodeServer.ts (1 hunks)
  • server/package.json (4 hunks)
  • server/tsconfig.json (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • server/tsconfig.json
  • server/package.json
🧰 Additional context used
🧬 Code Definitions (1)
server/nodeServer.ts (1)
server/utils.ts (1)
  • setCookieByEnv (49-67)
🔇 Additional comments (1)
.gitignore (1)

122-122: Appropriate addition for server build output directory.

The addition of distServer to .gitignore is consistent with the project structure and aligns with the PR objective to run server JS files in Node. This directory will contain the compiled server-side TypeScript files.

@kalpadhwaryu kalpadhwaryu changed the title from "Run Server JS files in Node" to "feat(server): add support for running server in Node" on Apr 2, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
server/server.ts (4)

67-73: Consider loading environment variables earlier.
Calling dotenv.config() near the top of the file is a common convention to ensure environment variables are available before usage.


345-345: Handle missing file scenario.
If ./dist/index.html doesn’t exist at runtime, serving will fail silently. Consider adding a fallback or error handling for production robustness.
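One way to make that failure visible, sketched under the assumption that the app and Logger instances from server.ts are in scope; the path and the 503 response are illustrative, not the PR's behavior:

// Sketch: fail loudly instead of silently when the client build is missing.
// `app` and `Logger` are assumed to be the instances already defined in server.ts.
import { existsSync } from "node:fs"
import { readFile } from "node:fs/promises"

const indexPath = "./dist/index.html" // assumed build output location

app.get("/", async (c) => {
  if (!existsSync(indexPath)) {
    Logger.error(`Missing ${indexPath}; was the frontend built?`)
    return c.text("Frontend build not found", 503)
  }
  return c.html(await readFile(indexPath, "utf-8"))
})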


348-352: Static serving repeated across routes.
You may reduce duplication by consolidating routes that serve the same index.html file under a single pattern-based handler.
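For example, the repeated registrations could collapse into a loop; the route list below is a guess at the affected paths and serveStatic is assumed to be the middleware already imported in this file:

// Sketch: register the same index.html handler once per SPA route (paths assumed).
const spaRoutes = ["/", "/auth", "/search", "/chat/:chatId", "/integrations"]
for (const route of spaRoutes) {
  app.get(route, serveStatic({ path: "./dist/index.html" }))
}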


447-480: Unify Bun vs Node WSS logic and consider authentication.
Both blocks duplicate WebSocket setup code. Optionally consolidate into a shared utility and add authentication checks for consistent security, matching the rest of the app’s Auth approach.
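A shared handler module is one way to do that; the function names and the minimal socket type below are invented for illustration, Logger and wsConnections are the existing instances from server.ts and metricStream.ts, and authentication is still left to the caller:

// Sketch of shared WebSocket handlers reused by both the Node (`ws`) and Bun
// (`hono/bun`) branches; names and the socket shape are assumptions.
type MinimalSocket = { send(data: string): void }

export function onWsOpen(connectorId: string | undefined, ws: MinimalSocket) {
  Logger.info(`Websocket connection with id ${connectorId}`)
  if (connectorId) wsConnections.set(connectorId, ws)
}

export function onWsMessage(ws: MinimalSocket, data: unknown) {
  Logger.info(`Message from client: ${data}`)
  ws.send(JSON.stringify({ message: "Hello from server!" }))
}

export function onWsClose(connectorId: string | undefined) {
  Logger.info("Connection closed")
  if (connectorId) wsConnections.delete(connectorId)
}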

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ce3563 and 7e8877c.

📒 Files selected for processing (1)
  • server/server.ts (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
server/server.ts (1)
server/integrations/metricStream.ts (1)
  • wsConnections (3-3)
🔇 Additional comments (9)
server/server.ts (9)

65-66: No issues with importing StatusCode.
It is used in the typed WebSocket route definitions below, so it’s valid to keep.


336-342: Check support for top-level await.
This dynamic import approach is valid if your runtime supports top-level await. Otherwise, consider refactoring to an async initialization function or ensuring an ESM environment.
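If top-level await turns out to be unavailable, wrapping the dynamic import in a small async helper is one alternative; the package specifiers below are inferred from the review and may differ from the PR:

// Sketch: resolve the runtime-specific static middleware without top-level await.
async function resolveServeStatic(isNode: boolean) {
  if (isNode) {
    const { serveStatic } = await import("@hono/node-server/serve-static")
    return serveStatic
  }
  const { serveStatic } = await import("hono/bun")
  return serveStatic
}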


347-347: Routing looks consistent.
No authentication is required on /auth as expected.


356-356: OK to serve index on chat route.
No evident issues.


361-362: No concerns here.
Auth is applied consistently on /integrations.


366-367: Matches the existing pattern.
Nothing unusual.


368-368: Serving /oauth/success is fine.
This route looks correct.


371-371: Asset serving is straightforward.
Ensure the ./dist folder is properly populated during build.


380-403: Verify usage of WsApp in Node environment.
WsApp is only assigned in the Bun branch. If this is intended, confirm that no references to WsApp exist for Node. Otherwise, consider a fallback assignment for Node.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
server/server.ts (1)

391-433: ⚠️ Potential issue

WebSocket connections in Node lack authentication.

This open /ws endpoint may be a security concern if your application expects authenticated clients. Implement authentication (e.g., checking tokens) before upgrading to websockets.

Consider adding authentication middleware similar to what's used for your HTTP routes:

server.on("upgrade", (req, socket, head) => {
  const { pathname } = parse(req.url, true)

  if (pathname === "/ws") {
+   // Extract and verify auth token from request
+   const cookies = req.headers.cookie?.split(';').reduce((acc, cookie) => {
+     const [key, value] = cookie.trim().split('=');
+     acc[key] = value;
+     return acc;
+   }, {});
+   
+   const authToken = cookies?.[CookieName];
+   
+   if (!authToken) {
+     socket.write('HTTP/1.1 401 Unauthorized\r\n\r\n');
+     socket.destroy();
+     return;
+   }
+   
+   try {
+     // Verify JWT token
+     const decoded = jwt.verify(authToken, jwtSecret);
+     // Store user info for later use
+     req.user = decoded;
      // If the request is for the /ws endpoint, upgrade the connection.
      wss.handleUpgrade(req, socket, head, (ws) => {
        wss.emit("connection", ws, req)
      })
+   } catch (err) {
+     socket.write('HTTP/1.1 401 Unauthorized\r\n\r\n');
+     socket.destroy();
+   }
  } else {
    // Otherwise, destroy the socket.
    socket.destroy()
  }
})
🧹 Nitpick comments (1)
server/server.ts (1)

367-389: Simplify the WebSocket type definition.

The type definition for WsApp is overly complex for its usage and contains duplicate types. Consider simplifying it for better maintainability.

-export let WsApp: Hono<
-  {
-    Variables: Variables
-  },
-  {
-    "/ws": {
-      $get:
-        | {
-            input: {}
-            output: {}
-            outputFormat: "ws"
-            status: StatusCode
-          }
-        | {
-            input: {}
-            output: {}
-            outputFormat: "ws"
-            status: StatusCode
-          }
-    }
-  },
-  "/"
->
+export let WsApp: Hono<
+  {
+    Variables: Variables
+  },
+  {
+    "/ws": {
+      $get: {
+        input: {}
+        output: {}
+        outputFormat: "ws"
+        status: StatusCode
+      }
+    }
+  },
+  "/"
+>
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e8877c and 2448bb8.

📒 Files selected for processing (1)
  • server/server.ts (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
server/server.ts (1)
server/integrations/metricStream.ts (1)
  • wsConnections (3-3)
🔇 Additional comments (3)
server/server.ts (3)

65-73: Good job adding Node.js environment support.

The imports for Node.js-specific packages and the dotenv configuration look good. This change effectively enables the server to run in a Node.js environment in addition to Bun.


336-345: Well-structured conditional static file serving.

The approach to conditionally load the appropriate static file middleware based on the runtime environment is good. The different path structures (../ prefix in Node.js) properly account for the differences in how working directories are handled in the two environments.


338-339:

❓ Verification inconclusive

Verify Node.js static file paths.

Please confirm that the static file paths for Node.js are correct relative to the server's execution context. The paths include ../ which assumes the server runs from a subdirectory.


🏁 Script executed:

#!/bin/bash
# Check the execution directory and relative paths

# Print the current working directory
echo "Current working directory: $(pwd)"

# List directories to verify structure
echo "Directory structure:"
ls -la

# Check if these paths exist
echo "Checking for ../dist/index.html:"
test -f "../dist/index.html" && echo "File exists" || echo "File does not exist"

echo "Checking for ./dist/index.html:"
test -f "./dist/index.html" && echo "File exists" || echo "File does not exist" 

Length of output: 1956


Action: Verify Node.js Static Asset Paths in Production

  • The execution test was run from the repository root (/home/jailuser/git), which shows that neither ../dist/index.html nor ./dist/index.html exists in that context.
  • If the server is intended to run from the server directory, the relative path ../dist/index.html would resolve from that subdirectory (assuming a dist folder exists at the repository root). Please confirm that the server’s working directory is correctly set when it is launched.
  • If the server runs from the repository root or another location, the provided paths will not locate the static asset. Adjust the paths accordingly or change the working directory at runtime so that nodeServeStatic({ root: "../", path: "./dist/index.html" }) correctly resolves to the built static file.

Comment on lines +444 to +466
WsApp = app.get(
  "/ws",
  upgradeWebSocket((c) => {
    let connectorId: string | undefined
    return {
      onOpen(event, ws) {
        connectorId = c.req.query("id")
        Logger.info(`Websocket connection with id ${connectorId}`)
        wsConnections.set(connectorId, ws)
      },
      onMessage(event, ws) {
        Logger.info(`Message from client: ${event.data}`)
        ws.send(JSON.stringify({ message: "Hello from server!" }))
      },
      onClose: (event, ws) => {
        Logger.info("Connection closed")
        if (connectorId) {
          wsConnections.delete(connectorId)
        }
      },
    }
  }),
)

🛠️ Refactor suggestion

Add error handling for Bun WebSocket connections.

Similar to the Node.js implementation, the Bun WebSocket implementation should also include error handling.

WsApp = app.get(
  "/ws",
  upgradeWebSocket((c) => {
    let connectorId: string | undefined
    return {
      onOpen(event, ws) {
        connectorId = c.req.query("id")
        Logger.info(`Websocket connection with id ${connectorId}`)
        wsConnections.set(connectorId, ws)
      },
      onMessage(event, ws) {
        Logger.info(`Message from client: ${event.data}`)
        ws.send(JSON.stringify({ message: "Hello from server!" }))
      },
      onClose: (event, ws) => {
        Logger.info("Connection closed")
        if (connectorId) {
          wsConnections.delete(connectorId)
        }
      },
+     onError: (event, ws) => {
+       Logger.error(`WebSocket error: ${event.message}`)
+       if (connectorId) {
+         wsConnections.delete(connectorId)
+       }
+     },
    }
  }),
)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
server/server.ts (2)

391-433: ⚠️ Potential issue

Add error handling for WebSocket connections.

The Node.js WebSocket implementation doesn't include error handling for the socket connections. This could lead to unhandled exceptions and potential server crashes.

wss.on("connection", (ws, req) => {
  const { query } = parse(req.url!, true)
  const connectorId = query.id

  Logger.info(`WebSocket connection with id ${connectorId}`)

  if (connectorId) {
    wsConnections.set(connectorId, ws)
  }

  ws.on("message", (message) => {
    Logger.info(`Message from client: ${message}`)
    ws.send(JSON.stringify({ message: "Hello from server!" }))
  })

  ws.on("close", () => {
    Logger.info("Connection closed")
    if (connectorId) {
      wsConnections.delete(connectorId)
    }
  })
+  
+  ws.on("error", (error) => {
+    Logger.error(error, "WebSocket connection error")
+    if (connectorId) {
+      wsConnections.delete(connectorId)
+    }
+  })
})

444-466: ⚠️ Potential issue

Add error handling for Bun WebSocket connections.

Similar to the Node.js implementation, the Bun WebSocket implementation should also include error handling.

WsApp = app.get(
  "/ws",
  upgradeWebSocket((c) => {
    let connectorId: string | undefined
    return {
      onOpen(event, ws) {
        connectorId = c.req.query("id")
        Logger.info(`Websocket connection with id ${connectorId}`)
        wsConnections.set(connectorId, ws)
      },
      onMessage(event, ws) {
        Logger.info(`Message from client: ${event.data}`)
        ws.send(JSON.stringify({ message: "Hello from server!" }))
      },
      onClose: (event, ws) => {
        Logger.info("Connection closed")
        if (connectorId) {
          wsConnections.delete(connectorId)
        }
      },
+     onError: (event, ws) => {
+       Logger.error(`WebSocket error: ${event.message}`)
+       if (connectorId) {
+         wsConnections.delete(connectorId)
+       }
+     },
    }
  }),
)
server/integrations/google/index.ts (1)

1749-1756: Smart conditional file writing implementation.

The code now handles file writing operations in both Node.js and Bun environments, using fs.createWriteStream for Node.js and Bun.file(filePath).writer() for Bun.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2448bb8 and 7ef09f0.

📒 Files selected for processing (6)
  • server/ai/provider/index.ts (1 hunks)
  • server/config.ts (1 hunks)
  • server/integrations/google/index.ts (3 hunks)
  • server/package.json (4 hunks)
  • server/server.ts (3 hunks)
  • server/utils.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • server/config.ts
  • server/utils.ts
  • server/ai/provider/index.ts
  • server/package.json
🧰 Additional context used
🧬 Code Definitions (2)
server/integrations/google/index.ts (1)
server/logger/index.ts (1)
  • getLogger (24-49)
server/server.ts (1)
server/integrations/metricStream.ts (1)
  • wsConnections (3-3)
🔇 Additional comments (5)
server/server.ts (3)

67-73: Good addition of Node.js server dependencies.

The PR adds necessary imports to support running the server in Node.js environment, including the Node server implementation, WebSocket support, and URL parsing utilities.


336-345: Good implementation of conditional static file serving.

The code intelligently selects the appropriate static file serving middleware based on the runtime environment. Note the path differences between Node.js (../dist) and Bun (./dist) environments, which accommodates the different working directory contexts in each runtime.


367-389: Type declaration updated correctly for WsApp.

The change from const to let for WsApp makes sense since its value is conditionally assigned later in the code based on runtime environment.

server/integrations/google/index.ts (2)

114-116: Good addition of Node.js worker and environment setup.

Added the necessary imports to support running the application in Node.js, using the native worker_threads module and configuring environment variables with dotenv.


842-879: Good implementation of cross-runtime worker message handling.

The function setupGmailWorkerHandler is well-implemented to handle worker messages in both Node.js and Bun environments, correctly using the appropriate event handling patterns for each runtime.

Comment on lines 123 to 142
let gmailWorker
if (typeof process !== "undefined" && !("Bun" in globalThis)) {
  gmailWorker = new NodeWorker(new URL("gmail-worker.js", import.meta.url))
  Logger.info("Gmail worker initialized")
  gmailWorker.on("error", (error) => {
    Logger.error(
      error,
      `[NODE] Error in main thread: worker: ${JSON.stringify(error)}`,
    )
  })
} else {
  gmailWorker = new Worker(new URL("gmail-worker.ts", import.meta.url).href)
  Logger.info("Gmail worker initialized")
  gmailWorker.onerror = (error: ErrorEvent) => {
    Logger.error(
      error,
      `Error in main thread: worker: ${JSON.stringify(error)}`,
    )
  }
}

💡 Verification agent

🧩 Analysis chain

Well-structured conditional worker initialization.

The code now correctly detects the runtime environment and initializes the Gmail worker accordingly, using the appropriate Worker implementation and file type (JS for Node.js, TS for Bun).

Run the following script to verify that the compiled JS worker file exists for the Node environment:


🏁 Script executed:

#!/bin/bash
# Check if the JS worker file exists
if [ -f server/integrations/google/gmail-worker.js ]; then
  echo "✓ JS worker file exists"
else
  echo "✗ JS worker file doesn't exist - make sure build process generates this file"
fi

# Check if the TS worker file exists
if [ -f server/integrations/google/gmail-worker.ts ]; then
  echo "✓ TS worker file exists"
else
  echo "✗ TS worker file doesn't exist"
fi

Length of output: 334


Attention: Missing Compiled JS Worker File for Node Environment

The worker initialization logic is well structured and correctly differentiates between the Node and Bun environments. However, our verification shows that the compiled JavaScript worker file (server/integrations/google/gmail-worker.js) is missing. This file is required for the Node environment execution. Please ensure that your build process generates the gmail-worker.js file, or adjust the worker initialization to point to the correct file if the expected output is different.

  • Action Required: Update the build process to compile and emit gmail-worker.js for Node.js, or confirm that the intended file is correctly referenced.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (1)
server/server.ts (1)

445-468: Add onError handler to the Bun WebSocket.
This is a duplicate of a previous suggestion regarding error handling for Bun WebSocket.

🧹 Nitpick comments (4)
server/search/types.ts (1)

57-60: Add a brief doc comment for the Env enum.
This helps other developers quickly understand the usage context of each environment type.

docs/README.md (1)

7-21: Streamline instructions to address redundancy.
Replace “Inside of the `server` folder” with “Inside the `server` folder” to align with the style suggestion from the static analysis tool and remove redundant wording.

- - Inside of the `server` folder, instead of running `bun run dev`
+ - Inside the `server` folder, instead of running `bun run dev`
🧰 Tools
🪛 LanguageTool

[style] ~11-~11: This phrase is redundant. Consider using “Inside”.
Context: ...cept the step of starting the server. - Inside of the server folder, instead of running...

(OUTSIDE_OF)

server/server.ts (1)

67-73: Consider using the WHATWG URL API.
Node’s url.parse is deprecated in favor of the URL constructor. Switching to new URL(req.url, ...) can provide better reliability.
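A sketch of the replacement inside the upgrade/connection handlers; req is assumed to be the IncomingMessage, and the fallback base origin is a placeholder needed because req.url is path-only:

// Sketch: WHATWG URL instead of the deprecated url.parse.
const url = new URL(req.url ?? "/", `http://${req.headers.host ?? "localhost"}`)
const pathname = url.pathname                  // replaces parse(req.url, true).pathname
const connectorId = url.searchParams.get("id") // replaces query.id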

server/integrations/google/gmail-worker.ts (1)

33-33: Remove ts-ignore if possible.
Consider installing type definitions for "html-to-text" or adjusting the import to avoid the //@ts-ignore directive and ensure type-safety.
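If upgrading the package or installing published type definitions is not an option, a local ambient declaration is a fallback; the declared surface below is a minimal assumption covering only a convert-style entry point and should be adjusted to whatever the worker actually imports:

// e.g. server/types/html-to-text.d.ts (hypothetical file) so the @ts-ignore can go.
declare module "html-to-text" {
  export function convert(html: string, options?: Record<string, unknown>): string
}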

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ef09f0 and aac0b08.

📒 Files selected for processing (8)
  • docs/README.md (1 hunks)
  • server/ai/provider/index.ts (1 hunks)
  • server/integrations/google/gmail-worker.ts (4 hunks)
  • server/integrations/google/index.ts (5 hunks)
  • server/package.json (3 hunks)
  • server/search/types.ts (1 hunks)
  • server/server.ts (4 hunks)
  • server/utils.ts (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • server/ai/provider/index.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • server/package.json
  • server/utils.ts
🧰 Additional context used
🧬 Code Graph Analysis (2)
server/server.ts (2)
server/utils.ts (1)
  • getEnvironment (12-18)
server/integrations/metricStream.ts (1)
  • wsConnections (3-3)
server/integrations/google/index.ts (2)
server/utils.ts (1)
  • getEnvironment (12-18)
server/logger/index.ts (1)
  • getLogger (24-49)
🪛 LanguageTool
docs/README.md

[style] ~11-~11: This phrase is redundant. Consider using “Inside”.
Context: ...cept the step of starting the server. - Inside of the server folder, instead of running...

(OUTSIDE_OF)

🔇 Additional comments (18)
server/server.ts (11)

49-49: LGTM for new import.
No issues spotted. The reference to getEnvironment and setCookieByEnv clarifies the environment usage.


65-66: LGTM for new StatusCode import.
No concerns regarding the addition of StatusCode.


74-74: LGTM on environment detection.
This line clearly conveys the environment is determined at runtime.


337-347: Verify the static file paths.
Ensuring that the relative paths ../ and ./dist remain correct after build is crucial. If your Node environment runs from a different working directory, these might fail.


349-349: Confirm whether authentication is desired for the root path.
Applying AuthRedirect here means the landing page is accessible only to authorized users. If this is intentional, proceed; otherwise, consider removing the AuthRedirect.


351-351: LGTM on allowing open access to auth route.
This approach ensures that the login page remains accessible without requiring prior authentication.


353-353: LGTM on protecting the /search route.
User must be authenticated before searching, which is appropriate for restricting private data access.


355-355: LGTM on parameterized chat route.
This ensures that any chat detail path remains protected via authentication.


356-356: LGTM on requiring authentication for admin integrations.
Restricting admin routes is important for security.


359-359: LGTM on unprotected static assets.
Serving static assets without authentication is standard practice.


368-390: Confirm the need for a mutable export.
Exporting as let is correct if you're reassigning WsApp in the Bun environment block; otherwise, prefer const.

server/integrations/google/gmail-worker.ts (4)

10-10: LGTM for importing Env.
No issues spotted.


31-31: LGTM for unified environment logic import.
Centralizing environment-related logic is a sound approach.


45-45: LGTM for worker_threads import.
Enabling Node-specific features for the worker aligns with environment detection logic.


70-119: LGTM on environment-based message handling.
Separate flows for Node.js (parentPort) and the default self.onmessage ensure broader portability of the Gmail worker.
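The pattern being praised is roughly the following; the handler body is elided, the real file gates on getEnvironment(), and here the presence of parentPort simply stands in for "running under worker_threads":

// Sketch of the dual listener setup: worker_threads in Node, onmessage otherwise.
import { parentPort } from "node:worker_threads"

function handleMessage(msg: unknown) {
  // ...process the Gmail ingestion job (elided)...
}

if (parentPort) {
  // Node: worker_threads message channel
  parentPort.on("message", handleMessage)
} else {
  // Bun / web-worker style global
  ;(globalThis as unknown as { onmessage: (e: { data: unknown }) => void }).onmessage =
    (e) => handleMessage(e.data)
}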

server/integrations/google/index.ts (3)

29-29: Environment detection imports and setup look good.

The imports for environment detection (getEnvironment, Env) and Node worker support (Worker as NodeWorker) have been properly added. The ES module syntax for the HTML-to-text library is also appropriate for better compatibility.

Also applies to: 72-72, 115-117, 120-121


750-757: Environment-aware file writer creation.

The conditional creation of a writer based on the runtime environment is consistent with how the worker is handled, using fs.createWriteStream for Node and Bun.file for Bun.
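The described branching amounts to something like this; the helper name is invented, the Bun global requires bun-types, and callers must handle both return shapes (Node WriteStream vs Bun FileSink):

// Sketch of environment-aware writer creation (helper name is an assumption).
import fs from "node:fs"

function createWriter(filePath: string, isNode: boolean) {
  return isNode
    ? fs.createWriteStream(filePath) // Node: stream with .write()/.end()
    : Bun.file(filePath).writer()    // Bun: FileSink with .write()/.end()/.flush()
}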


844-879: Robust event handler setup for cross-runtime workers.

The worker message handling is properly set up for both environments, using the appropriate event binding methods (on("message") for Node vs onmessage for Bun) and correctly handles type casting.

Comment on lines +392 to +434
if (environment === Env.Node) {
  const server = serve({ fetch: app.fetch, port: config.port })

  const wss = new WebSocketServer({ noServer: true })

  server.on("upgrade", (req, socket, head) => {
    const { pathname } = parse(req.url, true)

    if (pathname === "/ws") {
      // If the request is for the /ws endpoint, upgrade the connection.
      wss.handleUpgrade(req, socket, head, (ws) => {
        wss.emit("connection", ws, req)
      })
    } else {
      // Otherwise, destroy the socket.
      socket.destroy()
    }
  })

  wss.on("connection", (ws, req) => {
    const { query } = parse(req.url!, true)
    const connectorId = query.id

    Logger.info(`WebSocket connection with id ${connectorId}`)

    if (connectorId) {
      wsConnections.set(connectorId, ws)
    }

    ws.on("message", (message) => {
      Logger.info(`Message from client: ${message}`)
      ws.send(JSON.stringify({ message: "Hello from server!" }))
    })

    ws.on("close", () => {
      Logger.info("Connection closed")
      if (connectorId) {
        wsConnections.delete(connectorId)
      }
    })
  })
  Logger.info(`[NODE] listening on port: ${config.port}`)
} else {

⚠️ Potential issue

Add error handling for Node.js WebSocket connections.
The Node.js WebSocket implementation doesn’t include an error event listener. This could lead to unhandled exceptions and potential server crashes during edge cases.

Comment on lines +435 to +444
const { createBunWebSocket } = await import("hono/bun")
const { upgradeWebSocket, websocket } = createBunWebSocket<ServerWebSocket>()
const server = Bun.serve({
  fetch: app.fetch,
  port: config.port,
  websocket,
  idleTimeout: 180,
})
Logger.info(`[BUN] listening on port: ${config.port}`)


🛠️ Refactor suggestion

Consider adding an onError handler for Bun WebSocket.
Similar to the Node environment, the Bun WebSocket might need explicit error handling for stable operation.

Comment on lines +124 to +143
let gmailWorker
if (environment === Env.Node) {
  gmailWorker = new NodeWorker(new URL("gmail-worker.js", import.meta.url))
  Logger.info("Gmail worker initialized")
  gmailWorker.on("error", (error) => {
    Logger.error(
      error,
      `[NODE] Error in main thread: worker: ${JSON.stringify(error)}`,
    )
  })
} else {
  gmailWorker = new Worker(new URL("gmail-worker.ts", import.meta.url).href)
  Logger.info("Gmail worker initialized")
  gmailWorker.onerror = (error: ErrorEvent) => {
    Logger.error(
      error,
      `Error in main thread: worker: ${JSON.stringify(error)}`,
    )
  }
}

💡 Verification agent

🧩 Analysis chain

Well-structured conditional worker initialization.

The code now correctly detects the runtime environment and initializes the Gmail worker accordingly, using the appropriate Worker implementation and file type (JS for Node.js, TS for Bun).

Run the following script to verify that the compiled JS worker file exists for the Node environment:


🏁 Script executed:

#!/bin/bash
# Check if the JS worker file exists
if [ -f server/integrations/google/gmail-worker.js ]; then
  echo "✓ JS worker file exists"
else
  echo "✗ JS worker file doesn't exist - make sure build process generates this file"
fi

# Check if the TS worker file exists
if [ -f server/integrations/google/gmail-worker.ts ]; then
  echo "✓ TS worker file exists"
else
  echo "✗ TS worker file doesn't exist"
fi

Length of output: 334


Attention: Missing Compiled JS Worker File in Node Environment

  • The conditional initialization of the Gmail worker is well-structured, distinguishing between the Node and non-Node environments.
  • However, the verification script indicates that the required gmail-worker.js file (used in the Node environment via NodeWorker) does not exist in server/integrations/google/.
  • The TypeScript worker file (gmail-worker.ts) is present, meaning the build process is not generating the corresponding compiled JS file for Node.
  • Action Required: Please ensure that the build process generates the gmail-worker.js file or adjust the Node worker initialization to load the appropriate file if this was an oversight.
