Skip to content
This repository was archived by the owner on Aug 1, 2025. It is now read-only.

Conversation

@ichim-david
Copy link
Contributor

@ichim-david ichim-david commented Jul 7, 2025

Refs CODY-5219

Allow selecting of branches before choosing a file or a directory when using the remote file and directory option

Test plan

  • Test remote directories that are able to be referenced
  • Test that if you select a remote file or directory you can select a branch or search for a branch not found in that list (first 10) if there are many branches and then use the selected context from that branch

This commit enhances the mention menu to display the branch name
associated with the context item, improving user clarity.

- Adds a `getBranchHelpText` function to determine the appropriate help text based on available branch information.
- The function checks if the branch is specified in the mention data or the query string.
- Updates the MentionMenu component to use the `getBranchHelpText` function to display the branch name.
- Adds unit tests for the `getBranchHelpText` function and `extractRepoAndBranch` function to ensure correct behavior.
- Modifies `extractRepoAndBranch` to support parsing repo and branch from strings like "repo@branch", "repo", or "repo:directory@branch".
- Updates the query to include the branch when searching for directories.
This commit introduces several enhancements to the remote directory search functionality, improving the accuracy and usability of directory context within the Cody extension.

- Improves the parsing of repo and branch from the query string in `extractRepoAndBranch` function.
- Adds tests for `github.com/mrdoob/three.js@dev` parsing.
- Implements branch selection for root directory search.
- Implements branch selection with directory path filtering.
- Introduces `getDirectoryContents` query to fetch directory contents.
- Updates `getDirectoryItem` to use `getDirectoryContents` and filter directory entries.
- Adds DIRECTORY_CONTENTS_QUERY to GraphQL queries.
- Adds `getDirectoryContents` method to `SourcegraphGraphQLAPIClient`.
path
byteSize
url
content
Copy link
Contributor

@peterguy peterguy Jul 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Examining the rest of the PR, it looks like you don't need content. content is pretty expensive, so you should use it only when you need the file contents immediately. If you think you may need the file contents, you can use rawURL instead of content and then use that URL to retrieve the content.

Additionally, examine whether you need the blob at all - it looks like this PR is using the directories, but not the files, so you may be able to ditch the file info (GitBlob) completely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to the blob comment

@ichim-david
Copy link
Contributor Author

@peterguy I'm looking at addressing your feedback.
I noticed that when we perform the remote directories mention we use the CONTEXT_SEARCH_QUERY_WITH_RANGES query
https://github.com/sourcegraph/cody/blob/main/lib/shared/src/sourcegraph-api/graphql/queries.ts#L259
which just does a file chunk reply instead of using the entire file.
Ex:
search-context-with-range-results

If we do just a remote file then we use the entire file as context.
Loom Screenshot 2025-07-13 at 03 28 42

The new getDirectory with it's conditions would of returned only the files found in that folder and not it's children
getDirec
and I think contents help here in order to use the entire file content but I could be wrong.

I know that if I manually add some context I would expect for the entire file to be read instead of just chunks, was trying this on Cursor and although the folder is local and not remote it was able to properly read the contents of all 4 files that were found in that directory.
cursor-mention-directories

If we want to keep this light chunk but more encompassing search that finds files also in children directories then I will just look at adding the information regarding the branch searching and selecting and keep the remote directory context with the file chunk as it is.

@peterguy
Copy link
Contributor

peterguy commented Jul 16, 2025

@ichim-david in general, even if you’re using file contents, you should return rawURL and make a subsequent call to that URL, rather than adding the burden of the file contents to the GQL query.

…earch

This commit enhances the retrieval of remote directory items by utilizing context search for improved accuracy and performance. It also introduces a fallback mechanism to `getDirectoryContents` if context search returns no results.

- Implements context search for retrieving directory items using `graphqlClient.contextSearch`.
- Adds a fallback mechanism to `getDirectoryContents` if context search returns no results.
- Modifies `getDirectoryItem` to accept `repoID` instead of `query`.
- Updates `getDirectoryItem` to use directory basename as search query since contextSearch requires non-empty query.
- Adds revision support to context search.
- Adds repoName and revision to context item mentions.
- Reduces the number of entries fetched in `GetDirectoryContents` query to 15.
- Updates file pattern regex for root directory search in `getDirectoryMentions` to ensure it matches from the start of the string.
- Constructs URL with branch information in `getDirectoryMentions` if available.
- Fixes parsing of repo and branch from query string in `extractRepoAndBranch` function by using `indexOf` instead of `lastIndexOf` to find the `@` symbol.
This commit addresses issues in the remote directory search tests and improves URL generation for directory mentions.

- Updates test assertions to use the correct server endpoint from the mock configuration.
- Mocks `contextSearch` to return an empty array to trigger fallback in tests.
- Adds `repoID` to the data object in directory contents tests.
- Corrects the URL generation in `getDirectoryMentions` to use `/tree/` instead of `/blob/` for directory paths.
- Updates the file pattern regex in `searchFileMatches` to ensure it matches from the start of the string.
…d fuzzy matching

This commit introduces several enhancements to the remote directory search functionality, including improved branch handling and fuzzy matching for branch names, enhancing the user experience when searching for directories within specific branches.

- Implements branch suggestions when the user types `repo:` to show available branches.
- Adds fuzzy matching for branch names when the user types `repo:@query` to filter branches.
- Supports branch names with hyphens, slashes, and dots in `extractRepoAndBranch`.
- Improves the display of branch information in the mention menu, including help text indicating the source branch.
- Updates the mention menu to display the correct query label with branch information.
- Fixes the search query to include the branch name when searching for directories.
- Adds tests for branch name parsing and branch suggestions.
…rowsing

This commit introduces branch suggestions and enhances remote file browsing within OpenCtx, allowing users to easily navigate and search for files within specific branches of a repository.

- Adds `getBranchMentions` to fetch branch suggestions for a given repository.
- Implements branch filtering based on user input.
- Adds support for browsing files in a specific branch using the `repo@branch:path` format.
- Updates `createRepositoryMention` to include `defaultBranch` and `branches` in the mention data.
- Modifies `getFileMentions` to include an optional branch parameter in the search query.
- Creates `createBranchMentionsFromData` to create mention objects for branches.
- Introduces `parseRemoteQuery` to parse the query string and extract repository name, branch, and path components.
…ents

This commit enhances the remote directory search functionality by utilizing the `getDirectoryContents` API to fetch directory contents directly, improving performance and reliability.

- Replaces the previous implementation that relied on `contextSearch` and a fallback to `getDirectoryContents` with a direct call to `getDirectoryContents`.
- Adds a new `fetchContentFromRawURL` method to fetch the content of files from their raw URLs.
- Updates the `getDirectoryItem` function to use the `getDirectoryContents` API and `fetchContentFromRawURL` to retrieve file entries and their content.
- Filters out directories and binary files from the results.
- Exports `DirectoryContentsResponse` type.
- Updates the `GetDirectoryContents` query to include `rawURL` and `binary` fields.
Also sort directory listing by path name as it happens on online listing
@ichim-david ichim-david changed the title feat(context): Display branch name in mention menu help text feat(mention): allow selection of context from branches for remote files and directories Jul 30, 2025
@ichim-david ichim-david requested a review from julialeex July 30, 2025 20:30
@julialeex julialeex requested a review from peterguy July 30, 2025 20:53
@julialeex
Copy link
Contributor

Screenshot 2025-07-30 at 2 15 10 PM

Can we fix the %40 and change it to @?

path
byteSize
url
content
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to the blob comment

branchQuery,
})
} catch (error) {
// If the search fails, return empty array to fall back to client-side filtering
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can add debug logs here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the try catch, regarding logging, even in the previous remoteDirectorySearch we weren't logging but I guess we can add some

* Extracts repo name and optional branch from a string.
* Supports formats: "repo@branch", "repo", or "repo:directory@branch"
*/
export function extractRepoAndBranch(input: string): [string, string | undefined] {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removeFileSearch.ts has a similar logic for extracting the branch. Can we reuse the code somehow?


const url = `${serverEndpoint.replace(/\/$/, '')}${result.file.url}`
// Construct URL with branch information if available
const baseUrl = `${serverEndpoint.replace(/\/$/, '')}/${result.repository.name}`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

${serverEndpoint.replace(/\/$/, '')} is also being used at multiple places.


let content = ''
const rawContent = await graphqlClient.fetchContentFromRawURL(entry.rawURL)
if (!isError(rawContent)) {
Copy link
Contributor

@julialeex julialeex Jul 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to add some error handling here (ex: a warning/debug log)

} = await currentResolvedConfig()

const items: Item[] = []
for (const entry of entries) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can fetch all contents in parallel instead of sequentially here for better performance.

@julialeex julialeex requested a review from thenamankumar July 30, 2025 21:52
@julialeex
Copy link
Contributor

julialeex commented Jul 30, 2025

This PR is pretty big. I'm adding @thenamankumar to get a third pair of eyes 🙏

Thank you @ichim-david for implementing this!

Copy link
Contributor

@julialeex julialeex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve to unblock. Thanks David!

@ichim-david
Copy link
Contributor Author

@ichim-david
Copy link
Contributor Author

@peterguy I did as you've asked, I've removed the content and I am using rawURL to fetch the contents.
@thenamankumar maybe you would like to take a look at this as well since you worked on the remote file and directory ... some things can probably be abstracted further and code duplication be removed some more but I think this is a good start and it might be worth having in the pre-release

Copy link
Contributor

@peterguy peterguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My major concerns about fetching blob contents about were addressed

@ichim-david
Copy link
Contributor Author

@thenamankumar You're the only one left :) feel free to improve this code anyway you seem fit in order to get this merged, now that Julia and Peter have accepted the pr.
Do take a look at the video where I showcases how the flow is now of repo search -> branch select or search -> directory select or search

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants