-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add GitFile Component #5458
base: main
Are you sure you want to change the base?
feat: add GitFile Component #5458
Conversation
744298e
to
57f6ce0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take a look at the other Git component to check an example on how to use anyio.Path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this, @raphaelchristi !
@asynccontextmanager | ||
async def temp_git_repo(self): | ||
"""Async context manager for temporary git repository cloning.""" | ||
temp_dir = tempfile.mkdtemp() | ||
try: | ||
yield temp_dir | ||
finally: | ||
await asyncio.get_event_loop().run_in_executor(None, lambda: shutil.rmtree(temp_dir, ignore_errors=True)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asynccontextmanager | |
async def temp_git_repo(self): | |
"""Async context manager for temporary git repository cloning.""" | |
temp_dir = tempfile.mkdtemp() | |
try: | |
yield temp_dir | |
finally: | |
await asyncio.get_event_loop().run_in_executor(None, lambda: shutil.rmtree(temp_dir, ignore_errors=True)) | |
@asynccontextmanager | |
async def temp_git_repo(self): | |
"""Context manager for handling temporary clone directory.""" | |
temp_dir = None | |
try: | |
temp_dir = tempfile.mkdtemp(prefix="langflow_clone_") | |
yield temp_dir | |
finally: | |
if temp_dir: | |
await anyio.Path(temp_dir).rmdir() |
await asyncio.get_event_loop().run_in_executor( | ||
None, | ||
lambda: git.Repo.clone_from( | ||
self.repository_url, temp_dir, branch=self.branch, depth=1, single_branch=True | ||
), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not work. Use this instead:
await asyncio.get_event_loop().run_in_executor( | |
None, | |
lambda: git.Repo.clone_from( | |
self.repository_url, temp_dir, branch=self.branch, depth=1, single_branch=True | |
), | |
) | |
await asyncio.to_thread( | |
git.Repo.clone_from, | |
self.repository_url, | |
temp_dir, | |
branch=self.branch, | |
depth=1, | |
single_branch=True, | |
) |
if not self.repository_url: | ||
return [Data(data={"error": "Please enter a repository URL"})] | ||
if not self.branch or self.branch == "Enter repository URL first": | ||
return [Data(data={"error": "Please select a branch"})] | ||
if not self.selected_files: | ||
return [Data(data={"error": "Please select at least one file"})] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be exceptions.
await asyncio.get_event_loop().run_in_executor( | ||
None, | ||
lambda: git.Repo.clone_from( | ||
self.repository_url, temp_dir, branch=self.branch, depth=1, single_branch=True | ||
), | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never use asyncio.get_event_loop() because there may not be a running event loop which will raise an error. Always use asyncio.to_thread
.
- Replace asyncio.get_event_loop() with asyncio.to_thread() - Add proper error handling with GitFileError - Improve temporary directory cleanup - Make required fields explicit - Add type hints and docstrings - Enhance error messages and logging Fixes langflow-ai#5458
ddd877e
to
c3b8418
Compare
…agement - Introduced `anyio` for improved async file operations. - Updated temporary directory creation and cleanup logic for better error handling and efficiency. - Added a test case handling mechanism in the `__call__` method. - Improved binary file detection and content reading using `aiofiles`. - Enhanced error logging during temporary directory cleanup. These changes aim to streamline the GitFileComponent's functionality and improve overall performance.
- Introduced comprehensive unit tests for the GitFileComponent to validate its behavior. - Added tests for handling missing repository URL, branch, and selected files. - Implemented tests for binary and text file detection. - Created tests for retrieving branches and successfully getting file contents. - Utilized mocking to simulate Git operations and file handling. These tests aim to ensure the reliability and correctness of the GitFileComponent's functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Utilise real_time_refresh
Also utilise update_build_config to update the fields data instead of multiple refresh buttons.
CodSpeed Performance ReportMerging #5458 will improve performances by 13.74%Comparing Summary
Benchmarks breakdown
|
This pull request introduces a new GitFileComponent that allows users to analyze Git repositories and retrieve the content of selected files from specified branches. The component provides the following features: