Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent dc19950
commit b5defcd
Showing 9 changed files with 224 additions and 7 deletions.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,89 @@ | ||
# Contributing to DSPyGen | ||
|
||
Thank you for considering a contribution to DSPyGen! We appreciate the time and effort you put into improving the project, whether you fix bugs, add new features, or enhance existing functionality. | ||
|
||
## How to Contribute | ||
|
||
### Reporting Bugs | ||
|
||
If you find a bug, please report it by opening an issue on our GitHub repository. Make sure to include: | ||
|
||
- A clear and descriptive title. | ||
- A detailed description of the problem. | ||
- Steps to reproduce the issue. | ||
- Any relevant logs, screenshots, or code snippets. | ||
|
||
### Suggesting Enhancements | ||
|
||
If you have ideas for new features or enhancements, feel free to suggest them by opening an issue on our GitHub repository. Please include: | ||
|
||
- A clear and descriptive title. | ||
- A detailed description of the enhancement. | ||
- Any additional context or information that might be helpful. | ||
|
||
### Code Contributions | ||
|
||
We welcome code contributions! To contribute, please follow these steps: | ||
|
||
1. **Fork the Repository:** | ||
- Go to the [DSPyGen GitHub repository](https://github.com/seanchatmangpt/dspygen) and click on the "Fork" button. | ||
|
||
2. **Clone the Repository:** | ||
- Clone your forked repository to your local machine: | ||
```sh | ||
git clone https://github.com/your-username/dspygen.git | ||
cd dspygen | ||
``` | ||
|
||
3. **Create a Branch:** | ||
- Create a new branch for your feature or bugfix: | ||
```sh | ||
git checkout -b feature/your-feature-name | ||
``` | ||
|
||
4. **Make Changes:** | ||
- Make your changes in the codebase. Ensure that your code follows the project's coding standards and passes all tests. | ||
5. **Commit Changes:** | ||
- Commit your changes with a descriptive commit message: | ||
```sh | ||
git add . | ||
git commit -m "Add detailed description of your changes" | ||
``` | ||
6. **Push Changes:** | ||
- Push your changes to your forked repository: | ||
```sh | ||
git push origin feature/your-feature-name | ||
``` | ||
7. **Open a Pull Request:** | ||
- Go to the original [DSPyGen repository](https://github.com/seanchatmangpt/dspygen) and open a pull request. Provide a detailed description of your changes and reference any related issues. | ||
### Review Process | ||
- All pull requests will be reviewed by a maintainer. We may suggest changes or ask for additional information before merging. | ||
- Ensure your pull request passes all CI/CD checks and includes relevant tests. | ||
- Be responsive to feedback and make necessary changes promptly. | ||
### Coding Standards | ||
- Follow the existing code style and conventions. | ||
- Write clear, concise, and well-documented code. | ||
- Include tests for new features and bug fixes. | ||
- Update documentation as needed. | ||
### Community Guidelines | ||
- Be respectful and considerate in all communications. | ||
- Provide constructive feedback and be open to receiving it. | ||
- Collaborate and help others whenever possible. | ||
## Contact | ||
If you have any questions or need further assistance, feel free to reach out by opening an issue or contacting the maintainers. | ||
Thank you for contributing to DSPyGen! Together, we can create a powerful tool for AI development. | ||
--- | ||
For more information, visit our [GitHub repository](https://github.com/seanchatmangpt/dspygen). |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api" | |
|
||
[tool.poetry] # https://python-poetry.org/docs/pyproject/ | ||
name = "dspygen" | ||
version = "2024.5.16" | ||
version = "2024.5.29" | ||
description = "A Ruby on Rails style framework for the DSPy (Demonstrate, Search, Predict) project for Language Models like GPT, BERT, and LLama." | ||
authors = ["Sean Chatman <[email protected]>"] | ||
readme = "README.md" | ||
|
File renamed without changes.
Empty file.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
|
||
import subprocess | ||
import time | ||
from playwright.sync_api import sync_playwright | ||
|
||
def start_browser_with_debugging(): | ||
# Define the command to start the browser with remote debugging enabled | ||
chrome_path = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" | ||
user_data_dir = "/Users/sac/Library/Application Support/Google/Chrome/Profile 1" | ||
remote_debugging_port = 9222 | ||
command = [ | ||
chrome_path, | ||
f"--remote-debugging-port={remote_debugging_port}", | ||
f"--user-data-dir={user_data_dir}", | ||
] | ||
|
||
print(str(command)) | ||
|
||
# Start the browser process | ||
process = subprocess.Popen(command) | ||
return process | ||
|
||
|
||
def main(): | ||
"""Main function""" | ||
from dspygen.utils.dspy_tools import init_ol | ||
init_ol() | ||
|
||
# Start the browser | ||
browser_process = start_browser_with_debugging() | ||
|
||
# Give the browser some time to start | ||
time.sleep(5) | ||
|
||
# Connect Playwright to the running browser | ||
with sync_playwright() as p: | ||
browser = p.chromium.connect_over_cdp('http://localhost:9222') | ||
|
||
# Reuse the first open page in the default context, or open a new one | ||
context = browser.contexts[0] | ||
page = context.pages[0] if context.pages else context.new_page() | ||
|
||
# Navigate to a URL or perform any actions | ||
page.goto('https://example.com') | ||
|
||
# Example: Take a screenshot to verify the connection | ||
page.screenshot(path='example.png') | ||
|
||
# Close the page when it is no longer needed | ||
page.close() | ||
# Leave the default context open: over CDP it belongs to the external browser, and closing it may shut that browser down | ||
|
||
# Note: Do not close the browser as it is managed externally | ||
# browser.close() | ||
|
||
# Terminate the browser process if needed | ||
# browser_process.terminate() | ||
|
||
|
||
if __name__ == '__main__': | ||
main() |
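
The script above waits a fixed five seconds with `time.sleep(5)` before connecting. A possible alternative, sketched here and not part of this commit, is to poll the DevTools HTTP endpoint `/json/version` that Chrome serves on the remote debugging port until it answers. The helper name `wait_for_cdp` and its parameters are hypothetical; only the standard library is used.

```python
import json
import time
import urllib.request
from typing import Optional


def wait_for_cdp(port: int = 9222, timeout: float = 10.0,
                 interval: float = 0.25, host: str = "localhost") -> Optional[dict]:
    """Poll the DevTools /json/version endpoint until it answers.

    Returns the decoded JSON document on success, or None if the
    endpoint did not respond before `timeout` seconds elapsed.
    Hypothetical helper: name and parameters are illustrative only.
    """
    url = f"http://{host}:{port}/json/version"
    # Bypass any system proxy settings, since the endpoint is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=1.0) as response:
                return json.load(response)
        except (OSError, ValueError):
            # OSError covers refused connections and timeouts (URLError is
            # a subclass); ValueError covers a body that is not valid JSON.
            time.sleep(interval)
    return None
```

In `main()`, a call such as `wait_for_cdp(9222)` could stand in for the fixed sleep, with `connect_over_cdp` attempted only when the result is not `None`.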
66 changes: 66 additions & 0 deletions
src/dspygen/modules/extract_css_selectors_for_playwright_module.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
import dspy | ||
|
||
|
||
class ExtractCSSSelectorsForPlaywright(dspy.Signature): | ||
""" | ||
Extract a CSS selector from an HTML script for Playwright to spider the site. | ||
""" | ||
html_script = dspy.InputField(desc="HTML script of the web page.") | ||
why = dspy.InputField(desc="Reason or purpose for extracting the CSS selector.") | ||
what = dspy.InputField(desc="Element or attribute to target for CSS selector extraction.") | ||
how = dspy.InputField(desc="Specific conditions or rules to apply during extraction.") | ||
css_selector = dspy.OutputField(desc="CSS selector to be used by Playwright for web scraping.", | ||
prefix="```python\nselector = '",) | ||
|
||
|
||
class ExtractCSSSelectorsForPlaywrightModule(dspy.Module): | ||
"""DSPy module that predicts a CSS selector for Playwright from a page's HTML.""" | ||
|
||
def __init__(self, **forward_args): | ||
super().__init__() | ||
self.forward_args = forward_args | ||
self.output = None | ||
|
||
def forward(self, html_script, why, what, how): | ||
pred = dspy.Predict(ExtractCSSSelectorsForPlaywright) | ||
self.output = pred(html_script=html_script, why=why, what=what, how=how).css_selector | ||
return self.output | ||
|
||
|
||
def extract_css_selectors_for_playwright_call(html_script, why, what, how): | ||
extract_css_selectors_for_playwright = ExtractCSSSelectorsForPlaywrightModule() | ||
return extract_css_selectors_for_playwright.forward(html_script=html_script, why=why, what=what, how=how) | ||
|
||
|
||
def main(): | ||
from dspygen.utils.dspy_tools import init_dspy, init_ol | ||
|
||
init_dspy() | ||
# init_ol() | ||
html_script = """ | ||
<html> | ||
<head> | ||
<style> | ||
.main { color: red; } | ||
#unique { background-color: blue; } | ||
div > p { margin: 10px; } | ||
</style> | ||
</head> | ||
<body> | ||
<div class="main"> | ||
<p id="unique">Hello World!</p> | ||
</div> | ||
</body> | ||
</html> | ||
""" | ||
why = "To find the main content section for scraping." | ||
what = "div with class 'main'" | ||
how = "Select the first matching element." | ||
result = extract_css_selectors_for_playwright_call(html_script=html_script, why=why, what=what, how=how).split("'")[0] | ||
print(result) | ||
|
||
|
||
if __name__ == "__main__": | ||
main() |
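
`main()` above recovers the selector with `.split("'")[0]`, which relies on the completion continuing directly after the `selector = '` prefix and would cut short any selector that itself contains a single quote, such as `a[title='x']`. A slightly more tolerant parser might look like the following sketch. The function name `parse_selector` is hypothetical, and the completion shapes it handles are assumptions about what a language model may return, not observed output.

```python
import re


def parse_selector(completion: str) -> str:
    """Pull a CSS selector out of a model completion (illustrative sketch).

    Handles two assumed shapes:
      1. the model repeats the whole assignment: selector = '...'
      2. the model continues the prefix: ...' followed by a closing fence
    """
    text = completion.strip()

    # Shape 1: a full `selector = '...'` or `selector = "..."` assignment.
    # Limitation: a selector containing the same quote character that
    # delimits it would be cut at the first inner quote.
    match = re.search(r"selector\s*=\s*(['\"])(.*?)\1", text, flags=re.DOTALL)
    if match:
        return match.group(2).strip()

    # Shape 2: take the first line and trim a trailing fence and the
    # closing quote; quotes inside the selector are left untouched.
    first_line = text.splitlines()[0] if text else ""
    return first_line.rstrip("`").rstrip().rstrip("'\"").strip()


if __name__ == "__main__":
    print(parse_selector("div.main'\n```"))
    print(parse_selector("```python\nselector = 'div.main'\n```"))
```

Trimming only from the right end of the first line keeps attribute selectors with inner quotes intact, which splitting on the first quote does not.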