Adding CONTRIBUTING.md
seanchatmangpt committed May 29, 2024
1 parent dc19950 commit b5defcd
Showing 9 changed files with 224 additions and 7 deletions.
89 changes: 89 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,89 @@
# Contributing to DSPyGen

First of all, thank you for considering contributing to DSPyGen! We appreciate your time and effort in helping improve our project. By contributing, you can help enhance the functionality, fix bugs, and add new features to make DSPyGen even better.

## How to Contribute

### Reporting Bugs

If you find a bug, please report it by opening an issue on our GitHub repository. Make sure to include:

- A clear and descriptive title.
- A detailed description of the problem.
- Steps to reproduce the issue.
- Any relevant logs, screenshots, or code snippets.

### Suggesting Enhancements

If you have ideas for new features or enhancements, feel free to suggest them by opening an issue on our GitHub repository. Please include:

- A clear and descriptive title.
- A detailed description of the enhancement.
- Any additional context or information that might be helpful.

### Code Contributions

We welcome code contributions! To contribute, please follow these steps:

1. **Fork the Repository:**
   - Go to the [DSPyGen GitHub repository](https://github.com/seanchatmangpt/dspygen) and click on the "Fork" button.

2. **Clone the Repository:**
   - Clone your forked repository to your local machine:
     ```sh
     git clone https://github.com/your-username/dspygen.git
     cd dspygen
     ```

3. **Create a Branch:**
   - Create a new branch for your feature or bugfix:
     ```sh
     git checkout -b feature/your-feature-name
     ```

4. **Make Changes:**
   - Make your changes in the codebase. Ensure that your code follows the project's coding standards and passes all tests.

5. **Commit Changes:**
   - Commit your changes with a descriptive commit message:
     ```sh
     git add .
     git commit -m "Add detailed description of your changes"
     ```

6. **Push Changes:**
   - Push your changes to your forked repository:
     ```sh
     git push origin feature/your-feature-name
     ```

7. **Open a Pull Request:**
   - Go to the original [DSPyGen repository](https://github.com/seanchatmangpt/dspygen) and open a pull request. Provide a detailed description of your changes and reference any related issues.

### Review Process

- All pull requests will be reviewed by a maintainer. We may suggest changes or ask for additional information before merging.
- Ensure your pull request passes all CI/CD checks and includes relevant tests.
- Be responsive to feedback and make necessary changes promptly.

### Coding Standards

- Follow the existing code style and conventions.
- Write clear, concise, and well-documented code.
- Include tests for new features and bug fixes.
- Update documentation as needed.

### Community Guidelines

- Be respectful and considerate in all communications.
- Provide constructive feedback and be open to receiving it.
- Collaborate and help others whenever possible.

## Contact

If you have any questions or need further assistance, feel free to reach out by opening an issue or contacting the maintainers.

Thank you for contributing to DSPyGen! Together, we can create a powerful tool for AI development.

---

For more information, visit our [GitHub repository](https://github.com/seanchatmangpt/dspygen).
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "poetry.core.masonry.api"

 [tool.poetry] # https://python-poetry.org/docs/pyproject/
 name = "dspygen"
-version = "2024.5.16"
+version = "2024.5.29"
 description = "A Ruby on Rails style framework for the DSPy (Demonstrate, Search, Predict) project for Language Models like GPT, BERT, and LLama."
 authors = ["Sean Chatman <[email protected]>"]
 readme = "README.md"
File renamed without changes.
Empty file.
Binary file added src/dspygen/experiments/auto_spider/example.png
61 changes: 61 additions & 0 deletions src/dspygen/experiments/auto_spider/spider_main.py
@@ -0,0 +1,61 @@

import subprocess
import time
from playwright.sync_api import sync_playwright

def start_browser_with_debugging():
    # Define the command to start the browser with remote debugging enabled
    chrome_path = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
    user_data_dir = "/Users/sac/Library/Application Support/Google/Chrome/Profile 1"
    remote_debugging_port = 9222
    command = [
        chrome_path,
        f"--remote-debugging-port={remote_debugging_port}",
        f"--user-data-dir={user_data_dir}",
    ]

    print(str(command))

    # Start the browser process
    process = subprocess.Popen(command)
    return process


def main():
    """Main function"""
    from dspygen.utils.dspy_tools import init_ol
    init_ol()

    # Start the browser
    browser_process = start_browser_with_debugging()

    # Give the browser some time to start
    time.sleep(5)

    # Connect Playwright to the running browser
    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(f'http://localhost:9222')

        # Optionally, list all open pages
        context = browser.contexts[0]
        page = context.pages[0] if context.pages else context.new_page()

        # Navigate to a URL or perform any actions
        page.goto('https://example.com')

        # Example: Take a screenshot to verify the connection
        page.screenshot(path='example.png')

        # Close the page and context if you don't need them anymore
        page.close()
        context.close()

        # Note: Do not close the browser as it is managed externally
        # browser.close()

    # Terminate the browser process if needed
    # browser_process.terminate()


if __name__ == '__main__':
    main()
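
The script above waits a fixed five seconds before connecting over CDP. A possible alternative, sketched here under the assumption that Chrome listens on `localhost:9222` as in `spider_main.py`, is to poll the debugging port until it accepts a TCP connection. `wait_for_port` is a hypothetical helper for illustration only; it is not part of dspygen or Playwright.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False
```

In `main()`, a call such as `wait_for_port("localhost", 9222)` could stand in for `time.sleep(5)`, with the caller deciding what to do when it returns `False`. An open port only shows that the process is listening, so `connect_over_cdp` may still need its own error handling.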
66 changes: 66 additions & 0 deletions src/dspygen/modules/extract_css_selectors_for_playwright_module.py
@@ -0,0 +1,66 @@
import dspy


class ExtractCSSSelectorsForPlaywright(dspy.Signature):
    """
    Extract CSS selector from an HTML script for Playwright to spider the site.
    """
    html_script = dspy.InputField(desc="HTML script of the web page.")
    why = dspy.InputField(desc="Reason or purpose for extracting the CSS selector.")
    what = dspy.InputField(desc="Element or attribute to target for CSS selector extraction.")
    how = dspy.InputField(desc="Specific conditions or rules to apply during extraction.")
    css_selector = dspy.OutputField(desc="CSS selector to be used by Playwright for web scraping.",
                                    prefix="```python\nselector = '",)


class ExtractCSSSelectorsForPlaywrightModule(dspy.Module):
    """ExtractCSSSelectorsForPlaywrightModule"""

    def __init__(self, **forward_args):
        super().__init__()
        self.forward_args = forward_args
        self.output = None

    def forward(self, html_script, why, what, how):
        pred = dspy.Predict(ExtractCSSSelectorsForPlaywright)
        self.output = pred(html_script=html_script, why=why, what=what, how=how).css_selector
        return self.output


def extract_css_selectors_for_playwright_call(html_script, why, what, how):
    extract_css_selectors_for_playwright = ExtractCSSSelectorsForPlaywrightModule()
    return extract_css_selectors_for_playwright.forward(html_script=html_script, why=why, what=what, how=how)


def main():
    from dspygen.utils.dspy_tools import init_dspy, init_ol

    init_dspy()
    # init_ol()
    html_script = """
    <html>
    <head>
        <style>
            .main { color: red; }
            #unique { background-color: blue; }
            div > p { margin: 10px; }
        </style>
    </head>
    <body>
        <div class="main">
            <p id="unique">Hello World!</p>
        </div>
    </body>
    </html>
    """
    why = "To find the main content section for scraping."
    what = "div with class 'main'"
    how = "Select the first matching element."
    # The output prefix ends with an opening quote, so keep only the text before the closing quote
    result = extract_css_selectors_for_playwright_call(html_script=html_script, why=why, what=what, how=how).split("'")[0]
    print(result)


if __name__ == "__main__":
    main()
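
The `css_selector` field is primed with an output prefix ending in `selector = '`, and `main()` recovers the selector by cutting the completion at the first single quote. The sketch below shows one way that parsing step might be isolated, assuming the model either continues the prefix or echoes it back in full. `parse_selector` and `PREFIX` are hypothetical names introduced for illustration; they are not part of dspygen or DSPy.

```python
PREFIX = "selector = '"


def parse_selector(completion: str) -> str:
    """Extract the selector from a completion that continues (or echoes) the prefix."""
    text = completion.strip()
    # Some models repeat the prefix and code fence instead of continuing them.
    if PREFIX in text:
        text = text.split(PREFIX, 1)[1]
    # The selector ends at the closing single quote, if one is present.
    return text.split("'", 1)[0].strip()


print(parse_selector("div.main'\n```"))
print(parse_selector("```python\nselector = 'div.main'\n```"))
```

Like the original `split("'")[0]`, this sketch truncates selectors that themselves contain single quotes (for example, attribute selectors written with `'...'`), so it is only a starting point.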
5 changes: 3 additions & 2 deletions src/dspygen/rdddy/browser/browser_process_supervisor.py
@@ -11,7 +11,7 @@
 from dspygen.rdddy.browser.browser_worker import BrowserWorker


-os.environ["PLAYWRIGHT_BROWSER"] = "/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary"
+os.environ["PLAYWRIGHT_BROWSER"] = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"


 class BrowserProcessSupervisor(BaseActor):
@@ -91,10 +91,11 @@ async def handle_browser_status_event(self, event: BrowserStatusEvent):
 async def main():
     actor_system = ActorSystem()
     proc_supervisor = await actor_system.actor_of(BrowserProcessSupervisor)
-    browser_actor = await actor_system.actor_of(BrowserWorker)
+    browser_actor = await actor_system.actor_of(BrowserWorker, )

     # Start Chrome Browser
     await actor_system.publish(StartBrowserCommand())
+    # await actor_system.publish(Goto(url="https://www.google.com"))

     # await actor_system.publish(StopBrowserCommand())

8 changes: 4 additions & 4 deletions src/dspygen/rm/chatgpt_chromadb_retriever.py
@@ -259,11 +259,11 @@ def main():

     init_ol(model="phi3:medium", max_tokens=5000, timeout=500)

-    retriever = ChatGPTChromaDBRetriever(check_for_updates=True)
-    retriever._update_collection_metadata()
+    retriever = ChatGPTChromaDBRetriever(check_for_updates=False)
+    # retriever._update_collection_metadata()

-    query = "Fixed and running Tetris pygame"
-    matched_conversations = retriever.forward(query, k=5)
+    query = "YAML"
+    matched_conversations = retriever.forward(query, k=10, contains="interactible")
     # print(count_tokens(str(matched_conversations) + "\nI want a DSPy module that generates Python source code."))
     for conversation in matched_conversations:
         logger.info(conversation)
