Merged
39 commits
bdbadfa
Add FAQ automation system with GitHub Actions integration
frederick-douglas-pearce Oct 18, 2025
2e223aa
Fix minsearch version requirement (0.0.7 instead of 0.4.1)
frederick-douglas-pearce Oct 19, 2025
786d850
Configure setuptools to only package faq_automation
frederick-douglas-pearce Oct 19, 2025
2f5f40e
Change default model to gpt-5-nano for structured output support
frederick-douglas-pearce Oct 20, 2025
c882962
Remove test artifacts and add *.egg-info/ to gitignore
frederick-douglas-pearce Oct 20, 2025
2502ad7
Add *.egg-info/ to gitignore
frederick-douglas-pearce Oct 20, 2025
daea8fa
Add contribution banner to course pages
frederick-douglas-pearce Oct 21, 2025
789404f
Increase contribution banner font size to match questions
frederick-douglas-pearce Oct 21, 2025
eab1597
Update issue template with dropdown for course selection
frederick-douglas-pearce Oct 21, 2025
c52a3a9
Add manual trigger for testing FAQ automation workflow
frederick-douglas-pearce Oct 21, 2025
d38e9c1
Fix issue number handling for manual workflow triggers
frederick-douglas-pearce Oct 22, 2025
a010417
Fix: Create FAQ bot branches from main instead of current branch
frederick-douglas-pearce Oct 22, 2025
e7f7faa
docs: Update Development section with correct uv commands and API key…
frederick-douglas-pearce Oct 23, 2025
efbad4d
fix: Correct uv commands in Makefile and improve issue body parsing
frederick-douglas-pearce Oct 24, 2025
69f4f97
docs: Update README - remove Quick Start and correct LLM to GPT-5
frederick-douglas-pearce Oct 24, 2025
05e3ec2
docs: Update issue links to DataTalksClub/faq repository
frederick-douglas-pearce Oct 24, 2025
b0a5f32
docs: Remove GitHub Discussions reference from Support section
frederick-douglas-pearce Oct 24, 2025
adafd4c
docs: Fix escaped backticks in CONTRIBUTING.md example code blocks
frederick-douglas-pearce Oct 24, 2025
797d741
docs: Fix nested code blocks using 4 backticks for outer block
frederick-douglas-pearce Oct 24, 2025
a5566c5
docs: Update testing documentation links for consistency
frederick-douglas-pearce Oct 24, 2025
df1aa9f
docs: Remove outer code block from example to render link correctly
frederick-douglas-pearce Oct 24, 2025
b652fc2
docs: Add FAQ automation tests to tests/README.md
frederick-douglas-pearce Oct 24, 2025
ca8de38
refactor: Organize FAQ automation tests into classes
frederick-douglas-pearce Oct 24, 2025
e7f6212
docs: Add FAQ automation examples to test method examples
frederick-douglas-pearce Oct 24, 2025
00f2782
docs: Update test commands to match Makefile and remove --extra dev
frederick-douglas-pearce Oct 24, 2025
1032a77
Fix OpenAI API syntax to match notebook implementation
frederick-douglas-pearce Oct 24, 2025
574af38
Replace JS parsing logic with Python in FAQ automation workflow
frederick-douglas-pearce Oct 24, 2025
ce8925a
Replace bash scripting with Python for GitHub Actions outputs
frederick-douglas-pearce Oct 24, 2025
c605fc6
Update workflow to use modern uv sync command
frederick-douglas-pearce Oct 24, 2025
7966260
Simplify FAQ automation workflow by removing --course argument
frederick-douglas-pearce Oct 25, 2025
08d614a
Fix workflow to use uv run for Python commands
frederick-douglas-pearce Oct 25, 2025
e98b56b
Remove dead code: parse_issue_body() and its tests
frederick-douglas-pearce Oct 25, 2025
b1728e7
Add answer content example to FAQ file format section
frederick-douglas-pearce Oct 25, 2025
1341bef
Add course field to FAQ example in contributing guide
frederick-douglas-pearce Oct 25, 2025
c6ff515
Update test documentation to reflect full test suite scope
frederick-douglas-pearce Oct 25, 2025
ad44dc7
Add comprehensive integration tests for FAQ automation workflow
frederick-douglas-pearce Oct 25, 2025
f203777
Add auto-close for FAQ issues when PR is merged
frederick-douglas-pearce Oct 26, 2025
d9e1e82
docs: Document auto-close behavior for FAQ issues when PRs merge
frederick-douglas-pearce Oct 26, 2025
56a99c3
refactor: Remove manual workflow trigger to simplify FAQ automation
frederick-douglas-pearce Oct 26, 2025
64 changes: 64 additions & 0 deletions .github/ISSUE_TEMPLATE/faq-proposal.yml
@@ -0,0 +1,64 @@
name: FAQ Proposal
description: Propose a new FAQ entry or update to existing FAQ
title: "[FAQ] "
labels: ["faq-proposal"]
body:
  - type: markdown
    attributes:
      value: |
        ## FAQ Proposal

        Thank you for helping improve our FAQ! Please provide a clear question and answer below.

        Our FAQ bot will automatically analyze your proposal and determine if it should:
        - Create a new FAQ entry
        - Update an existing FAQ entry
        - Mark as duplicate of an existing FAQ

  - type: dropdown
    id: course
    attributes:
      label: Course
      description: Which course is this FAQ for?
      options:
        - machine-learning-zoomcamp
        - data-engineering-zoomcamp
        - llm-zoomcamp
        - mlops-zoomcamp
    validations:
      required: true

  - type: textarea
    id: question
    attributes:
      label: Question
      description: What is the FAQ question?
      placeholder: "How do I install the required dependencies?"
    validations:
      required: true

  - type: textarea
    id: answer
    attributes:
      label: Answer
      description: What is the answer to this question?
      placeholder: |
        To install the required dependencies, run:
        ```bash
        uv pip install -r requirements.txt
        ```
    validations:
      required: true

  - type: checkboxes
    id: checklist
    attributes:
      label: Checklist
      description: Please confirm the following
      options:
        - label: I have searched existing FAQs and this question is not already answered
          required: true
        - label: The answer provides accurate, helpful information
          required: true
        - label: I have included any relevant code examples or links
          required: false
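
Reviewer sketch (not part of the diff): GitHub renders a submitted issue form as a body of `### <label>` headings, each followed by the submitted value. The workflow below hands that whole body to the CLI, so the parser here is only an illustration of the shape the form produces. The function name `parse_issue_form` and the `FIELD_LABELS` tuple are hypothetical and do not come from this PR.

```python
import re

# Labels copied from the issue form above; each becomes a "### <label>" heading.
FIELD_LABELS = ("Course", "Question", "Answer", "Checklist")


def parse_issue_form(body: str) -> dict[str, str]:
    """Split a rendered issue-form body into {lowercased label: text}."""
    pattern = re.compile(
        r"^###[ \t]+(" + "|".join(map(re.escape, FIELD_LABELS)) + r")[ \t]*$",
        re.MULTILINE,
    )
    matches = list(pattern.finditer(body))
    fields: dict[str, str] = {}
    for index, match in enumerate(matches):
        # A field's text runs until the next known heading or the end of the body.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(body)
        fields[match.group(1).lower()] = body[match.end():end].strip()
    return fields


sample = """### Course

machine-learning-zoomcamp

### Question

How do I run the tests for this project?

### Answer

To run all tests, use `make test`.
"""

fields = parse_issue_form(sample)
print(fields["course"])
```

Only headings that match a known label start a new field, so a `###` heading inside the answer text stays part of the answer.
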
149 changes: 149 additions & 0 deletions .github/workflows/faq-automation.yml
@@ -0,0 +1,149 @@
name: FAQ Automation

on:
  issues:
    types: [opened]

permissions:
  contents: write
  issues: write
  pull-requests: write

jobs:
  process-faq-proposal:
    if: contains(github.event.issue.labels.*.name, 'faq-proposal')
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.13'

      - name: Install uv
        uses: astral-sh/setup-uv@v4

      - name: Install dependencies
        run: |
          uv sync --no-dev

      - name: Fetch issue body
        id: fetch_issue
        uses: actions/github-script@v7
        with:
          script: |
            const body = context.payload.issue.body || '';

            // Save issue body to file for Python to parse
            const fs = require('fs');
            fs.writeFileSync('/tmp/issue_body.txt', body);

            console.log('Fetched issue body, saved to /tmp/issue_body.txt');

      - name: Process FAQ with AI
        id: process
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          # Run FAQ automation CLI with full issue body
          uv run python -m faq_automation.cli \
            --issue-body "$(cat /tmp/issue_body.txt)" \
            --issue-number ${{ github.event.issue.number }} \
            --model "gpt-5-nano" \
            --output-dir /tmp

          # Write decision output to GitHub Actions
          uv run scripts/write_faq_decision_output.py

      - name: Handle NEW or UPDATE action
        if: fromJson(steps.process.outputs.decision).action != 'DUPLICATE'
        uses: actions/github-script@v7
        with:
          script: |
            const decision = ${{ steps.process.outputs.decision }};
            const action = decision.action;
            const issueNumber = decision.issue_number;
            const prBody = decision.pr_body;
            const filePath = decision.file_path;

            // Create branch name
            const branchName = `faq-bot/issue-${issueNumber}`;

            // Configure git
            await exec.exec('git', ['config', 'user.name', 'FAQ Bot']);
            await exec.exec('git', ['config', 'user.email', '[email protected]']);

            // Fetch and checkout main, then create new branch from it
            await exec.exec('git', ['fetch', 'origin', 'main']);
            await exec.exec('git', ['checkout', 'main']);
            await exec.exec('git', ['checkout', '-b', branchName]);

            // Add modified files
            await exec.exec('git', ['add', filePath]);

            // Commit changes
            const commitMsg = `${action}: ${decision.decision.question.substring(0, 72)}`;
            await exec.exec('git', ['commit', '-m', commitMsg]);

            // Push branch
            await exec.exec('git', ['push', 'origin', branchName]);

            // Create pull request
            const { data: pr } = await github.rest.pulls.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `[FAQ Bot] ${action}: ${decision.decision.question.substring(0, 72)}`,
              head: branchName,
              base: 'main',
              body: `${prBody}\n\nCloses #${issueNumber}`
            });

            // Comment on issue with PR link
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: issueNumber,
              body: `✅ FAQ ${action} proposal created in PR #${pr.number}\n\nPlease review and approve the changes.`
            });

      - name: Handle DUPLICATE action
        if: fromJson(steps.process.outputs.decision).action == 'DUPLICATE'
        uses: actions/github-script@v7
        with:
          script: |
            const decision = ${{ steps.process.outputs.decision }};
            const comment = decision.comment;
            const issueNumber = decision.issue_number;

            // Post comment
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: issueNumber,
              body: comment
            });

            // Close issue
            await github.rest.issues.update({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: issueNumber,
              state: 'closed',
              state_reason: 'completed'
            });

      - name: Handle errors
        if: failure()
        uses: actions/github-script@v7
        with:
          script: |
            const issueNumber = context.payload.issue.number;
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: issueNumber,
              body: `❌ FAQ Bot encountered an error processing this proposal.\n\nPlease check the [workflow run](${context.payload.repository.html_url}/actions/runs/${context.runId}) for details.\n\nA maintainer will review this manually.`
            });
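
Reviewer sketch (not part of the diff): the later steps read `steps.process.outputs.decision`, which implies that `scripts/write_faq_decision_output.py` writes a `decision` output. That script is not shown in this view, so the following is only a guess at the mechanism, assuming the decision is serialized as single-line JSON and appended to the file named by `GITHUB_OUTPUT`. The function name and the sample field values are hypothetical; the field names mirror those the workflow reads.

```python
import json
import tempfile
from pathlib import Path


def write_decision_output(decision: dict, output_path: str) -> None:
    """Append the decision as a single-line `decision=<json>` step output."""
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    payload = json.dumps(decision)
    with open(output_path, "a", encoding="utf-8") as handle:
        handle.write(f"decision={payload}\n")


# On a runner the target would be the path in the GITHUB_OUTPUT environment
# variable; a temporary file stands in for it here.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "github_output"
    write_decision_output(
        {
            "action": "NEW",
            "issue_number": 42,
            "file_path": "_questions/example-course/general/example.md",
            "pr_body": "line one\nline two",
        },
        str(target),
    )
    written = target.read_text(encoding="utf-8")

print(written, end="")
```

Keeping the payload on one line avoids the delimiter syntax that multiline outputs require, and it lets `fromJson(...)` in the step conditions parse the value directly.
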
4 changes: 3 additions & 1 deletion .gitignore
@@ -2,4 +2,6 @@
_site
__pycache__
.envrc
.ipynb_checkpoints/
.ipynb_checkpoints/
CLAUDE.md
*.egg-info/
162 changes: 162 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,162 @@
# Contributing to DataTalks.Club FAQ

Thank you for your interest in contributing to the DataTalks.Club FAQ! This guide will help you understand how to propose new FAQ entries or updates.

## Proposing a New FAQ Entry

We have an automated system that helps maintain the FAQ repository. Here's how to propose a new FAQ entry:

### 1. Create a New Issue

1. Go to the [FAQ Proposal form](https://github.com/DataTalksClub/faq/issues/new?template=faq-proposal.yml)
2. Fill out the form:
   - **Course**: Which course this FAQ is for (e.g., `machine-learning-zoomcamp`)
   - **Question**: The FAQ question
   - **Answer**: A clear, helpful answer with examples if applicable
   - **Checklist**: Tick the required confirmation boxes

### 2. Automated Processing

Once you submit your issue, our FAQ Bot will automatically:

1. **Analyze your proposal** using AI to compare it with existing FAQs
2. **Make a decision**:
   - **NEW**: Create a new FAQ entry if the question isn't covered
   - **UPDATE**: Update an existing FAQ if your proposal adds valuable context
   - **DUPLICATE**: Mark as duplicate if the question is already fully answered
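
As a rough illustration of how the three decisions map to the follow-ups described in the next section, here is a minimal sketch. The `Action` enum and `next_step` function are hypothetical names used for illustration only and are not taken from the bot's code.

```python
from enum import Enum


class Action(str, Enum):
    NEW = "NEW"
    UPDATE = "UPDATE"
    DUPLICATE = "DUPLICATE"


def next_step(action: str) -> str:
    """Map the bot's decision to the follow-up a contributor should expect."""
    parsed = Action(action)  # raises ValueError for any other value
    if parsed is Action.DUPLICATE:
        return "comment-and-close"
    # NEW and UPDATE both end in a pull request for maintainers to review.
    return "open-pull-request"


for name in ("NEW", "UPDATE", "DUPLICATE"):
    print(name, "->", next_step(name))
```
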

### 3. What Happens Next

#### For NEW or UPDATE Decisions

- A Pull Request will be automatically created with the proposed changes
- The PR will include:
  - The new or modified FAQ file(s)
  - Explanation of why this action was chosen
  - Section placement and reasoning
- A maintainer will review the PR
- Once approved and merged, your FAQ contribution will be live!
- The originating issue will be automatically closed when the PR is merged

#### For DUPLICATE Decisions

- The bot will comment on your issue with:
  - Explanation of why it's considered a duplicate
  - Link to the existing FAQ that covers your question
  - Link to the source file
- The issue will be automatically closed
- If you believe this is incorrect, you can reopen and mention a maintainer

## Writing Good FAQ Entries

### Question Guidelines

- Be specific and clear
- Use the actual words students might search for
- Start with question words (How, What, When, Why, etc.)
- Examples:
  - ✅ "How do I install Python dependencies using uv?"
  - ❌ "Dependencies"

### Answer Guidelines

- Start with a direct answer
- Include code examples when relevant
- Add links to documentation or resources
- Keep it concise but complete
- Use markdown formatting for readability

**Example:**

### Course
machine-learning-zoomcamp

### Question
How do I run the tests for this project?

### Answer
To run all tests, use:

```bash
make test
```

For unit tests only:

```bash
make test-unit
```

For integration tests only:

```bash
make test-int
```

See the [testing documentation](tests/README.md) for more details.

## Manual Contributions

If you prefer to contribute directly via Pull Request:

1. Fork the repository
2. Create a new branch: `git checkout -b faq/your-topic`
3. Add your FAQ file in the appropriate location:
   - `_questions/{course}/{section}/{NNN}_{id}_{slug}.md`
4. Follow the frontmatter format:

```markdown
---
id: abc123
question: 'Your question here?'
sort_order: 10
---

Your answer content here.
```

5. Update `_questions/{course}/_metadata.yaml` if adding a new section
6. Run tests: `make test`
7. Create a Pull Request

## FAQ File Structure

FAQ files are organized as:

```
_questions/
├── machine-learning-zoomcamp/
│   ├── _metadata.yaml          # Course configuration
│   ├── general/
│   │   ├── 001_abc123_when-does-course-start.md
│   │   └── 002_def456_what-are-prerequisites.md
│   └── module-1/
│       ├── 001_xyz789_install-docker.md
│       └── 002_uvw456_docker-errors.md
└── data-engineering-zoomcamp/
    └── ...
```
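
To make the naming pattern concrete, here is a minimal sketch that builds a path of the form `_questions/{course}/{section}/{NNN}_{id}_{slug}.md`. It assumes `NNN` is a zero-padded sequence number and that the slug is a lowercased, dash-separated form of a short title. Both are assumptions inferred from the example tree, and `slugify` and `faq_path` are hypothetical helpers rather than functions from this repository.

```python
import re
from pathlib import PurePosixPath


def slugify(text: str) -> str:
    """Lowercase, replace runs of non-alphanumerics with '-', trim dashes."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def faq_path(course: str, section: str, number: int, faq_id: str, title: str) -> PurePosixPath:
    """Build the repository-relative path for one FAQ file."""
    # Assumption: NNN is the sequence number padded to three digits.
    filename = f"{number:03d}_{faq_id}_{slugify(title)}.md"
    return PurePosixPath("_questions") / course / section / filename


path = faq_path("machine-learning-zoomcamp", "module-1", 1, "xyz789", "Install Docker")
print(path)  # _questions/machine-learning-zoomcamp/module-1/001_xyz789_install-docker.md
```
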

### Frontmatter Fields

- **id** (required): Unique 10-character identifier
- **question** (required): The FAQ question
- **sort_order** (required): Integer for ordering within section
- **images** (optional): Array of image objects for embedded images
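
Before opening a pull request, a quick local check for the required fields can catch mistakes early. The sketch below uses only the standard library and handles flat `key: value` frontmatter; it does not parse nested values such as the `images` array, and it is not the project's own validation. The helper names are hypothetical.

```python
REQUIRED_FIELDS = ("id", "question", "sort_order")


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter delimited by '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("file must start with a '---' line")
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, separator, value = line.partition(":")
        if separator:
            # Drop surrounding whitespace and simple quoting around the value.
            fields[key.strip()] = value.strip().strip("'\"")
    raise ValueError("frontmatter is not closed by a second '---' line")


def missing_fields(fields: dict[str, str]) -> list[str]:
    """Return the required field names that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]


example = """---
id: abc123
question: 'Your question here?'
sort_order: 10
---

Your answer content here.
"""

parsed = read_frontmatter(example)
print(missing_fields(parsed))
```
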

### Markdown Content

- Write the answer in markdown
- Use code blocks with language specifications
- Include links where helpful
- Keep formatting clean and readable

## Questions or Issues?

If you have questions about contributing or encounter issues with the FAQ Bot:

1. Check existing issues for similar questions
2. Create a new issue with the "question" or "bug" label
3. Tag a maintainer if urgent

Thank you for helping improve the DataTalks.Club FAQ! 🎉