# Contributing Guide
Thank you for your interest in contributing to mcpbr! We welcome contributions from everyone.
Found a bug? Open an issue with:
- Clear description of the problem
- Steps to reproduce
- Expected vs actual behavior
- Your environment (OS, Python version, Docker version)
- Relevant logs or error messages
Have an idea? Open a feature request with:
- Clear description of the feature
- Use case or problem it solves
- Proposed implementation (optional)
- Examples or mockups (if applicable)
Documentation improvements are always welcome:
- Fix typos or clarify confusing sections
- Add examples or tutorials
- Improve API documentation
- Translate documentation (future)
Fix bugs, implement features, or improve performance:
- Check good first issues
- Look at help wanted issues
- Review the roadmap
1. Fork and clone the repository:

   ```bash
   git clone https://github.com/YOUR_USERNAME/mcpbr.git
   cd mcpbr
   ```

2. Create a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. Install development dependencies:

   ```bash
   pip install -e ".[dev]"
   ```

4. Verify installation:

   ```bash
   pytest -m "not integration"  # Run unit tests
   ```
See Development Setup for detailed instructions.
```bash
git checkout -b feature/my-awesome-feature
# OR
git checkout -b fix/issue-123-bug-description
```

Branch Naming:
- `feature/` - New features
- `fix/` - Bug fixes
- `docs/` - Documentation changes
- `refactor/` - Code refactoring
- `test/` - Test additions or fixes
Code Style:
- Follow PEP 8
- Use type hints
- Write docstrings (Google style)
- Keep functions focused and small
Example:
```python
def calculate_improvement(mcp_rate: float, baseline_rate: float) -> float:
    """Calculate percentage improvement of MCP over baseline.

    Args:
        mcp_rate: MCP agent resolution rate (0.0 to 1.0)
        baseline_rate: Baseline agent resolution rate (0.0 to 1.0)

    Returns:
        Percentage improvement (e.g., 0.6 for 60% improvement)

    Raises:
        ValueError: If baseline_rate is 0
    """
    if baseline_rate == 0:
        raise ValueError("Baseline rate cannot be zero")
    return (mcp_rate - baseline_rate) / baseline_rate
```

Required:
- Unit tests for new functions/classes
- Integration tests for new features
- Update existing tests if behavior changes
```python
# tests/test_reporting.py
import pytest


def test_calculate_improvement():
    """Test improvement calculation."""
    assert calculate_improvement(0.32, 0.20) == 0.6
    assert calculate_improvement(0.20, 0.20) == 0.0
    with pytest.raises(ValueError):
        calculate_improvement(0.32, 0.0)
```

Run tests:
```bash
# Run all tests
pytest

# Run only unit tests
pytest -m "not integration"

# Run with coverage
pytest --cov=src/mcpbr --cov-report=html
```

```bash
# Check for issues
ruff check src/

# Auto-fix issues
ruff check --fix src/

# Format code
ruff format src/
```

- Add docstrings to new functions/classes
- Update README if adding new features
- Add examples to relevant docs
- Update CLI help text if changing commands
Commit Message Format:

```
type(scope): short description

Longer description if needed

Fixes #123
```
Types:

- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation changes
- `test`: Test additions or fixes
- `refactor`: Code refactoring
- `perf`: Performance improvements
- `chore`: Build/tooling changes
Examples:
```bash
git commit -m "feat(benchmarks): add MCPToolBench++ support"
git commit -m "fix(docker): handle missing pre-built images gracefully"
git commit -m "docs(wiki): add contributing guide"
```

Push your branch:

```bash
git push origin feature/my-awesome-feature
```

Then open a pull request on GitHub with:
- Clear description of changes
- Link to related issues (Fixes #123)
- Screenshots/examples if applicable
- Checklist of completed items
Before submitting your PR, ensure:
- Code follows project style (ruff passes)
- All tests pass (including new tests)
- Documentation updated if needed
- Commit messages follow format
- PR description is clear and complete
- Branch is up-to-date with main
- No merge conflicts
- Changes are focused (one feature/fix per PR)
New to the project? Start here:
- Add CSV output format
- Improve error messages
- Add CLI help examples
- Fix typos in documentation
- Add YAML output format
- Implement config validation
- Add shell completion scripts
- Create benchmark comparison report
- Implement tool coverage analysis
- Add performance profiling
- Create interactive config wizard
- Add new benchmark support
See the full list: Good First Issues
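As a taste of what the first item above might involve, here is a hedged sketch of a CSV output formatter using only the standard library. The field names (`task_id`, `resolved`, `runtime_s`) are illustrative assumptions, not mcpbr's actual result schema:

```python
import csv
import io


def results_to_csv(results: list[dict]) -> str:
    """Render a list of result rows as CSV text (illustrative fields)."""
    fieldnames = ["task_id", "resolved", "runtime_s"]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    for row in results:
        writer.writerow(row)
    return buf.getvalue()
```

A real implementation would plug into `reporting.py` and derive the columns from the existing result model rather than hard-coding them.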
Understanding the codebase:
```
src/mcpbr/
├── cli.py            # Command-line interface
├── config.py         # Configuration models
├── models.py         # Model registry
├── providers.py      # LLM provider abstractions
├── harnesses.py      # Agent harness implementations
├── benchmarks/       # Benchmark abstraction layer
│   ├── __init__.py   # Registry and factory
│   ├── base.py       # Benchmark protocol
│   ├── swebench.py   # SWE-bench implementation
│   ├── cybergym.py   # CyberGym implementation
│   └── terminalbench.py  # TerminalBench implementation
├── harness.py        # Main orchestrator
├── docker_env.py     # Docker environment management
├── evaluation.py     # Patch application and testing
├── log_formatter.py  # Log formatting
└── reporting.py      # Output formatting
```
See Architecture for detailed explanation.
```
tests/
├── test_config.py       # Configuration tests
├── test_benchmarks.py   # Benchmark tests
├── test_docker_env.py   # Docker tests
├── test_evaluation.py   # Evaluation tests
├── test_reporting.py    # Reporting tests
├── test_cli.py          # CLI tests
└── test_integration.py  # Integration tests
```
DO:
- Test one thing per test function
- Use descriptive test names
- Include edge cases and error conditions
- Use fixtures for common setup
- Mock external dependencies (API calls, Docker)
DON'T:
- Test implementation details
- Write flaky tests
- Skip cleanup (use fixtures with teardown)
- Commit tests that depend on external state
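To illustrate the "mock external dependencies" guidance above, here is a minimal stdlib-only sketch. The `fetch_resolution_rate` helper and its client API are hypothetical, not part of mcpbr; in the real suite the mock would typically live in a pytest fixture so its teardown runs automatically:

```python
from unittest.mock import MagicMock


def fetch_resolution_rate(client) -> float:
    """Toy helper that would normally hit an expensive external API."""
    return client.get_rate()


def test_fetch_resolution_rate():
    """One behavior per test: the rate is passed through unchanged."""
    fake_client = MagicMock()  # no real API call is made
    fake_client.get_rate.return_value = 0.32
    assert fetch_resolution_rate(fake_client) == 0.32
    fake_client.get_rate.assert_called_once()
```

Because the test never touches the network or Docker, it stays fast, deterministic, and independent of external state.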
```python
@pytest.mark.integration  # Requires Docker, API keys
@pytest.mark.slow         # Takes >10 seconds
@pytest.mark.unit         # Fast, no external dependencies
```

Run specific markers:

```bash
pytest -m "not integration"  # Skip integration tests
pytest -m slow               # Only slow tests
```

- Correctness: Does it work? Are there bugs?
- Tests: Are there sufficient tests?
- Style: Does it follow project conventions?
- Documentation: Is it well documented?
- Performance: Are there performance issues?
- Security: Are there security concerns?
- Be receptive to feedback
- Ask questions if unclear
- Make requested changes promptly
- Mark conversations as resolved when addressed
- Thank reviewers for their time
- Initial review: Usually within 2-3 days
- Follow-up reviews: 1-2 days
- Merge: When all feedback addressed and approved
When contributing, keep these principles in mind:
- Use the simplest solution that works
- Don't over-engineer
- Prefer composition over inheritance
- Use protocols for abstraction
- Make it easy to add new benchmarks/providers
- Design for pluggability
- Clear error messages
- Helpful documentation
- Sensible defaults
- Progressive disclosure
- Minimize Docker overhead
- Use async where appropriate
- Cache when possible
- Profile before optimizing
- Handle errors gracefully
- Validate inputs
- Provide meaningful feedback
- Clean up resources
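The "use protocols for abstraction" and "design for pluggability" principles above can be sketched with `typing.Protocol`. The method set below is an assumption for illustration, not the actual interface in `benchmarks/base.py`:

```python
from typing import Protocol


class Benchmark(Protocol):
    """Structural interface: any class with these members qualifies."""

    name: str

    def load_tasks(self) -> list[str]: ...


class ToyBenchmark:
    """A new benchmark plugs in without inheriting from anything."""

    name = "toy"

    def load_tasks(self) -> list[str]:
        return ["task-1", "task-2"]


def count_tasks(benchmark: Benchmark) -> int:
    """The orchestrator depends only on the protocol, not the class."""
    return len(benchmark.load_tasks())
```

Because the protocol is structural, adding a benchmark means writing one new class and registering it; no existing code needs to change.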
We are committed to providing a welcoming and inclusive environment. Please read our Code of Conduct.
- Be respectful: Treat everyone with respect
- Be constructive: Provide actionable feedback
- Be patient: Remember everyone is learning
- Be inclusive: Welcome newcomers
- Be transparent: Communicate openly
- 💬 Discussions - Ask questions, share ideas
- 💡 [Discord/Slack] (Coming soon) - Real-time chat
- 📧 Email: [project email] (for sensitive matters)
Contributors are recognized in:
- README contributors section
- Release notes
- GitHub profile contributions
- Special thanks in major releases
Top contributors may be invited to join the core team!
By contributing, you agree that your contributions will be licensed under the MIT License.
Still have questions? We're here to help!
- 💬 Ask in Discussions
- 📧 Contact maintainers
- 📖 Read the FAQ
Thank you for contributing to mcpbr! 🎉