Commit 90a28f4

Fix/documentation improve github setup documentation (#72)
1 parent b915345 commit 90a28f4

1,461 files changed

+374649
-78
lines changed

docs/setup/github-env-readme.md

Lines changed: 280 additions & 0 deletions
@@ -0,0 +1,280 @@
1+
# GitHub Environment Preparation
2+
3+
This guide walks you through preparing your GitHub environment for MCPMark and authenticating the CLI tools with support for **token pooling** to mitigate rate limits.
4+
5+
## 📋 **Table of Contents**
6+
7+
<details>
8+
<summary><strong>🔍 Quick Overview - Click to Expand</strong></summary>
9+
10+
### **Phase 1: GitHub Setup**
11+
- [1.1 Create GitHub Organization](#1--prepare-your-evaluation-organization)
12+
- [1.2 Create Multiple GitHub Accounts (Recommended)](#step-2-create-multiple-github-accounts-recommended)
13+
- [1.3 Generate Personal Access Tokens (PATs)](#step-3-generate-fine-grained-personal-access-tokens-pats)
14+
- [1.4 Configure Token Pooling](#step-4-configure-token-pooling-in-mcp_env)
15+
16+
### **Phase 2: Repository State Setup**
17+
- [2.1 Download Sample Repositories](#2--download-the-sample-repository-state)
18+
- [2.2 Extract and Verify Structure](#quick-setup)
19+
20+
### **Phase 3: Optional Customization**
21+
- [3.1 Add Custom Repositories](#3--add-new-repositories-optional)
22+
- [3.2 Update Configuration Files](#export-process)
23+
24+
### **Phase 4: Understanding Limits**
25+
- [4.1 GitHub Rate Limits](#4--mitigating-github-rate-limits-with-token-pooling)
26+
- [4.2 Token Pooling Benefits](#-token-pooling-benefits)
27+
28+
### **Phase 5: Verification & Troubleshooting**
29+
- [5.1 Quick Checklist](#-quick-checklist)
30+
- [5.2 Common Issues](#-troubleshooting)
31+
32+
**Total Estimated Time**: 20-30 minutes
33+
**Difficulty Level**: ⭐⭐⭐☆☆ (Intermediate - requires multiple account setup)
34+
35+
</details>
36+
37+
---
38+
39+
## 🚨 Critical Requirements
40+
41+
> **⚠️ IMPORTANT**: You must enable **ALL permissions** for both Repository and Organization access. Partial permissions can cause authentication failures.
42+
43+
## 1 · Prepare Your Evaluation Organization
44+
45+
<<<<<<< fix/documentation
46+
### Step 1: Create a GitHub Organization
47+
- **Motivation**: Isolate benchmark repositories from your personal codebase.
48+
- **Action**: In GitHub, click your avatar → **Your organizations** → **New organization**
49+
- **Naming**: Name your new organization (e.g., `mcpleague-eval-xxx`) and check that the name does not conflict with an existing organization.
50+
- **Example** ![Create Org](../../asset/github/github_create_org.png)
51+
52+
### Step 2: Generate Fine-Grained Personal Access Token (PAT)
53+
- **Navigation**: Settings → Developer settings → Personal access tokens → Fine-grained tokens
54+
- **Click**: **Generate new token**
55+
- **Select Owner**: The organization you just created
56+
- **Name**: Use a descriptive name (e.g., *MCPLeague Eval Token*)
57+
58+
#### 🔑 **CRITICAL PERMISSION SETTINGS**
59+
- **Repository permissions**: ✅ **Enable ALL permissions (Read and Write if possible)**
60+
- **Organization permissions**: ✅ **Enable ALL permissions (Read and Write if possible)**
61+
- **Copy and save your PAT securely**: This token is the value you will add to `GITHUB_TOKENS` in `.mcp_env`
62+
63+
**Example**
64+
- ![Create Token](../../asset/github/github_create_token.png)
65+
- ![Token Permissions](../../asset/github/github_token_permissions.png)
66+
67+
### Step 3: Create Multiple GitHub Accounts (Recommended for Rate Limit Relief)
68+
To effectively distribute API load and avoid rate limiting, we recommend creating **2-4 additional GitHub accounts**:
69+
70+
#### **Account Setup Process**
71+
- **Create new accounts**: Name the accounts something like `your-name-eval-1`, `your-name-eval-2`, etc.
72+
- **Add to organization**: Make all accounts **Owners** of your evaluation organization
73+
- **Generate PATs**: Repeat the token generation process for each account
74+
- **Token naming**: Use descriptive names like *MCPMark Eval Token - Account 1*
75+
76+
#### **Why Multiple Accounts?**
77+
- **Rate limit distribution**: Spread API requests across multiple tokens
78+
- **Automatic failover**: If one account hits limits, others continue working
79+
- **Higher throughput**: 4 tokens from 4 separate accounts give roughly 4x the combined rate-limit capacity for API operations
80+
81+
### Step 4: Configure Token Pooling in `.mcp_env`
82+
**File**: Edit (or create) the `.mcp_env` file in your project root
83+
84+
#### **Multiple Tokens Configuration (Recommended)**
85+
```env
## GitHub - Token Pooling Configuration
GITHUB_TOKENS="token1,token2,token3,token4"
GITHUB_EVAL_ORG="your-eval-org-name"
```
90+
91+
#### **Single Token Configuration (Basic Setup)**
92+
```env
## GitHub - Single Token Configuration
GITHUB_TOKENS="your-single-token-here"
GITHUB_EVAL_ORG="your-eval-org-name"
```
97+
98+
#### **Important Configuration Notes**
99+
- **Token format**: Comma-separated tokens with no spaces
100+
- **Recommended count**: **2-4 tokens** for optimal rate limit distribution
101+
- **Permission consistency**: All tokens must have identical permissions on the evaluation organization
102+
- **Automatic rotation**: The system automatically rotates between tokens to distribute API load
103+
=======
104+
1. **Create a free GitHub Organization**
105+
- In GitHub, click your avatar → **Your organizations** → **New organization**.
106+
- We recommend a name like `mcpmark-eval-xxx`. (Check if there is a conflict with other organization names.)
107+
- This keeps all benchmark repositories isolated from your personal and work code.
108+
- ![Create Org](../../asset/github/github_create_org.png)
109+
110+
2. **Create Multiple GitHub Accounts (Recommended for Rate Limit Relief)**
111+
To effectively distribute API load and avoid rate limiting, we recommend creating **2-4 additional GitHub accounts**:
112+
- Create new GitHub accounts (e.g., `your-name-eval-1`, `your-name-eval-2`, etc.)
113+
- **Important**: Add all these accounts as **Owners** to your evaluation organization
114+
- This allows the token pooling system to distribute requests across multiple accounts
115+
116+
3. **Generate Fine-Grained Personal Access Tokens (PATs) for Each Account**
117+
**Repeat this process for each GitHub account (including your main account):**
118+
- Navigate to *Settings → Developer settings → Personal access tokens → Fine-grained tokens*
119+
- Click **Generate new token**, select the evaluation organization you created
120+
- Give the token a descriptive name (e.g., *MCPMark Eval Token - Account 1*)
121+
- Under **Repository permissions** and **Organization permissions**, enable **All permissions**
122+
- Copy the generated token — you'll need all tokens for the next step
123+
- ![Create Token](../../asset/github/github_create_token.png)
124+
- ![Token Permissions](../../asset/github/github_token_permissions.png)
125+
126+
4. **Configure Token Pooling in `.mcp_env`**
127+
In your project root, edit (or create) the `.mcp_env` file and add your tokens:
128+
129+
**For multiple tokens (Recommended - helps with rate limits):**
130+
```env
## GitHub - Token Pooling Configuration
GITHUB_TOKENS="token1,token2,token3,token4"
GITHUB_EVAL_ORG="your-eval-org-name"
```
135+
136+
**For single token (Basic setup):**
137+
```env
## GitHub - Single Token Configuration
GITHUB_TOKENS="your-single-token-here"
GITHUB_EVAL_ORG="your-eval-org-name"
```
142+
>>>>>>> main
143+
144+
**Important Notes:**
145+
- Replace `token1,token2,token3,token4` with your actual tokens (comma-separated, no spaces)
146+
- We recommend **2-4 tokens** for optimal rate limit distribution
147+
- All tokens must have the same permissions on the evaluation organization
148+
- The system automatically rotates between tokens to distribute API load
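
To make the format and rotation notes above concrete, here is a minimal sketch of parsing a `GITHUB_TOKENS` value and cycling through the tokens round-robin. The `parse_tokens` and `TokenPool` names are hypothetical and are not MCPMark's actual implementation. Only the `GITHUB_TOKENS` variable and the comma-separated, no-spaces format come from this guide.

```python
from __future__ import annotations

import itertools


def parse_tokens(raw: str) -> list[str]:
    """Split a comma-separated GITHUB_TOKENS value and reject malformed input."""
    tokens = raw.split(",")
    for token in tokens:
        # The guide asks for no spaces, so empty or whitespace-containing
        # entries are treated as configuration errors.
        if not token or any(ch.isspace() for ch in token):
            raise ValueError("GITHUB_TOKENS must be comma-separated with no spaces")
    return tokens


class TokenPool:
    """Hypothetical round-robin pool; MCPMark's real rotation may differ."""

    def __init__(self, tokens: list[str]) -> None:
        if not tokens:
            raise ValueError("at least one token is required")
        self._cycle = itertools.cycle(tokens)

    def next_token(self) -> str:
        """Return the next token, wrapping around after the last one."""
        return next(self._cycle)
```

Calling `TokenPool(parse_tokens("token1,token2")).next_token()` repeatedly yields `token1`, `token2`, `token1`, and so on, while a value such as `"token1, token2"` is rejected because of the space.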
149+
150+
---
151+
152+
153+
## 2 · Download the Sample Repository State
154+
155+
We have pre-exported several popular open-source repositories along with curated Issues and PRs.
156+
157+
<<<<<<< fix/documentation
158+
### Quick Setup
159+
1. **Download**: Download the archive from [Google Drive](https://drive.google.com/your-link-here)
160+
2. **Extract**: Extract the zip file and place the `./github_state/` directory in your project root
161+
162+
163+
**Command**:
164+
```bash
mkdir -p github_state
unzip mcpleague_github_state.zip -d ./github_state
```
168+
=======
169+
1. Download the archive from [Google Drive](https://drive.google.com/drive/folders/16bFDjdtqJYzYJlqKcjKBGomo8DwOhWcN?usp=drive_link).
170+
2. Extract it so that the directory `./github_state/` appears in the project root:
171+
```bash
mkdir -p github_state
unzip github_state.zip -d ./github_state
```
175+
>>>>>>> main
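
After extracting, a quick check can confirm that `./github_state/` landed in the project root. The sketch below assumes the archive contains one subfolder per exported repository, which matches the `--out ./github_state/{your_repo_name}` convention used in this guide but is not otherwise confirmed. The `list_exported_repos` helper is hypothetical.

```python
from __future__ import annotations

from pathlib import Path


def list_exported_repos(state_dir: str = "./github_state") -> list[str]:
    """Return the sorted names of the subdirectories found under state_dir."""
    root = Path(state_dir)
    if not root.is_dir():
        raise FileNotFoundError(
            f"{root} not found; extract the archive into the project root"
        )
    # Assumption: each exported repository lives in its own subfolder.
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Running `list_exported_repos()` from the project root raises `FileNotFoundError` if the directory is missing, and an empty list suggests the archive was extracted to the wrong location.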
176+
177+
---
178+
179+
## 3 · Add New Repositories (Optional)
180+
181+
If you want to benchmark additional repositories:
182+
183+
### Export Process
184+
1. **Export repository state**:
185+
```bash
python -m src.mcp_services.github.repo_exporter --repo owner/name --out ./github_state/{your_repo_name}
```
188+
189+
2. **Update configuration**:
190+
- **File**: Open `src/mcp_services/github/state_manager.py`
191+
- **Action**: Add a new entry to `self.initial_state_mapping` pointing to the exported folder
192+
193+
---
194+
195+
<<<<<<< fix/documentation
196+
## 4 · Mitigating GitHub Rate Limits with Token Pooling
197+
198+
### 📊 **Understanding Rate Limits**
199+
200+
Fine-grained tokens are subject to GitHub API rate limits:
201+
202+
#### **Rate Limit Overview**
203+
- **Read operations**: 5,000 requests per hour per token
204+
- **General write operations**: 80 writes per minute and 500 writes per hour per token
205+
- **Content creation** (Issues, PRs, Comments): 500 requests per hour per token (Secondary Rate Limit)
206+
207+
### 🚀 **Token Pooling Benefits**
208+
209+
MCPMark automatically distributes requests across multiple tokens:
210+
211+
- **Rate limit multiplication**: 4 tokens = 4x capacity
212+
- **Automatic failover**: If one token hits limits, others continue working
213+
- **Load balancing**: Rotates tokens for optimal performance
214+
215+
### 📈 **Capacity Examples**
216+
217+
- **Read operations**: 5,000 → 20,000 requests/hour (with 4 tokens)
218+
- **Content creation**: 500 → 2,000 requests/hour (with 4 tokens)
219+
220+
### 💡 **Key Benefits**
221+
222+
- **Faster evaluations**: Handle large task batches without hitting rate limits
223+
- **Reliable performance**: Automatic failover ensures continuous operation
224+
- **Scalable testing**: Run more frequent evaluations and larger test suites
225+
226+
### ⚠️ **Repository Limits**
227+
228+
**MCPMark caps each repository at ≤ 20 Issues and ≤ 10 PRs by default** to ensure reasonable evaluation times while staying within rate limits.
229+
230+
---
231+
232+
## 🎯 **Quick Checklist**
233+
234+
Before proceeding, ensure you have:
235+
- [ ] Created GitHub organization (`mcpleague-eval-xxx`)
236+
- [ ] Created 2-4 additional GitHub accounts for token pooling
237+
- [ ] Added all accounts as Owners to your evaluation organization
238+
- [ ] Generated PATs with **ALL permissions enabled** for each account
239+
- [ ] Added `GITHUB_TOKENS` and `GITHUB_EVAL_ORG` to `.mcp_env`
240+
- [ ] Downloaded and extracted `github_state/` directory
241+
- [ ] Verified network connectivity to `api.github.com`
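
One way to cover the last two checklist items together is to query GitHub's `/rate_limit` endpoint with each token, which confirms both connectivity to `api.github.com` and that the token authenticates. This is a minimal sketch using only the Python standard library. The helper names are hypothetical and are not part of MCPMark.

```python
import json
import urllib.request

API_URL = "https://api.github.com/rate_limit"


def build_request(token: str) -> urllib.request.Request:
    """Build an authenticated request for the rate limit endpoint."""
    return urllib.request.Request(
        API_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )


def remaining_core_requests(token: str) -> int:
    """Return the remaining core API requests for this token (needs network)."""
    with urllib.request.urlopen(build_request(token), timeout=10) as resp:
        payload = json.load(resp)
    return payload["resources"]["core"]["remaining"]
```

An HTTP 401 error from `remaining_core_requests` points to an invalid or expired token, while a timeout points to a network or firewall problem.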
242+
243+
## 🆘 **Troubleshooting**
244+
245+
### Common Issues
246+
- **Authentication failed**: Ensure **ALL permissions** are enabled (not just read)
247+
- **Token pooling not working**: Verify all tokens have identical permissions and are comma-separated
248+
- **Rate limit still hit**: Check that you have 2-4 tokens configured for optimal distribution
249+
- **Network timeout**: Check firewall settings or VPN configuration
250+
- **Rate limit exceeded**: Wait for the hourly limit to reset or add more tokens to your pool
251+
=======
252+
## 4 · GitHub Rate Limits & Token Pooling Benefits
253+
254+
### Understanding Rate Limits
255+
Fine-grained tokens are subject to GitHub API rate limits:
256+
- **Read operations**: 5,000 requests per hour per token
257+
- **General write operations**: 80 writes per minute and 500 writes per hour per token
258+
- **Content creation (Issues, PRs, Comments)**: **500 requests per hour per token** (Secondary Rate Limit)
259+
260+
### How Token Pooling Helps
261+
With **token pooling**, MCPMark automatically:
262+
- **Distributes requests** across multiple tokens to multiply your rate limits
263+
- **Rotates tokens** for each task execution to balance load
264+
- **Handles rate limit failures** by trying the next available token
265+
- **Ensures consistency** between agent execution and verification
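
The failover behavior described above can be sketched roughly as follows. This illustrates the general idea only and is not MCPMark's code: the `RateLimited` exception and `call_with_failover` function are hypothetical, and the `request` callable stands in for whatever function performs the API call with a given token.

```python
from __future__ import annotations

from typing import Callable


class RateLimited(Exception):
    """Raised by the request function when a token has hit its rate limit."""


def call_with_failover(tokens: list[str], request: Callable[[str], str]) -> str:
    """Try each token in order and return the first successful result."""
    last_error: Exception | None = None
    for token in tokens:
        try:
            return request(token)
        except RateLimited as exc:
            # This token is exhausted for now, so move on to the next one.
            last_error = exc
    raise RuntimeError("all tokens are rate limited") from last_error
```

Only rate-limit errors trigger a retry with the next token here. Any other exception propagates immediately, so a permission problem is not hidden by the rotation.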
266+
267+
### Example: Rate Limit Multiplication
268+
**Read Operations:**
269+
- **Single token**: 5,000 requests/hour
270+
- **4 tokens**: ~20,000 requests/hour total capacity
271+
272+
**Content Creation (Critical for MCPMark):**
273+
- **Single token**: 500 content creation requests/hour
274+
- **4 tokens**: ~2,000 content creation requests/hour total capacity
275+
- **Automatic failover**: If one token hits limits, others continue working
276+
277+
This dramatically improves evaluation performance, especially for large task batches or frequent testing cycles. **The content creation limit is often the bottleneck**, making token pooling essential for efficient evaluations.
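
The capacity figures above are simple multiplication, shown here as a small worked example. It assumes each token belongs to a separate GitHub account, as recommended earlier in this guide, because tokens from the same account share that account's limits. The constant and function names are illustrative.

```python
READS_PER_HOUR = 5_000             # per-token read limit quoted in this guide
CONTENT_CREATION_PER_HOUR = 500    # per-token content creation limit quoted in this guide


def pooled_capacity(per_token_limit: int, token_count: int) -> int:
    """Upper bound on hourly requests when load is spread across all tokens."""
    return per_token_limit * token_count


print(pooled_capacity(READS_PER_HOUR, 4))             # 20000
print(pooled_capacity(CONTENT_CREATION_PER_HOUR, 4))  # 2000
```

These are upper bounds. Actual throughput depends on how evenly requests are distributed across the tokens.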
278+
279+
### Repository Limits
280+
MCPMark places a cap on the number of PRs and issues (≤ 50 in total) per repository to ensure reasonable evaluation times and to stay within rate limits.
