|
| 1 | +# GitHub Environment Preparation |
| 2 | + |
| 3 | +This guide walks you through preparing your GitHub environment for MCPMark and authenticating the CLI tools with support for **token pooling** to mitigate rate limits. |
| 4 | + |
| 5 | +## 📋 **Table of Contents** |
| 6 | + |
| 7 | +<details> |
| 8 | +<summary><strong>🔍 Quick Overview - Click to Expand</strong></summary> |
| 9 | + |
| 10 | +### **Phase 1: GitHub Setup** |
| 11 | +- [1.1 Create GitHub Organization](#1--prepare-your-evaluation-organization) |
| 12 | +- [1.2 Generate a Personal Access Token (PAT)](#step-2-generate-fine-grained-personal-access-token-pat) |
| 13 | +- [1.3 Create Multiple GitHub Accounts (Recommended)](#step-3-create-multiple-github-accounts-recommended-for-rate-limit-relief) |
| 14 | +- [1.4 Configure Token Pooling](#step-4-configure-token-pooling-in-mcp_env) |
| 15 | + |
| 16 | +### **Phase 2: Repository State Setup** |
| 17 | +- [2.1 Download Sample Repositories](#2--download-the-sample-repository-state) |
| 18 | +- [2.2 Extract and Verify Structure](#quick-setup) |
| 19 | + |
| 20 | +### **Phase 3: Optional Customization** |
| 21 | +- [3.1 Add Custom Repositories](#3--add-new-repositories-optional) |
| 22 | +- [3.2 Update Configuration Files](#export-process) |
| 23 | + |
| 24 | +### **Phase 4: Understanding Limits** |
| 25 | +- [4.1 GitHub Rate Limits](#4--mitigating-github-rate-limits-with-token-pooling) |
| 26 | +- [4.2 Token Pooling Benefits](#-token-pooling-benefits) |
| 27 | + |
| 28 | +### **Phase 5: Verification & Troubleshooting** |
| 29 | +- [5.1 Quick Checklist](#-quick-checklist) |
| 30 | +- [5.2 Common Issues](#-troubleshooting) |
| 31 | + |
| 32 | +**Total Estimated Time**: 20-30 minutes |
| 33 | +**Difficulty Level**: ⭐⭐⭐☆☆ (Intermediate - requires multiple account setup) |
| 34 | + |
| 35 | +</details> |
| 36 | + |
| 37 | +--- |
| 38 | + |
| 39 | +## 🚨 Critical Requirements |
| 40 | + |
| 41 | +> **⚠️ IMPORTANT**: You must enable **ALL permissions** for both Repository and Organization access. Partial permissions can cause authentication failures. |
| 42 | +
|
| 43 | +## 1 · Prepare Your Evaluation Organization |
| 44 | + |
| 45 | +<<<<<<< fix/documentation |
| 46 | +### Step 1: Create a GitHub Organization |
| 47 | +- **Motivation**: Isolate benchmark repositories from your personal codebase. |
| 48 | +- **Action**: In GitHub, click your avatar → **Your organizations** → **New organization** |
| 49 | +- **Naming**: Choose a name such as `mcpleague-eval-xxx`, and check that it does not conflict with an existing organization name. |
| 50 | +- **Example**  |
| 51 | + |
| 52 | +### Step 2: Generate Fine-Grained Personal Access Token (PAT) |
| 53 | +- **Navigation**: Settings → Developer settings → Personal access tokens → Fine-grained tokens |
| 54 | +- **Click**: **Generate new token** |
| 55 | +- **Select Owner**: The organization you just created |
| 56 | +- **Name**: Use a descriptive name (e.g., *MCPLeague Eval Token*) |
| 57 | + |
| 58 | +#### 🔑 **CRITICAL PERMISSION SETTINGS** |
| 59 | +- **Repository permissions**: ✅ **Enable ALL permissions (Read and Write if possible)** |
| 60 | +- **Organization permissions**: ✅ **Enable ALL permissions (Read and Write if possible)** |
| 61 | +- **Copy the token and store it securely**: It is the value you will add to `GITHUB_TOKENS` in `.mcp_env` |
| 62 | + |
| 63 | +**Example** |
| 64 | +-  |
| 65 | +-  |
| 66 | + |
| 67 | +### Step 3: Create Multiple GitHub Accounts (Recommended for Rate Limit Relief) |
| 68 | +To effectively distribute API load and avoid rate limiting, we recommend creating **2-4 additional GitHub accounts**: |
| 69 | + |
| 70 | +#### **Account Setup Process** |
| 71 | +- **Create new accounts**: Name the accounts consistently, for example `your-name-eval-1`, `your-name-eval-2`, etc. |
| 72 | +- **Add to organization**: Make all accounts **Owners** of your evaluation organization |
| 73 | +- **Generate PATs**: Repeat the token generation process for each account |
| 74 | +- **Token naming**: Use descriptive names like *MCPMark Eval Token - Account 1* |
| 75 | + |
| 76 | +#### **Why Multiple Accounts?** |
| 77 | +- **Rate limit distribution**: Spread API requests across multiple tokens |
| 78 | +- **Automatic failover**: If one account hits limits, others continue working |
| 79 | +- **Performance boost**: 4 tokens = 4x capacity for API operations |
| 80 | + |
| 81 | +### Step 4: Configure Token Pooling in `.mcp_env` |
| 82 | +**File**: Edit (or create) the `.mcp_env` file in your project root |
| 83 | + |
| 84 | +#### **Multiple Tokens Configuration (Recommended)** |
| 85 | +```env |
| 86 | +## GitHub - Token Pooling Configuration |
| 87 | +GITHUB_TOKENS="token1,token2,token3,token4" |
| 88 | +GITHUB_EVAL_ORG="your-eval-org-name" |
| 89 | +``` |
| 90 | + |
| 91 | +#### **Single Token Configuration (Basic Setup)** |
| 92 | +```env |
| 93 | +## GitHub - Single Token Configuration |
| 94 | +GITHUB_TOKENS="your-single-token-here" |
| 95 | +GITHUB_EVAL_ORG="your-eval-org-name" |
| 96 | +``` |
| 97 | + |
| 98 | +#### **Important Configuration Notes** |
| 99 | +- **Token format**: Comma-separated tokens with no spaces |
| 100 | +- **Recommended count**: **2-4 tokens** for optimal rate limit distribution |
| 101 | +- **Permission consistency**: All tokens must have identical permissions on the evaluation organization |
| 102 | +- **Automatic rotation**: The system automatically rotates between tokens to distribute API load |
| 103 | +======= |
| 104 | +1. **Create a free GitHub Organization** |
| 105 | + - In GitHub, click your avatar → **Your organizations** → **New organization**. |
| 106 | + - We recommend a name like `mcpmark-eval-xxx`. (Check if there is a conflict with other organization names.) |
| 107 | + - This keeps all benchmark repositories isolated from your personal and work code. |
| 108 | + -  |
| 109 | + |
| 110 | +2. **Create Multiple GitHub Accounts (Recommended for Rate Limit Relief)** |
| 111 | + To effectively distribute API load and avoid rate limiting, we recommend creating **2-4 additional GitHub accounts**: |
| 112 | + - Create new GitHub accounts (e.g., `your-name-eval-1`, `your-name-eval-2`, etc.) |
| 113 | + - **Important**: Add all these accounts as **Owners** to your evaluation organization |
| 114 | + - This allows the token pooling system to distribute requests across multiple accounts |
| 115 | + |
| 116 | +3. **Generate Fine-Grained Personal Access Tokens (PATs) for Each Account** |
| 117 | + **Repeat this process for each GitHub account (including your main account):** |
| 118 | + - Navigate to *Settings → Developer settings → Personal access tokens → Fine-grained tokens* |
| 119 | + - Click **Generate new token**, select the evaluation organization you created |
| 120 | + - Give the token a descriptive name (e.g., *MCPMark Eval Token - Account 1*) |
| 121 | + - Under **Repository permissions** and **Organization permissions**, enable **All permissions** |
| 122 | + - Copy the generated token — you'll need all tokens for the next step |
| 123 | + -  |
| 124 | + -  |
| 125 | + |
| 126 | +4. **Configure Token Pooling in `.mcp_env`** |
| 127 | + In your project root, edit (or create) the `.mcp_env` file and add your tokens: |
| 128 | + |
| 129 | + **For multiple tokens (Recommended - helps with rate limits):** |
| 130 | + ```env |
| 131 | + ## GitHub - Token Pooling Configuration |
| 132 | + GITHUB_TOKENS="token1,token2,token3,token4" |
| 133 | + GITHUB_EVAL_ORG="your-eval-org-name" |
| 134 | + ``` |
| 135 | + |
| 136 | + **For single token (Basic setup):** |
| 137 | + ```env |
| 138 | + ## GitHub - Single Token Configuration |
| 139 | + GITHUB_TOKENS="your-single-token-here" |
| 140 | + GITHUB_EVAL_ORG="your-eval-org-name" |
| 141 | + ``` |
| 142 | +>>>>>>> main |
| 143 | +
|
| 144 | + **Important Notes:** |
| 145 | + - Replace `token1,token2,token3,token4` with your actual tokens (comma-separated, no spaces) |
| 146 | + - We recommend **2-4 tokens** for optimal rate limit distribution |
| 147 | + - All tokens must have the same permissions on the evaluation organization |
| 148 | + - The system automatically rotates between tokens to distribute API load |
| 149 | + |
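The token format rules above can be checked before a run. The following is a minimal sketch, not MCPMark's actual loader: `parse_token_pool` and `load_token_pool` are hypothetical helper names, and the sketch assumes the variables from `.mcp_env` have already been exported into the process environment.

```python
import os


def parse_token_pool(raw: str) -> list[str]:
    """Split a comma-separated GITHUB_TOKENS value into a list of tokens."""
    # Tolerate stray whitespace and surrounding quotes, even though the
    # guide asks for a plain comma-separated value with no spaces.
    tokens = [part.strip() for part in raw.strip().strip('"').split(",")]
    tokens = [token for token in tokens if token]
    if not tokens:
        raise ValueError("GITHUB_TOKENS is empty; add at least one token")
    if len(set(tokens)) != len(tokens):
        raise ValueError("GITHUB_TOKENS contains duplicate tokens")
    return tokens


def load_token_pool() -> list[str]:
    # Assumption: .mcp_env has already been loaded into os.environ.
    return parse_token_pool(os.environ.get("GITHUB_TOKENS", ""))
```

Rejecting duplicates is a design choice of this sketch: a token listed twice adds no extra rate-limit capacity, so it is more likely a copy-paste mistake than an intentional setting.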
| 150 | +--- |
| 151 | + |
| 152 | + |
| 153 | +## 2 · Download the Sample Repository State |
| 154 | + |
| 155 | +We have pre-exported several popular open-source repositories along with curated Issues and PRs. |
| 156 | + |
| 157 | +<<<<<<< fix/documentation |
| 158 | +### Quick Setup |
| 159 | +1. **Download**: Get the archive from [Google Drive](https://drive.google.com/your-link-here) |
| 160 | +2. **Extract**: Extract the zip file and place the `./github_state/` directory in your project root |
| 161 | + |
| 162 | + |
| 163 | +**Command**: |
| 164 | +```bash |
| 165 | +mkdir -p github_state |
| 166 | +unzip mcpleague_github_state.zip -d ./github_state |
| 167 | +``` |
| 168 | +======= |
| 169 | +1. Download the archive from [Google Drive](https://drive.google.com/drive/folders/16bFDjdtqJYzYJlqKcjKBGomo8DwOhWcN?usp=drive_link). |
| 170 | +2. Extract it so that the directory `./github_state/` appears in the project root: |
| 171 | + ```bash |
| 172 | + mkdir -p github_state |
| 173 | + unzip github_state.zip -d ./github_state |
| 174 | + ``` |
| 175 | +>>>>>>> main |
| 176 | +
|
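After extraction, it can help to confirm that `./github_state/` exists and contains one folder per exported repository. The sketch below is only an illustration: `list_exported_repos` is a hypothetical helper, and it assumes each repository state lives in its own subdirectory, as the `--out ./github_state/{your_repo_name}` convention in this guide suggests.

```python
from pathlib import Path


def list_exported_repos(state_dir: str = "./github_state") -> list[str]:
    """Return the names of the repository folders found under state_dir."""
    root = Path(state_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} not found; extract the archive first")
    # Only directories count; stray files such as the zip itself are ignored.
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())
```

An empty list usually means the archive was unpacked one level too deep or into a different directory.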
| 177 | +--- |
| 178 | + |
| 179 | +## 3 · Add New Repositories (Optional) |
| 180 | + |
| 181 | +If you want to benchmark additional repositories: |
| 182 | + |
| 183 | +### Export Process |
| 184 | +1. **Export repository state**: |
| 185 | + ```bash |
| 186 | + python -m src.mcp_services.github.repo_exporter --repo owner/name --out ./github_state/{your_repo_name} |
| 187 | + ``` |
| 188 | + |
| 189 | +2. **Update configuration**: |
| 190 | + - **File**: Open `src/mcp_services/github/state_manager.py` |
| 191 | + - **Action**: Add a new entry to `self.initial_state_mapping` pointing to the exported folder |
| 192 | + |
| 193 | +--- |
| 194 | + |
| 195 | +<<<<<<< fix/documentation |
| 196 | +## 4 · Mitigating GitHub Rate Limits with Token Pooling |
| 197 | + |
| 198 | +### 📊 **Understanding Rate Limits** |
| 199 | + |
| 200 | +Fine-grained tokens are subject to GitHub API rate limits: |
| 201 | + |
| 202 | +#### **Rate Limit Overview** |
| 203 | +- **Read operations**: 5,000 requests per hour per token |
| 204 | +- **General write operations**: 80 writes per minute and 500 writes per hour per token |
| 205 | +- **Content creation** (Issues, PRs, Comments): 500 requests per hour per token (Secondary Rate Limit) |
| 206 | + |
| 207 | +### 🚀 **Token Pooling Benefits** |
| 208 | + |
| 209 | +MCPMark automatically distributes requests across multiple tokens: |
| 210 | + |
| 211 | +- **Rate limit multiplication**: 4 tokens = 4x capacity |
| 212 | +- **Automatic failover**: If one token hits limits, others continue working |
| 213 | +- **Load balancing**: Rotates tokens for optimal performance |
| 214 | + |
| 215 | +### 📈 **Capacity Examples** |
| 216 | + |
| 217 | +- **Read operations**: 5,000 → 20,000 requests/hour (with 4 tokens) |
| 218 | +- **Content creation**: 500 → 2,000 requests/hour (with 4 tokens) |
| 219 | + |
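The capacity figures above are simple multiplication, which the sketch below spells out. It treats the result as an upper bound and assumes each token belongs to a separate account, as this guide recommends. `pooled_capacity` and `tasks_per_hour` are hypothetical helper names, and the 25 writes per task used in the usage note is an illustrative number, not a measured one.

```python
READS_PER_HOUR = 5_000            # per-token read limit quoted in this guide
CONTENT_CREATION_PER_HOUR = 500   # per-token content creation limit quoted in this guide


def pooled_capacity(per_token_limit: int, token_count: int) -> int:
    """Upper bound on hourly requests when load is spread evenly over the pool."""
    if token_count < 1:
        raise ValueError("token_count must be at least 1")
    return per_token_limit * token_count


def tasks_per_hour(writes_per_task: int, token_count: int) -> int:
    """Rough number of tasks per hour if each task creates writes_per_task items."""
    if writes_per_task < 1:
        raise ValueError("writes_per_task must be at least 1")
    return pooled_capacity(CONTENT_CREATION_PER_HOUR, token_count) // writes_per_task
```

For example, `pooled_capacity(READS_PER_HOUR, 4)` gives 20,000, and a task that creates 25 issues, PRs, or comments could run about `tasks_per_hour(25, 4)` = 80 times per hour with four tokens, compared with 20 times with one.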
| 220 | +### 💡 **Key Benefits** |
| 221 | + |
| 222 | +- **Faster evaluations**: Handle large task batches without hitting rate limits |
| 223 | +- **Reliable performance**: Automatic failover ensures continuous operation |
| 224 | +- **Scalable testing**: Run more frequent evaluations and larger test suites |
| 225 | + |
| 226 | +### ⚠️ **Repository Limits** |
| 227 | + |
| 228 | +**MCPMark caps each repository at ≤ 20 Issues and ≤ 10 PRs by default** to ensure reasonable evaluation times while staying within rate limits. |
| 229 | + |
| 230 | +--- |
| 231 | + |
| 232 | +## 🎯 **Quick Checklist** |
| 233 | + |
| 234 | +Before proceeding, ensure you have: |
| 235 | +- [ ] Created GitHub organization (`mcpleague-eval-xxx`) |
| 236 | +- [ ] Created 2-4 additional GitHub accounts for token pooling |
| 237 | +- [ ] Added all accounts as Owners to your evaluation organization |
| 238 | +- [ ] Generated PATs with **ALL permissions enabled** for each account |
| 239 | +- [ ] Added `GITHUB_TOKENS` and `GITHUB_EVAL_ORG` to `.mcp_env` |
| 240 | +- [ ] Downloaded and extracted `github_state/` directory |
| 241 | +- [ ] Verified network connectivity to `api.github.com` |
| 242 | + |
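The last two checklist items can be verified together by asking GitHub for each token's current quota through the REST API's `GET /rate_limit` endpoint. The sketch below uses only the standard library; `build_request`, `summarize_rate_limit`, and `check_token` are hypothetical helper names and are not part of MCPMark.

```python
import json
import urllib.request

API_URL = "https://api.github.com/rate_limit"


def build_request(token: str) -> urllib.request.Request:
    """Build an authenticated request for the rate-limit endpoint."""
    return urllib.request.Request(
        API_URL,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


def summarize_rate_limit(payload: dict) -> dict:
    """Extract the core REST quota from a /rate_limit response body."""
    core = payload["resources"]["core"]
    return {
        "limit": core["limit"],
        "remaining": core["remaining"],
        "reset": core["reset"],  # Unix timestamp at which the quota resets
    }


def check_token(token: str, timeout: float = 10.0) -> dict:
    # Performs a real network call; urllib raises HTTPError for a bad token
    # and URLError when api.github.com cannot be reached.
    with urllib.request.urlopen(build_request(token), timeout=timeout) as response:
        return summarize_rate_limit(json.load(response))
```

Calling `check_token(token)` for every entry in `GITHUB_TOKENS` shows whether each token authenticates and how much quota it has left before an evaluation starts.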
| 243 | +## 🆘 **Troubleshooting** |
| 244 | + |
| 245 | +### Common Issues |
| 246 | +- **Authentication failed**: Ensure **ALL permissions** are enabled (not just read) |
| 247 | +- **Token pooling not working**: Verify all tokens have identical permissions and are comma-separated |
| 248 | +- **Rate limit still hit**: Check that you have 2-4 tokens configured for optimal distribution |
| 249 | +- **Network timeout**: Check firewall settings or VPN configuration |
| 250 | +- **Rate limit exceeded**: Wait for the hourly limit to reset or add more tokens to your pool |
| 251 | +======= |
| 252 | +## 4 · GitHub Rate Limits & Token Pooling Benefits |
| 253 | + |
| 254 | +### Understanding Rate Limits |
| 255 | +Fine-grained tokens are subject to GitHub API rate limits: |
| 256 | +- **Read operations**: 5,000 requests per hour per token |
| 257 | +- **General write operations**: 80 writes per minute and 500 writes per hour per token |
| 258 | +- **Content creation (Issues, PRs, Comments)**: **500 requests per hour per token** (Secondary Rate Limit) |
| 259 | + |
| 260 | +### How Token Pooling Helps |
| 261 | +With **token pooling**, MCPMark automatically: |
| 262 | +- **Distributes requests** across multiple tokens to multiply your rate limits |
| 263 | +- **Rotates tokens** for each task execution to balance load |
| 264 | +- **Handles rate limit failures** by trying the next available token |
| 265 | +- **Ensures consistency** between agent execution and verification |
| 266 | + |
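The rotation and failover behavior listed above can be pictured as a round-robin pool. This is a minimal sketch under stated assumptions, not MCPMark's implementation: `TokenPool` and the `RateLimited` exception are hypothetical names, and a real client would detect rate limiting from the API response rather than from a custom exception.

```python
from itertools import cycle
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical signal that the current token has hit a rate limit."""


class TokenPool:
    def __init__(self, tokens: list[str]) -> None:
        if not tokens:
            raise ValueError("token pool must contain at least one token")
        self._size = len(tokens)
        self._cycle: Iterator[str] = cycle(tokens)

    def next_token(self) -> str:
        # Round-robin: each call hands out the next token in order.
        return next(self._cycle)

    def run(self, task: Callable[[str], T]) -> T:
        # Try each token at most once, moving on when one is rate limited.
        last_error = None
        for _ in range(self._size):
            token = self.next_token()
            try:
                return task(token)
            except RateLimited as exc:
                last_error = exc
        raise RuntimeError("all tokens in the pool are rate limited") from last_error
```

Limiting `run` to one attempt per token keeps a fully exhausted pool from looping forever; at that point the only remedies are waiting for the limits to reset or adding tokens.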
| 267 | +### Example: Rate Limit Multiplication |
| 268 | +**Read Operations:** |
| 269 | +- **Single token**: 5,000 requests/hour |
| 270 | +- **4 tokens**: ~20,000 requests/hour total capacity |
| 271 | + |
| 272 | +**Content Creation (Critical for MCPMark):** |
| 273 | +- **Single token**: 500 content creation requests/hour |
| 274 | +- **4 tokens**: ~2,000 content creation requests/hour total capacity |
| 275 | +- **Automatic failover**: If one token hits limits, others continue working |
| 276 | + |
| 277 | +This dramatically improves evaluation performance, especially for large task batches or frequent testing cycles. **The content creation limit is often the bottleneck**, making token pooling essential for efficient evaluations. |
| 278 | + |
| 279 | +### Repository Limits |
| 280 | +MCPMark places a cap on the number of PRs and issues (≤ 50 in total) per repository to ensure reasonable evaluation times and to stay within rate limits. |