|
1 | 1 | # Email Processing Module
|
2 | 2 |
|
3 | 3 | ## Overview
|
4 |
| -Handles IMAP email retrieval and routing for two inboxes: |
5 |
| -1. `pacenotefoo@caf-gpt.com` - Routes to Pace Notes system |
6 |
| -2. `policyfoo@caf-gpt.com` - Routes to Policy Foo system |
| 4 | +Handles IMAP email retrieval and routing for two specific folders in ProtonMail: |
| 5 | +1. `CAF-GPT/PaceNoteFoo` - Routes to Pace Notes system |
| 6 | +2. `CAF-GPT/PolicyFoo` - Routes to Policy Foo system |
7 | 7 |
|
8 | 8 | ## Architecture
|
9 | 9 |
|
10 | 10 | ### Data Flow
|
11 |
| -0. Initialization |
| 11 | +1. Initialization |
12 | 12 | - On startup: Connect to IMAP
|
13 |
| - - Reload all unread messages into queue |
14 |
| - - Maintain original received order |
| 13 | + - Select specific mailboxes for processing |
| 14 | + - Load configuration from environment |
| 15 | + - Setup logging based on development mode |
15 | 16 |
|
16 |
| -1. Queue Management |
17 |
| - - Thread-safe in-memory storage using Python's deque |
18 |
| - - Cold start protection via IMAP reload |
19 |
| - - No persistent storage of queue state |
20 |
| - - Lock-based concurrency control |
| 17 | +2. Connection Management |
| 18 | + - Single IMAP connection with health monitoring |
| 19 | + - Automatic reconnection with exponential backoff |
| 20 | + - Connection status tracking |
| 21 | + - Clean error handling |
21 | 22 |
|
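The reconnection behaviour listed under Connection Management can be sketched as follows. This is a minimal illustration, not the module's actual code: the names `backoff_delays` and `connect_with_backoff`, the 1 s base delay, the 1 hour cap (taken from the removed text), and the limit of five attempts are all assumptions.

```python
import time
from typing import Callable, Iterator, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 1.0, cap: float = 3600.0) -> Iterator[float]:
    """Yield base, 2*base, 4*base, ... and never more than `cap` seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * 2, cap)


def connect_with_backoff(
    connect: Callable[[], T],
    max_attempts: int = 5,  # hypothetical limit, not taken from the README
    sleep: Callable[[float], None] = time.sleep,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
) -> T:
    """Call `connect` until it succeeds, sleeping between failed attempts."""
    delays = backoff_delays()
    last_error: Optional[BaseException] = None
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except retry_on as exc:
            last_error = exc
            # Do not sleep after the final attempt; fail straight away.
            if attempt < max_attempts:
                sleep(next(delays))
    raise ConnectionError(f"gave up after {max_attempts} attempts") from last_error
```

With the standard library client, the callable might be `lambda: imaplib.IMAP4(host, port)`. Socket-level failures surface as `OSError`; `imaplib.IMAP4.error` is a plain `Exception` subclass, so it would have to be added to `retry_on` if protocol errors should also trigger a retry.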
22 |
| -2. Processing Flow |
23 |
| - - Emails added to queue as received |
24 |
| - - **Routing Logic**: |
25 |
| - - Add `system` metadata key based on `To:` field: |
26 |
| - - `pacenotefoo@caf-gpt.com` → `pace_notes` |
27 |
| - - `policyfoo@caf-gpt.com` → `policy_foo` |
28 |
| - - Unknown recipients are logged and skipped |
29 |
| - - Async processing loop with health checks |
30 |
| - - Success confirmation required before next item |
31 |
| - - Mark email as read after processing |
32 |
| - - Failed emails are retried with exponential backoff |
| 23 | +3. Email Processing |
| 24 | + - Continuous processing loop |
| 25 | + - Mailbox-specific routing (pace_notes/policy_foo) |
| 26 | + - Mark-as-read confirmation |
| 27 | + - Error handling with logging |
33 | 28 |
|
34 |
| -## Technical Implementation |
| 29 | +### Implementation Details |
35 | 30 |
|
36 |
| -### Environment Variables |
37 |
| -``` |
38 |
| -EMAIL_HOST=127.0.0.1 |
| 31 | +#### Configuration |
| 32 | +```python |
| 33 | +# Environment Variables |
| 34 | +EMAIL_HOST=100.99.136.75 |
39 | 35 | EMAIL_PASSWORD=****
|
40 | 36 | IMAP_PORT=1143
|
41 | 37 | SMTP_PORT=1025
|
42 |
| -``` |
43 |
| - |
44 |
| -### System Design |
45 |
| -- **Queue Characteristics**: |
46 |
| - - Pure Python deque with maxlen=100 |
47 |
| - - Thread-safe operations |
48 |
| - - Messages stored as EmailMessage objects |
49 |
| - - Order preserved from IMAP UID sequence |
50 | 38 |
|
51 |
| -- **Connection Management**: |
52 |
| - - Single IMAP connection with health monitoring |
53 |
| - - Automatic reconnection with exponential backoff |
54 |
| - - Connection status exposed via health check |
55 |
| - |
56 |
| -- **Health Monitoring**: |
57 |
| - - Queue statistics (size, processing state) |
58 |
| - - Connection health (status, retry count) |
59 |
| - - Integrated with FastAPI health check endpoint |
| 39 | +# Hardcoded Mailboxes |
| 40 | +MAILBOXES = { |
| 41 | + "pace_notes": "CAF-GPT/PaceNoteFoo", |
| 42 | + "policy_foo": "CAF-GPT/PolicyFoo" |
| 43 | +} |
| 44 | +``` |
60 | 45 |
|
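One way to read the environment variables above into a typed object is sketched below. The `EmailConfig` dataclass and `load_config` function are hypothetical names, and falling back to ports 1143 and 1025 when the variables are unset is an assumption based on the values shown in the README.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Mailbox mapping copied from the README.
MAILBOXES = {
    "pace_notes": "CAF-GPT/PaceNoteFoo",
    "policy_foo": "CAF-GPT/PolicyFoo",
}


@dataclass(frozen=True)
class EmailConfig:
    host: str
    password: str
    imap_port: int
    smtp_port: int


def load_config(env: Mapping[str, str] = os.environ) -> EmailConfig:
    """Build an EmailConfig from environment-style key/value pairs."""
    # Host and password have no sensible default, so fail early if absent.
    missing = [key for key in ("EMAIL_HOST", "EMAIL_PASSWORD") if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return EmailConfig(
        host=env["EMAIL_HOST"],
        password=env["EMAIL_PASSWORD"],
        imap_port=int(env.get("IMAP_PORT", "1143")),  # assumed default
        smtp_port=int(env.get("SMTP_PORT", "1025")),  # assumed default
    )
```

Passing the mapping as a parameter keeps the function easy to exercise without touching the real process environment.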
61 |
| -### Message Parsing |
62 |
| -- **Headers Extracted**: |
63 |
| - - `From`: Sender's email address |
64 |
| - - `To`: Recipient address(es) |
65 |
| - - `Subject`: Email subject line |
66 |
| - - `Date`: Received timestamp |
67 |
| -- **Body Handling**: |
68 |
| - - Only process `text/plain` content |
69 |
| - - Ignore HTML and attachments |
| 46 | +### Mailbox Handling |
| 47 | +- **Folder Selection**: Explicitly select each mailbox before processing |
| 48 | +- **Folder Switching**: Switch between mailboxes during processing loop |
| 49 | +- **Folder Monitoring**: Track last processed message for each folder |
| 50 | +- **Error Handling**: Handle folder access errors gracefully |
70 | 51 |
|
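The folder selection and switching described under Mailbox Handling might look roughly like this with the standard library `imaplib` client. It is a sketch under assumptions: `process_mailboxes` and the `handlers` mapping are hypothetical, unread messages are found with an `UNSEEN` search, and a message is only flagged as read after its handler returns without raising.

```python
import logging
from typing import Callable, Dict

log = logging.getLogger(__name__)

# Mailbox mapping copied from the README.
MAILBOXES = {
    "pace_notes": "CAF-GPT/PaceNoteFoo",
    "policy_foo": "CAF-GPT/PolicyFoo",
}


def process_mailboxes(conn, handlers: Dict[str, Callable[[bytes], None]]) -> int:
    """Visit each folder once and return how many messages were handled.

    `conn` is expected to behave like a logged-in imaplib.IMAP4 object.
    """
    processed = 0
    for system, folder in MAILBOXES.items():
        status, _ = conn.select(folder)
        if status != "OK":
            log.warning("cannot select folder %s, skipping", folder)
            continue
        status, data = conn.uid("search", None, "UNSEEN")
        if status != "OK" or not data or not data[0]:
            continue  # nothing unread in this folder
        for uid in data[0].split():
            # BODY.PEEK[] fetches the raw message without setting \Seen.
            status, parts = conn.uid("fetch", uid, "(BODY.PEEK[])")
            if status != "OK" or not parts or not isinstance(parts[0], tuple):
                log.warning("fetch failed for UID %r in %s", uid, folder)
                continue
            raw_message = parts[0][1]
            try:
                handlers[system](raw_message)
            except Exception:
                # Leave the message unread so a later pass can retry it.
                log.exception("handler failed for UID %r in %s", uid, folder)
                continue
            conn.uid("store", uid, "+FLAGS", "(\\Seen)")
            processed += 1
    return processed
```

Because the message stays unread when its handler fails, the next pass over the folder picks it up again, which is one simple way to get retry behaviour without extra state.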
71 | 52 | ### Error Handling
|
72 |
| -- **IMAP Connection Failures**: |
73 |
| - - Exponential backoff (1s to 1 hour) |
74 |
| - - Health status monitoring |
75 |
| - - Automatic reconnection attempts |
| 53 | +- Connection failures with backoff |
| 54 | +- Folder access errors with logging |
| 55 | +- Graceful shutdown on interrupts |
| 56 | +- Development mode detailed logging |
76 | 57 |
|
77 |
| -- **Processing Errors**: |
78 |
| - - Failed messages marked for retry |
79 |
| - - Retry count tracking |
80 |
| - - Error reason logging |
| 58 | +### Health Monitoring |
| 59 | +- **Connection status**: |
| 60 | + - Last successful connection time |
| 61 | + - Connection error count |
| 62 | + - Current connection state |
| 63 | +- **Queue statistics**: |
| 64 | + - Current queue size |
| 65 | + - Messages processed |
| 66 | + - Messages failed |
| 67 | + - Average processing time |
| 68 | +- **System metrics**: |
| 69 | + - CPU/memory usage |
| 70 | + - Thread count |
| 71 | + - Active connections |
| 72 | +- **Alerting**: |
| 73 | + - Email processing failures |
| 74 | + - Queue capacity warnings |
| 75 | + - Connection errors |
81 | 76 |
|
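The connection and queue counters listed under Health Monitoring could be collected in a small status object such as the one below. The `HealthStatus` class and its field names are hypothetical, and the system metrics (CPU, memory, thread count) and alerting are left out because the README does not say how they are gathered.

```python
import time
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class HealthStatus:
    connected: bool = False
    last_connected_at: Optional[float] = None  # Unix timestamp
    connection_errors: int = 0
    queue_size: int = 0
    processed: int = 0
    failed: int = 0
    total_processing_seconds: float = 0.0

    def record_connected(self) -> None:
        self.connected = True
        self.last_connected_at = time.time()

    def record_connection_error(self) -> None:
        self.connected = False
        self.connection_errors += 1

    def record_success(self, seconds: float) -> None:
        self.processed += 1
        self.total_processing_seconds += seconds

    def record_failure(self) -> None:
        self.failed += 1

    @property
    def average_processing_seconds(self) -> float:
        if self.processed == 0:
            return 0.0
        return self.total_processing_seconds / self.processed

    def as_dict(self) -> Dict[str, Any]:
        """Return a plain dict, e.g. for a JSON health endpoint."""
        data = asdict(self)
        data["average_processing_seconds"] = self.average_processing_seconds
        return data
```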
82 |
| -- **Queue Management**: |
83 |
| - - Full queue handling (drop new messages) |
84 |
| - - Thread-safe operations |
| 77 | +### Queue Implementation |
| 78 | +- **Thread-safe in-memory storage** using Python's deque |
| 79 | +- **Max capacity**: 100 messages |
| 80 | +- **Message ordering**: Preserved from IMAP UID sequence |
| 81 | +- **Retry mechanism**: |
| 82 | + - Failed messages are requeued |
| 83 | + - Exponential backoff between retries |
| 84 | + - Max retry attempts: 5 |
| 85 | +- **Message tracking**: |
| 86 | + - UID-based message identification |
85 | 87 | - Processing state tracking
|
86 |
| - |
87 |
| -### Integration |
88 |
| -- Async startup/shutdown methods |
89 |
| -- Health check status reporting |
90 |
| -- Background processing loop |
91 |
| -- Clean process lifecycle management |
| 88 | + - Error history for failed messages |
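The queue described above can be sketched with a lock-protected `deque`. This is an illustration under assumptions rather than the module's implementation: `QueuedMessage` and `EmailQueue` are hypothetical names, a full queue rejects new messages instead of evicting old ones, and the backoff delay between retries is left out for brevity.

```python
import threading
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, List, Optional

MAX_QUEUE_SIZE = 100  # "Max capacity" from the README
MAX_RETRIES = 5       # "Max retry attempts" from the README


@dataclass
class QueuedMessage:
    uid: int                 # IMAP UID used to identify the message
    system: str              # "pace_notes" or "policy_foo"
    attempts: int = 0        # failed processing attempts so far
    errors: List[str] = field(default_factory=list)  # error history


class EmailQueue:
    def __init__(self, maxlen: int = MAX_QUEUE_SIZE) -> None:
        # The size check is explicit because deque(maxlen=...) would
        # silently discard the oldest entry when a new one is appended.
        self._items: Deque[QueuedMessage] = deque()
        self._maxlen = maxlen
        self._lock = threading.Lock()

    def put(self, msg: QueuedMessage) -> bool:
        """Append a message; return False if the queue is full."""
        with self._lock:
            if len(self._items) >= self._maxlen:
                return False
            self._items.append(msg)
            return True

    def get(self) -> Optional[QueuedMessage]:
        """Pop the oldest message, or return None when empty."""
        with self._lock:
            return self._items.popleft() if self._items else None

    def requeue_failed(self, msg: QueuedMessage, reason: str) -> bool:
        """Record a failure and requeue unless the retry limit is reached."""
        msg.attempts += 1
        msg.errors.append(reason)
        if msg.attempts >= MAX_RETRIES:
            return False
        return self.put(msg)

    def __len__(self) -> int:
        with self._lock:
            return len(self._items)
```

Filling the queue in ascending UID order and consuming with `get` preserves the IMAP UID sequence, since the deque is used strictly first in, first out.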