ThemisDB Transaction Management Best Practices

Version: 1.4.0
Last Updated: 2026-01-18
Target Audience: Application Developers, Database Administrators, Software Architects


Table of Contents

  1. Transaction Isolation Levels
  2. Transaction Sizing
  3. Conflict Handling Strategies
  4. Common Transaction Patterns
  5. Anti-patterns to Avoid
  6. Performance Optimization

Transaction Isolation Levels

ReadCommitted vs Snapshot Detailed Comparison

ReadCommitted Isolation:

// ReadCommitted: Reads latest committed data
auto txn = tm->begin(IsolationLevel::ReadCommitted);

// Time T1: Read user
auto user = txn->get("user:123");  // Reads version at seq=100

// Time T2: Another transaction commits (seq=101)
// Modifies "user:123" → new version at seq=101

// Time T3: Read same user again in same transaction
auto user2 = txn->get("user:123");  // Reads version at seq=101 (different!)

// Non-repeatable read: user != user2
txn->commit();

Snapshot Isolation:

// Snapshot: Consistent point-in-time view
auto txn = tm->begin(IsolationLevel::Snapshot);
// Creates snapshot at seq=100

// Time T1: Read user
auto user = txn->get("user:123");  // Reads version at seq=100

// Time T2: Another transaction commits (seq=101)
// Modifies "user:123" → new version at seq=101

// Time T3: Read same user again
auto user2 = txn->get("user:123");  // Still reads version at seq=100

// Repeatable read: user == user2 (consistent snapshot)
txn->commit();

Key Differences:

| Aspect | ReadCommitted | Snapshot |
|---|---|---|
| Snapshot Created | No | Yes (at BEGIN) |
| Read Consistency | Latest committed data | Point-in-time view |
| Repeatable Reads | ❌ No | ✅ Yes |
| Phantom Reads | ❌ Possible | ✅ Not possible |
| Performance | ✅ Faster (10-20%) | ⚠️ Baseline |
| Memory Usage | ✅ Lower | ⚠️ Higher (snapshot overhead) |
| Snapshot Cleanup | N/A | Required |

When to Use Each Level

Use ReadCommitted For:

Point Queries

# Single-key lookup
def get_user(user_id):
    txn = db.begin(IsolationLevel.ReadCommitted)
    user = txn.get(f"user:{user_id}")
    txn.commit()
    return user

Simple Updates

# Update single record
def update_user_email(user_id, new_email):
    txn = db.begin(IsolationLevel.ReadCommitted)
    user = txn.get(f"user:{user_id}")
    user['email'] = new_email
    txn.put(f"user:{user_id}", user)
    txn.commit()

High-Throughput OLTP

# Process payment (no complex business logic)
def process_payment(payment_id):
    txn = db.begin(IsolationLevel.ReadCommitted)
    payment = txn.get(f"payment:{payment_id}")
    payment['status'] = 'processed'
    txn.put(f"payment:{payment_id}", payment)
    txn.commit()

Use Snapshot For:

Multi-Row Reads

# Generate report with consistent data
def generate_user_report(user_id):
    txn = db.begin(IsolationLevel.Snapshot)  # Consistent view
    
    user = txn.get(f"user:{user_id}")
    orders = txn.query("SELECT * FROM orders WHERE user_id = ?", user_id)
    payments = txn.query("SELECT * FROM payments WHERE user_id = ?", user_id)
    
    # All data from same snapshot → consistent
    report = calculate_report(user, orders, payments)
    txn.commit()
    return report

Business Logic Validation

# Transfer money (requires consistent balance check)
def transfer_money(from_account, to_account, amount):
    txn = db.begin(IsolationLevel.Snapshot)
    
    # Read balances (must be consistent)
    from_balance = txn.get(f"balance:{from_account}")
    to_balance = txn.get(f"balance:{to_account}")
    
    # Validate (consistent snapshot ensures no race condition)
    if from_balance < amount:
        txn.rollback()
        raise InsufficientFundsError()
    
    # Update both accounts
    txn.put(f"balance:{from_account}", from_balance - amount)
    txn.put(f"balance:{to_account}", to_balance + amount)
    
    txn.commit()

Analytics Queries

# Long-running analytics (needs consistent view)
def calculate_daily_revenue(date):
    txn = db.begin(IsolationLevel.Snapshot)
    txn.setTimeout(300000)  # 5 minutes
    
    # All queries see same snapshot
    orders = txn.query("SELECT * FROM orders WHERE date = ?", date)
    refunds = txn.query("SELECT * FROM refunds WHERE date = ?", date)
    
    revenue = sum(o.amount for o in orders) - sum(r.amount for r in refunds)
    
    txn.commit()
    return revenue

Performance Implications

Benchmark Results:

# Workload: 10K transactions, 16 threads, 1KB values

isolation_levels = {
    'ReadCommitted': {
        'throughput_tps': 52000,
        'latency_p50_ms': 0.3,
        'latency_p99_ms': 1.2,
        'memory_mb': 512,
    },
    'Snapshot': {
        'throughput_tps': 43000,  # ~17% slower
        'latency_p50_ms': 0.37,   # +23% latency
        'latency_p99_ms': 1.5,    # +25% latency
        'memory_mb': 768,         # +50% memory
    },
}

for level, metrics in isolation_levels.items():
    print(f"{level}: {metrics}")

Cost Analysis:

def isolation_level_cost_analysis(workload):
    """
    Estimate performance cost of Snapshot vs ReadCommitted.
    """
    baseline_tps = workload['tps']
    
    # Snapshot overhead
    snapshot_overhead_pct = 0.17  # 17% throughput reduction
    memory_overhead_mb = 256  # Additional 256MB for snapshots
    
    snapshot_tps = baseline_tps * (1 - snapshot_overhead_pct)
    
    return {
        'ReadCommitted': {
            'tps': baseline_tps,
            'memory_mb': workload['memory_mb']
        },
        'Snapshot': {
            'tps': snapshot_tps,
            'memory_mb': workload['memory_mb'] + memory_overhead_mb,
            'cost_pct': snapshot_overhead_pct * 100
        }
    }

# Example
result = isolation_level_cost_analysis({
    'tps': 50000,
    'memory_mb': 512
})
print(result)
# Output: Snapshot costs ~17% throughput, +256MB memory

Isolation Level Trade-offs Table

| Criteria | ReadCommitted | Snapshot | Winner |
|---|---|---|---|
| Throughput | 52K TPS | 43K TPS | ReadCommitted |
| Latency (p50) | 0.3ms | 0.37ms | ReadCommitted |
| Latency (p99) | 1.2ms | 1.5ms | ReadCommitted |
| Memory Usage | 512MB | 768MB | ReadCommitted |
| Read Consistency | ❌ Non-repeatable | ✅ Repeatable | Snapshot |
| Phantom Prevention | ❌ No | ✅ Yes | Snapshot |
| Complex Queries | ❌ Inconsistent | ✅ Consistent | Snapshot |
| Analytics | ❌ Not recommended | ✅ Required | Snapshot |
| OLTP | ✅ Ideal | ⚠️ Acceptable | ReadCommitted |
| Financial Logic | ⚠️ Risk of inconsistency | ✅ Safe | Snapshot |
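The trade-offs above can be condensed into a small default-picking helper. This is a sketch only: the workload flags and the rule itself are illustrative conventions, not part of the ThemisDB API.

```python
def choose_isolation_level(multi_key_reads: bool,
                           financial_logic: bool,
                           long_running: bool) -> str:
    """Pick a default isolation level from the trade-offs table.

    Any workload that needs a consistent multi-key view (reports,
    analytics, balance validation) pays the ~17% throughput cost of
    Snapshot; everything else defaults to ReadCommitted.
    """
    if multi_key_reads or financial_logic or long_running:
        return "Snapshot"
    return "ReadCommitted"
```

For example, a simple point-query API handler gets ReadCommitted, while a money transfer (financial logic) gets Snapshot.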

Transaction Sizing

Optimal Transaction Sizes

Guidelines:

| Transaction Size | Operations | Duration | Use Case | Conflict Risk |
|---|---|---|---|---|
| Micro | 1-10 ops | < 1ms | Point queries, cache updates | ✅ Very Low |
| Small | 10-100 ops | 1-10ms | User requests, API calls | ✅ Low |
| Medium | 100-1K ops | 10-100ms | Batch updates, background jobs | ⚠️ Medium |
| Large | 1K-10K ops | 100ms-1s | Data migrations, bulk imports | ⚠️ High |
| XLarge | > 10K ops | > 1s | ❌ Avoid (split into smaller) | ❌ Very High |
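The tiers above can be encoded as a small classifier, e.g. for logging or for warning when a transaction drifts into the danger zone. A sketch; the boundary handling (upper bound inclusive) is a choice, not something the guide prescribes:

```python
def classify_transaction(op_count: int) -> str:
    """Map an operation count onto the sizing tiers in the table above."""
    if op_count <= 10:
        return "Micro"
    if op_count <= 100:
        return "Small"
    if op_count <= 1_000:
        return "Medium"
    if op_count <= 10_000:
        return "Large"
    return "XLarge (avoid: split into smaller transactions)"
```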

Example Sizing:

# Micro transaction: Single operation
def get_user(user_id):
    txn = db.begin()
    user = txn.get(f"user:{user_id}")
    txn.commit()
    return user  # ~1 operation, < 1ms

# Small transaction: Few operations
def create_order(user_id, items):
    txn = db.begin()
    order_id = generate_id()
    
    # Create order
    txn.put(f"order:{order_id}", {
        'user_id': user_id,
        'items': items,
        'status': 'pending'
    })
    
    # Create order items
    for item in items:
        txn.put(f"order_item:{order_id}:{item['id']}", item)
    
    # Update user order count
    user = txn.get(f"user:{user_id}")
    user['order_count'] += 1
    txn.put(f"user:{user_id}", user)
    
    txn.commit()
    return order_id  # ~5-10 operations, 5-10ms

# Medium transaction: Batch processing
def process_daily_orders(orders):
    BATCH_SIZE = 100
    
    for i in range(0, len(orders), BATCH_SIZE):
        batch = orders[i:i + BATCH_SIZE]
        
        txn = db.begin()
        for order in batch:
            order['status'] = 'processed'
            txn.put(f"order:{order['id']}", order)
        txn.commit()
        # ~100 operations, 50-100ms per transaction

Small Transactions vs Large Batches

Small Transaction Approach:

# Pros: Low conflict risk, fast commits
# Cons: Higher overhead, more network roundtrips

def import_users_small_txn(users):
    for user in users:
        txn = db.begin()
        txn.put(f"user:{user['id']}", user)
        txn.commit()
        # Each user in separate transaction
    
# Performance:
# - 1000 users: ~5 seconds
# - Conflict risk: Very low
# - Memory usage: Very low

Large Batch Approach:

# Pros: Lower overhead, fewer commits
# Cons: Higher conflict risk, longer lock duration

def import_users_large_batch(users):
    txn = db.begin()
    for user in users:
        txn.put(f"user:{user['id']}", user)
    txn.commit()
    # All users in single transaction

# Performance:
# - 1000 users: ~2 seconds
# - Conflict risk: High (holds locks longer)
# - Memory usage: High (large WriteBatch)

Balanced Approach (Recommended):

# Best of both worlds: Medium-sized batches

def import_users_balanced(users, batch_size=100):
    for i in range(0, len(users), batch_size):
        batch = users[i:i + batch_size]
        
        txn = db.begin()
        for user in batch:
            txn.put(f"user:{user['id']}", user)
        txn.commit()
        
        # Allow other transactions to proceed
        time.sleep(0.01)

# Performance:
# - 1000 users: ~2.5 seconds
# - Conflict risk: Low (shorter lock duration)
# - Memory usage: Medium (manageable WriteBatch)

Batch Operation Patterns

Pattern 1: Chunked Import

def chunked_import(data, chunk_size=500):
    """
    Import large dataset in chunks.
    """
    total = len(data)
    processed = 0
    
    for i in range(0, total, chunk_size):
        chunk = data[i:i + chunk_size]
        
        txn = db.begin()
        for item in chunk:
            txn.put(f"item:{item['id']}", item)
        
        try:
            txn.commit()
            processed += len(chunk)
            print(f"Progress: {processed}/{total} ({processed/total*100:.1f}%)")
        except ConflictError:
            print(f"Conflict on chunk {i}, retrying...")
            # Retry logic omitted; see execute_with_retry in
            # Conflict Handling Strategies for a reusable wrapper

Pattern 2: Parallel Batching

from concurrent.futures import ThreadPoolExecutor

def parallel_batch_insert(data, num_workers=4, batch_size=100):
    """
    Parallel batch insert with multiple connections.
    """
    def process_batch(batch):
        txn = db.begin()
        for item in batch:
            txn.put(f"item:{item['id']}", item)
        txn.commit()
        return len(batch)
    
    # Partition data into batches
    batches = [
        data[i:i + batch_size]
        for i in range(0, len(data), batch_size)
    ]
    
    # Process in parallel
    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        results = executor.map(process_batch, batches)
    
    return sum(results)

Pattern 3: Streaming Import

def streaming_import(data_stream, batch_size=100):
    """
    Import streaming data with batching.
    """
    batch = []
    
    for item in data_stream:
        batch.append(item)
        
        if len(batch) >= batch_size:
            txn = db.begin()
            for b in batch:
                txn.put(f"item:{b['id']}", b)
            txn.commit()
            batch = []  # Clear batch
    
    # Flush remaining items
    if batch:
        txn = db.begin()
        for b in batch:
            txn.put(f"item:{b['id']}", b)
        txn.commit()

Multi-Statement Transaction Examples

Example 1: Order Processing

def process_order_transaction(order_id):
    """
    Multi-step order processing in single transaction.
    """
    txn = db.begin(IsolationLevel.Snapshot)
    
    try:
        # Step 1: Get order
        order = txn.get(f"order:{order_id}")
        
        # Step 2: Validate inventory
        for item in order['items']:
            inventory = txn.get(f"inventory:{item['product_id']}")
            if inventory['quantity'] < item['quantity']:
                raise InsufficientInventoryError()
        
        # Step 3: Deduct inventory
        for item in order['items']:
            inventory = txn.get(f"inventory:{item['product_id']}")
            inventory['quantity'] -= item['quantity']
            txn.put(f"inventory:{item['product_id']}", inventory)
        
        # Step 4: Update order status
        order['status'] = 'confirmed'
        txn.put(f"order:{order_id}", order)
        
        # Step 5: Create shipment
        shipment_id = generate_id()
        txn.put(f"shipment:{shipment_id}", {
            'order_id': order_id,
            'status': 'pending'
        })
        
        # All or nothing
        txn.commit()
        return shipment_id
        
    except Exception as e:
        txn.rollback()
        raise

Example 2: Money Transfer

def transfer_money(from_account, to_account, amount):
    """
    Atomic money transfer.
    """
    txn = db.begin(IsolationLevel.Snapshot)
    
    try:
        # Read both account balances
        from_balance = txn.get(f"balance:{from_account}")
        to_balance = txn.get(f"balance:{to_account}")
        
        # Validate balance
        if from_balance['amount'] < amount:
            raise InsufficientFundsError()
        
        # Debit from source
        from_balance['amount'] -= amount
        from_balance['last_updated'] = datetime.now()
        txn.put(f"balance:{from_account}", from_balance)
        
        # Credit to destination
        to_balance['amount'] += amount
        to_balance['last_updated'] = datetime.now()
        txn.put(f"balance:{to_account}", to_balance)
        
        # Record transaction
        transfer_id = generate_id()
        txn.put(f"transfer:{transfer_id}", {
            'from': from_account,
            'to': to_account,
            'amount': amount,
            'timestamp': datetime.now()
        })
        
        # Commit atomically
        txn.commit()
        return transfer_id
        
    except Exception as e:
        txn.rollback()
        raise

Conflict Handling Strategies

Write-Write Conflict Detection

Detection Mechanism:

// ThemisDB detects write-write conflicts via RocksDB TransactionDB

Transaction T1:
  BEGIN
  PUT "user:123" = "Alice"  // Acquires write lock
  // ... long-running operation ...
  COMMIT  // Releases lock

Transaction T2 (concurrent):
  BEGIN
  PUT "user:123" = "Bob"  // Tries to acquire write lock
  // BLOCKED or TIMEOUT (conflict detected)
  
// T2 receives Status::Conflict error

Application Handling:

def update_with_conflict_detection(key, value):
    """
    Update with explicit conflict detection.
    """
    txn = db.begin()
    
    try:
        # Attempt write
        txn.put(key, value)
        txn.commit()
        return True
        
    except ConflictError as e:
        txn.rollback()
        print(f"Conflict detected on key: {key}")
        print(f"Error: {e}")
        return False

Automatic Retry Mechanisms

Simple Retry:

def execute_with_retry(operation, max_attempts=3):
    """
    Retry operation on conflict.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConflictError:
            if attempt == max_attempts - 1:
                raise  # Give up
            print(f"Conflict, retrying ({attempt + 1}/{max_attempts})")
            time.sleep(0.1)  # Brief delay

# Usage
def update_counter(counter_id):
    def operation():
        txn = db.begin()
        counter = txn.get(f"counter:{counter_id}")
        counter['value'] += 1
        txn.put(f"counter:{counter_id}", counter)
        txn.commit()
        return counter['value']
    
    return execute_with_retry(operation)

Configuration-Based Retry:

# ThemisDB configuration
transaction:
  enable_auto_retry: true
  max_retry_attempts: 3
  retry_backoff_ms: 10
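When auto-retry is disabled, or for clients that do not honor the server-side setting, the same policy can be reproduced application-side. A minimal sketch: the config keys mirror the YAML above, and `ConflictError` is a stand-in for the driver's conflict exception used throughout this guide.

```python
import time

class ConflictError(Exception):
    """Stand-in for the driver's write-write conflict error."""

RETRY_CONFIG = {
    'enable_auto_retry': True,
    'max_retry_attempts': 3,
    'retry_backoff_ms': 10,
}

def run_with_config_retry(operation, config=RETRY_CONFIG):
    """Apply the configured retry policy to a transactional operation."""
    if not config['enable_auto_retry']:
        return operation()

    attempts = config['max_retry_attempts']
    for attempt in range(attempts):
        try:
            return operation()
        except ConflictError:
            if attempt == attempts - 1:
                raise  # Exhausted retries, surface the conflict
            time.sleep(config['retry_backoff_ms'] / 1000.0)
```

A fixed backoff like this suits low-contention workloads; under heavy contention prefer the exponential backoff shown later in this section.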

Manual Conflict Resolution

Optimistic Locking with Version:

def update_with_version_check(key, update_fn, max_attempts=10):
    """
    Optimistic locking using version field.
    """
    for attempt in range(max_attempts):
        txn = db.begin()
        
        try:
            # Read current version
            data = txn.get(key)
            current_version = data.get('version', 0)
            
            # Apply update
            updated_data = update_fn(data)
            updated_data['version'] = current_version + 1
            
            # Write with version check
            # (Application-level check, not DB-level)
            existing = txn.get(key)
            if existing['version'] != current_version:
                # Version changed, conflict!
                txn.rollback()
                continue
            
            txn.put(key, updated_data)
            txn.commit()
            return updated_data
            
        except ConflictError:
            txn.rollback()
            # Retry
    
    raise MaxRetriesExceededError()

# Usage
def increment_likes(post_id):
    def update(post):
        post['likes'] += 1
        return post
    
    return update_with_version_check(f"post:{post_id}", update)

Last-Write-Wins Strategy:

def last_write_wins_update(key, value):
    """
    Last write wins (no conflict detection).
    """
    txn = db.begin(IsolationLevel.ReadCommitted)
    
    # No read, just write
    txn.put(key, value)
    txn.commit()
    
    # No conflicts (overwrites any concurrent writes)

Exponential Backoff Implementation

Basic Exponential Backoff:

import time
import random

def exponential_backoff_retry(operation, max_attempts=5):
    """
    Retry with exponential backoff.
    """
    base_delay = 0.1  # 100ms
    max_delay = 10.0  # 10 seconds
    
    for attempt in range(max_attempts):
        try:
            return operation()
            
        except ConflictError:
            if attempt == max_attempts - 1:
                raise
            
            # Calculate backoff delay
            delay = min(base_delay * (2 ** attempt), max_delay)
            
            # Add jitter to prevent thundering herd
            jitter = delay * random.random()
            
            sleep_time = delay + jitter
            print(f"Retry {attempt + 1}/{max_attempts} after {sleep_time:.2f}s")
            time.sleep(sleep_time)

# Usage
result = exponential_backoff_retry(lambda: update_counter("global_counter"))

Advanced Backoff with Metrics:

import random
import time

class ExponentialBackoffRetry:
    def __init__(self, max_attempts=5, base_delay=0.1, max_delay=10.0):
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.metrics = {'total_attempts': 0, 'conflicts': 0, 'successes': 0}
    
    def execute(self, operation):
        for attempt in range(self.max_attempts):
            self.metrics['total_attempts'] += 1
            
            try:
                result = operation()
                self.metrics['successes'] += 1
                return result
                
            except ConflictError as e:
                self.metrics['conflicts'] += 1
                
                if attempt == self.max_attempts - 1:
                    raise
                
                delay = min(self.base_delay * (2 ** attempt), self.max_delay)
                jitter = delay * random.random()
                time.sleep(delay + jitter)
    
    def get_metrics(self):
        return self.metrics

# Usage with monitoring
retry_handler = ExponentialBackoffRetry(max_attempts=5)
result = retry_handler.execute(lambda: update_user("user:123"))
print(f"Metrics: {retry_handler.get_metrics()}")

Common Transaction Patterns

CRUD Patterns

Create:

def create_entity(entity_type, entity_id, data):
    """
    Create new entity.
    """
    txn = db.begin()
    
    # Check if exists
    exists = txn.exists(f"{entity_type}:{entity_id}")
    if exists:
        txn.rollback()
        raise EntityAlreadyExistsError()
    
    # Create entity
    data['created_at'] = datetime.now()
    data['updated_at'] = datetime.now()
    txn.put(f"{entity_type}:{entity_id}", data)
    
    # Update index
    index = txn.get(f"index:{entity_type}") or []
    index.append(entity_id)
    txn.put(f"index:{entity_type}", index)
    
    txn.commit()
    return entity_id

Read:

def read_entity(entity_type, entity_id):
    """
    Read entity (a short ReadCommitted transaction suffices for a single read).
    """
    txn = db.begin(IsolationLevel.ReadCommitted)
    entity = txn.get(f"{entity_type}:{entity_id}")
    txn.commit()
    return entity

Update:

def update_entity(entity_type, entity_id, updates):
    """
    Update entity with optimistic locking.
    """
    def operation():
        txn = db.begin(IsolationLevel.Snapshot)
        
        # Read current state
        entity = txn.get(f"{entity_type}:{entity_id}")
        
        # Apply updates
        for key, value in updates.items():
            entity[key] = value
        entity['updated_at'] = datetime.now()
        
        # Write back
        txn.put(f"{entity_type}:{entity_id}", entity)
        txn.commit()
        
        return entity
    
    return execute_with_retry(operation)

Delete:

def delete_entity(entity_type, entity_id):
    """
    Delete entity and clean up references.
    """
    txn = db.begin()
    
    # Delete entity
    txn.delete(f"{entity_type}:{entity_id}")
    
    # Clean up index
    index = txn.get(f"index:{entity_type}") or []
    if entity_id in index:
        index.remove(entity_id)
    txn.put(f"index:{entity_type}", index)
    
    # Soft delete alternative: Mark as deleted
    # entity = txn.get(f"{entity_type}:{entity_id}")
    # entity['deleted'] = True
    # entity['deleted_at'] = datetime.now()
    # txn.put(f"{entity_type}:{entity_id}", entity)
    
    txn.commit()

Distributed Transactions (SAGA Pattern)

SAGA Pattern Implementation:

class Saga:
    """
    SAGA pattern for distributed transactions.
    """
    def __init__(self):
        self.steps = []
        self.compensations = []
    
    def add_step(self, forward_fn, compensate_fn):
        """
        Add step with compensation.
        """
        self.steps.append(forward_fn)
        self.compensations.append(compensate_fn)
    
    def execute(self):
        """
        Execute SAGA with automatic compensation on failure.
        """
        executed_steps = []
        
        try:
            # Execute forward steps
            for i, step in enumerate(self.steps):
                result = step()
                executed_steps.append(i)
                
        except Exception as e:
            # Rollback executed steps in reverse order
            print(f"SAGA failed at step {len(executed_steps)}: {e}")
            
            for step_index in reversed(executed_steps):
                compensate_fn = self.compensations[step_index]
                try:
                    compensate_fn()
                except Exception as comp_error:
                    print(f"Compensation failed: {comp_error}")
            
            raise
        
        return True

# Example: Multi-service order processing
def process_order_saga(order_id):
    saga = Saga()
    
    # Step 1: Reserve inventory
    def reserve_inventory():
        return inventory_service.reserve(order_id)
    
    def release_inventory():
        inventory_service.release(order_id)
    
    saga.add_step(reserve_inventory, release_inventory)
    
    # Step 2: Charge payment
    def charge_payment():
        return payment_service.charge(order_id)
    
    def refund_payment():
        payment_service.refund(order_id)
    
    saga.add_step(charge_payment, refund_payment)
    
    # Step 3: Create shipment
    def create_shipment():
        return shipment_service.create(order_id)
    
    def cancel_shipment():
        shipment_service.cancel(order_id)
    
    saga.add_step(create_shipment, cancel_shipment)
    
    # Execute SAGA
    return saga.execute()

Compensating Transactions

Pattern:

def transfer_with_compensation(from_account, to_account, amount):
    """
    Transfer money across two independent transactions (e.g. when the
    accounts live behind different services), with explicit compensation.
    Note: within a single local transaction, rollback already undoes
    uncommitted work — compensation is only needed for steps that have
    already committed.
    """
    # Step 1: Debit source (commits independently)
    txn = db.begin()
    from_balance = txn.get(f"balance:{from_account}")
    from_balance['amount'] -= amount
    txn.put(f"balance:{from_account}", from_balance)
    txn.commit()

    try:
        # Step 2: Credit destination (separate transaction)
        txn = db.begin()
        to_balance = txn.get(f"balance:{to_account}")
        to_balance['amount'] += amount
        txn.put(f"balance:{to_account}", to_balance)
        txn.commit()
        return True

    except Exception as e:
        txn.rollback()

        # Step 1 already committed, so rollback cannot undo it:
        # compensate by re-crediting the source account
        compensate_txn = db.begin()
        from_balance = compensate_txn.get(f"balance:{from_account}")
        from_balance['amount'] += amount
        compensate_txn.put(f"balance:{from_account}", from_balance)
        compensate_txn.commit()

        raise

Atomic Multi-Key Updates

Pattern:

def atomic_multi_key_update(updates):
    """
    Update multiple keys atomically.
    """
    txn = db.begin(IsolationLevel.Snapshot)
    
    try:
        # Read all keys first (for validation)
        current_values = {}
        for key in updates.keys():
            current_values[key] = txn.get(key)
        
        # Validate updates
        for key, new_value in updates.items():
            if not validate_update(current_values[key], new_value):
                raise ValidationError(f"Invalid update for key: {key}")
        
        # Apply all updates
        for key, new_value in updates.items():
            txn.put(key, new_value)
        
        # All or nothing
        txn.commit()
        return True
        
    except Exception as e:
        txn.rollback()
        raise

Anti-patterns to Avoid

Long-Running Transactions

❌ Anti-pattern:

# BAD: Long-running transaction
def bad_batch_process():
    txn = db.begin()
    
    # Process millions of records (takes hours!)
    for i in range(10_000_000):
        record = fetch_record(i)
        processed = process(record)
        txn.put(f"record:{i}", processed)
    
    txn.commit()  # Holds locks for hours!

✅ Solution:

# GOOD: Chunked processing
def good_batch_process():
    CHUNK_SIZE = 1000
    
    for chunk_start in range(0, 10_000_000, CHUNK_SIZE):
        txn = db.begin()
        
        for i in range(chunk_start, min(chunk_start + CHUNK_SIZE, 10_000_000)):
            record = fetch_record(i)
            processed = process(record)
            txn.put(f"record:{i}", processed)
        
        txn.commit()  # Releases locks quickly
        time.sleep(0.01)  # Let other transactions proceed

Nested Transactions

❌ Anti-pattern:

# BAD: Nested transactions (not supported)
def bad_nested_txn():
    outer_txn = db.begin()
    
    # Do some work
    outer_txn.put("key1", "value1")
    
    # Try to nest (WILL FAIL!)
    inner_txn = db.begin()  # Error or undefined behavior
    inner_txn.put("key2", "value2")
    inner_txn.commit()
    
    outer_txn.commit()

✅ Solution:

# GOOD: Sequential transactions
def good_sequential_txn():
    # First transaction
    txn1 = db.begin()
    txn1.put("key1", "value1")
    txn1.commit()
    
    # Second transaction
    txn2 = db.begin()
    txn2.put("key2", "value2")
    txn2.commit()

Deadlock-Prone Patterns

❌ Anti-pattern:

# BAD: Different lock order in concurrent transactions
# Transaction 1:
def transfer_a_to_b():
    txn = db.begin()
    a = txn.get("account:A")  # Lock A
    b = txn.get("account:B")  # Lock B
    # ... transfer logic ...
    txn.commit()

# Transaction 2:
def transfer_b_to_a():
    txn = db.begin()
    b = txn.get("account:B")  # Lock B
    a = txn.get("account:A")  # Lock A (DEADLOCK!)
    # ... transfer logic ...
    txn.commit()

✅ Solution:

# GOOD: Consistent lock order
def transfer(from_account, to_account, amount):
    # Always lock in alphabetical order
    first_account = min(from_account, to_account)
    second_account = max(from_account, to_account)
    
    txn = db.begin()
    
    first = txn.get(f"account:{first_account}")
    second = txn.get(f"account:{second_account}")
    
    # Determine which is from/to
    if from_account == first_account:
        first['balance'] -= amount
        second['balance'] += amount
    else:
        first['balance'] += amount
        second['balance'] -= amount
    
    txn.put(f"account:{first_account}", first)
    txn.put(f"account:{second_account}", second)
    
    txn.commit()

Resource Exhaustion Patterns

❌ Anti-pattern:

# BAD: Unbounded transaction growth
def bad_unlimited_txn():
    txn = db.begin()
    
    # Read unbounded data
    for record in unbounded_stream():  # Could be millions of records!
        txn.put(f"record:{record.id}", record)
    
    txn.commit()  # OOM or transaction too large error

✅ Solution:

# GOOD: Bounded transaction size
def good_bounded_txn(max_txn_size=1000):
    txn = db.begin()
    count = 0
    
    for record in stream():
        txn.put(f"record:{record.id}", record)
        count += 1
        
        if count >= max_txn_size:
            txn.commit()
            txn = db.begin()  # Start new transaction
            count = 0
    
    if count > 0:
        txn.commit()  # Commit remaining

Performance Optimization

Batch Insert/Update Patterns

Optimized Batch Insert:

def optimized_batch_insert(entities, batch_size=1000):
    """
    High-performance batch insert.
    """
    total = len(entities)
    inserted = 0
    start_time = time.time()
    
    for i in range(0, total, batch_size):
        batch = entities[i:i + batch_size]
        
        # Use ReadCommitted for faster writes
        txn = db.begin(IsolationLevel.ReadCommitted)
        
        for entity in batch:
            txn.put(f"entity:{entity['id']}", entity)
        
        txn.commit()
        
        inserted += len(batch)
        
        # Progress reporting
        if inserted % 10000 == 0:
            elapsed = time.time() - start_time
            rate = inserted / elapsed
            eta = (total - inserted) / rate
            print(f"Inserted {inserted}/{total} ({rate:.0f} ops/sec, ETA: {eta:.0f}s)")
    
    elapsed = time.time() - start_time
    print(f"Completed: {total} entities in {elapsed:.2f}s ({total/elapsed:.0f} ops/sec)")

Connection Pooling Recommendations

Configuration:

from themisdb import ConnectionPool

# Create connection pool
pool = ConnectionPool(
    host='localhost',
    port=8765,
    min_connections=5,   # Minimum pool size
    max_connections=50,  # Maximum pool size
    max_idle_time=300,   # 5 minutes idle timeout
    health_check_interval=30  # Health check every 30s
)

# Use connection from pool
def handle_request(request):
    with pool.get_connection() as conn:
        # Connection automatically returned to pool
        result = conn.query(request.sql)
        return result

# Metrics
print(pool.get_stats())
# Output: {'total': 50, 'active': 25, 'idle': 25, 'wait_count': 5}

Best Practices:

| Scenario | Min Connections | Max Connections | Rationale |
|---|---|---|---|
| Low Traffic | 2-5 | 10-20 | Minimize idle connections |
| Medium Traffic | 10-20 | 50-100 | Balance responsiveness and resources |
| High Traffic | 20-50 | 100-200 | Prevent connection creation overhead |
| Burst Traffic | 10 | 500+ | Handle spikes |
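A common way to derive the max-connections figures above is Little's law: concurrent connections in use ≈ peak throughput × mean query latency, plus headroom for spikes. A quick sketch (pure arithmetic, no ThemisDB API involved; the 1.5× headroom factor is an illustrative default):

```python
import math

def required_connections(peak_qps: float, mean_latency_s: float,
                         headroom: float = 1.5) -> int:
    """Estimate pool size via Little's law (L = lambda * W), plus headroom."""
    return math.ceil(peak_qps * mean_latency_s * headroom)

# e.g. 2000 queries/sec at 10ms mean latency:
# 2000 * 0.01 * 1.5 = 30 connections
```

If the estimate lands well above your tier's max-connections ceiling, that is usually a signal to reduce per-query latency or shard the workload rather than simply enlarging the pool.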

Statement Preparation/Caching

Query Preparation:

# Prepare frequently used queries
prepared_queries = {
    'get_user': db.prepare("SELECT * FROM users WHERE id = ?"),
    'get_orders': db.prepare("SELECT * FROM orders WHERE user_id = ? AND status = ?"),
    'update_user': db.prepare("UPDATE users SET email = ? WHERE id = ?"),
}

# Execute prepared query
def get_user(user_id):
    return prepared_queries['get_user'].execute(user_id)

# Performance benefit: ~10-20% faster than ad-hoc queries

Timeout Configuration

Workload-Specific Timeouts:

# Production configuration
transaction:
  # Default timeout for all transactions
  default_timeout_ms: 30000  # 30 seconds
  
  # Per-operation type timeouts
  operation_timeouts:
    point_read: 1000      # 1 second
    range_scan: 5000      # 5 seconds
    batch_write: 60000    # 60 seconds
    analytics: 300000     # 5 minutes
  
  # Lock timeouts
  lock_timeout_ms: 10000  # 10 seconds

Application-Level Override:

# Short timeout for interactive request
def get_user_for_api(user_id):
    txn = db.begin()
    txn.setTimeout(1000)  # 1 second timeout
    user = txn.get(f"user:{user_id}")
    txn.commit()
    return user

# Long timeout for batch job
def batch_export():
    txn = db.begin()
    txn.setTimeout(300000)  # 5 minutes
    data = txn.query("SELECT * FROM large_table")
    txn.commit()
    return data


Document Version: 1.0
ThemisDB Compatibility: 1.4.0+
Last Reviewed: 2026-01-18