Integration Examples

This directory contains example pipelines that integrate with external services, APIs, and cloud platforms. They show how to build robust, production-ready pipelines that handle authentication, rate limits, and failures in those external dependencies.

Examples Overview

Model Context Protocol Tools Integration

  • Comprehensive MCP tool usage
  • Automatic tool detection and server management
  • Multi-tool coordination workflows
  • Project analysis with multiple tools

# Analyze a project with comprehensive scope
python scripts/execution/run_pipeline.py examples/integrations/mcp_tools.yaml \
  -i project_directory="examples/test_project" \
  -i analysis_scope="comprehensive" \
  -i auto_fix=false

# Security-focused analysis with auto-fix
python scripts/execution/run_pipeline.py examples/integrations/mcp_tools.yaml \
  -i project_directory="src/" \
  -i analysis_scope="security_focused" \
  -i auto_fix=true

External API Integration

  • Multi-source data aggregation
  • Authentication and rate limiting
  • Error handling and retry logic
  • Cross-platform data synthesis

# Research across multiple API sources
python scripts/execution/run_pipeline.py examples/integrations/external_apis.yaml \
  -i research_topic="artificial intelligence trends" \
  -i data_sources='["news", "academic", "social"]' \
  -i max_results_per_source=15

# Financial data integration
python scripts/execution/run_pipeline.py examples/integrations/external_apis.yaml \
  -i research_topic="market volatility" \
  -i data_sources='["news", "financial"]' \
  -i include_sentiment=true

Cloud Services Integration

  • Multi-cloud platform support (AWS, Azure, GCP)
  • Cloud AI service utilization
  • Serverless function orchestration
  • Cross-cloud performance comparison

# AWS cloud processing pipeline
python scripts/execution/run_pipeline.py examples/integrations/cloud_services.yaml \
  -i cloud_provider="aws" \
  -i operation_type="ai_pipeline" \
  -i data_file="examples/data/sample_data.json"

# Multi-cloud comparison analysis
python scripts/execution/run_pipeline.py examples/integrations/cloud_services.yaml \
  -i cloud_provider="multi_cloud" \
  -i operation_type="ai_pipeline" \
  -i data_file="examples/data/large_dataset.json"

Integration Patterns Demonstrated

🔗 MCP Tool Integration

  • Automatic Detection: Tools automatically detected from pipeline requirements
  • Server Management: MCP servers started and managed automatically
  • Multi-Tool Coordination: Complex workflows using multiple tools in sequence
  • Error Resilience: Graceful handling when tools are unavailable
  • State Management: Persistent storage of analysis results

🌍 External API Patterns

  • Authentication Handling: Multiple auth methods (API keys, OAuth, tokens)
  • Rate Limiting: Respect API rate limits with intelligent backoff
  • Data Normalization: Standardize data from different API formats
  • Error Recovery: Retry logic and fallback strategies
  • Response Validation: Quality checks on API responses
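
The data normalization pattern above can be sketched in Python as a per-source field mapping onto one shared schema. This is a minimal illustration, not the pipeline's implementation: `FIELD_MAPS`, `normalize`, and every field name are hypothetical and do not reflect the schema of any real API.

```python
# Hypothetical payload shapes: the field names below are illustrative
# assumptions, not the schemas of any real news or academic API.
FIELD_MAPS = {
    "news": {"title": "headline", "url": "link", "published": "publishedAt"},
    "academic": {"title": "paper_title", "url": "doi_url", "published": "year"},
}


def normalize(source, raw):
    """Map one raw record from `source` onto a shared title/url/published schema."""
    mapping = FIELD_MAPS[source]  # a KeyError here means an unsupported source
    # Missing raw fields become None instead of raising, so one sparse
    # record does not abort the whole aggregation.
    record = {field: raw.get(raw_key) for field, raw_key in mapping.items()}
    record["source"] = source
    return record
```

Keeping the mapping in data rather than in branching code means a new source only needs a new entry in the table.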

☁️ Cloud Service Patterns

  • Multi-Cloud Support: Single pipeline works across cloud providers
  • Service Abstraction: Unified interface for similar services
  • Cost Optimization: Intelligent service selection based on requirements
  • Hybrid Workflows: Combine on-premises and cloud processing
  • Performance Monitoring: Track and compare cloud service performance

Advanced Integration Features

🛡️ Security and Authentication

API Key Management:

# Secure API key usage
headers:
  Authorization: "Bearer {{ env.API_KEY }}"
  User-Agent: "Orchestrator Bot 1.0"

Cloud Authentication:

# Cloud service authentication
tool: aws-s3
parameters:
  region: "{{ config.aws.region }}"
  # Uses AWS credentials from environment or IAM roles

MCP Tool Security:

# Automatic tool availability checking
tool: filesystem
action: read
on_failure: skip  # Graceful degradation when tools unavailable

Performance Optimization

Parallel API Calls:

foreach: "{{ api_endpoints }}"
parallel: true
max_concurrent: 3  # Respect rate limits
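
For readers who want to see what a bounded parallel loop amounts to, here is a rough Python equivalent using the standard library. It assumes each call is an independent blocking function; `fetch_all` and its `fetch` argument are hypothetical names, not part of the pipeline runner.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(api_endpoints, fetch, max_concurrent=3):
    """Call fetch(endpoint) for every endpoint, at most max_concurrent at a time.

    `fetch` is a caller-supplied (hypothetical) function that performs one request.
    """
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # Executor.map returns results in input order, regardless of
        # which request finishes first.
        return list(pool.map(fetch, api_endpoints))
```

Note that a concurrency cap bounds the number of in-flight requests, not the number of requests per hour, so it complements rather than replaces a quota-aware rate limiter.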

Intelligent Caching:

# Store results for reuse
tool: memory
action: store
parameters:
  key: "cache_{{ query | hash }}"
  ttl: 3600  # 1 hour cache
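
The caching step above (hashed key, one-hour TTL) can be illustrated with a small in-memory cache. This is a sketch under assumptions: `TTLCache` is a hypothetical class, and SHA-256 stands in for whatever the `hash` filter actually computes.

```python
import hashlib
import time


class TTLCache:
    """In-memory cache whose entries expire `ttl` seconds after being stored."""

    def __init__(self, ttl=3600, clock=time.monotonic):
        self.ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries = {}

    @staticmethod
    def key_for(query):
        # Assumption: SHA-256 of the query text; the pipeline's `hash`
        # filter may use a different function.
        return "cache_" + hashlib.sha256(query.encode("utf-8")).hexdigest()

    def store(self, query, value):
        self._entries[self.key_for(query)] = (self._clock() + self.ttl, value)

    def retrieve(self, query, default=None):
        key = self.key_for(query)
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry
            return default
        return value
```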

Conditional Processing:

# Skip expensive operations when not needed
condition: "{{ 'premium' in user_tier }}"

🔄 Error Handling and Resilience

Retry Strategies:

retry: 3
backoff_strategy: "exponential"
max_backoff_time: 30
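
To make these settings concrete, the following Python sketch shows one common way to implement retry with capped exponential backoff. It is not the orchestrator's own code: `retry_with_backoff`, `base_delay`, and the added jitter are assumptions made for illustration.

```python
import random
import time


def retry_with_backoff(func, retries=3, base_delay=1.0, max_backoff_time=30.0,
                       sleep=time.sleep):
    """Call func(); after a failure, retry up to `retries` more times.

    `base_delay` and the jitter are illustrative assumptions; only `retries`
    and `max_backoff_time` mirror the settings shown above.
    """
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: surface the last error
            # The delay ceiling doubles on each attempt, capped at max_backoff_time.
            ceiling = min(base_delay * (2 ** attempt), max_backoff_time)
            # Full jitter: a random wait in [0, ceiling] keeps concurrent
            # callers from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
            attempt += 1
```

With `retries=3` the function makes at most four calls in total: the initial attempt plus three retries.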

Graceful Degradation:

on_failure: continue  # Keep processing other sources
fallback_action: use_cached_data
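
In Python terms, continuing past a failed source and falling back to cached data might look like the sketch below. `collect_sources`, `fetch`, and `cached` are hypothetical names chosen for the example.

```python
def collect_sources(sources, fetch, cached):
    """Fetch every source, continuing past failures.

    Returns (results, failed). A failed source is filled from `cached`
    when a cached copy exists; otherwise it is simply left out of results.
    """
    results, failed = {}, []
    for source in sources:
        try:
            results[source] = fetch(source)
        except Exception:
            failed.append(source)  # remember the failure for reporting
            if source in cached:
                results[source] = cached[source]  # degrade to stale data
    return results, failed
```

Returning the list of failed sources alongside the results lets the caller report that some data is stale or missing instead of hiding the degradation.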

Circuit Breaker Pattern:

# Automatically disable failing services
max_consecutive_failures: 5
disable_duration: 300  # 5 minutes
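
The two settings above map onto a very small state machine. The class below is a minimal sketch of the circuit breaker pattern, assuming a simple "disabled until" timestamp; `CircuitBreaker` and its method names are hypothetical, not the orchestrator's API.

```python
import time


class CircuitBreaker:
    """Disable a service after repeated consecutive failures, then re-enable it later."""

    def __init__(self, max_consecutive_failures=5, disable_duration=300.0,
                 clock=time.monotonic):
        self.max_consecutive_failures = max_consecutive_failures
        self.disable_duration = disable_duration
        self._clock = clock  # injectable so tests do not have to wait
        self._failures = 0
        self._disabled_until = None

    def allow(self):
        """Return True if a call to the service may be attempted now."""
        if self._disabled_until is None:
            return True
        if self._clock() >= self._disabled_until:
            # Cool-down elapsed: re-enable and start counting afresh.
            self._disabled_until = None
            self._failures = 0
            return True
        return False

    def record_success(self):
        self._failures = 0  # any success breaks the failure streak

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.max_consecutive_failures:
            self._disabled_until = self._clock() + self.disable_duration
```

A caller checks `allow()` before each request and reports the outcome with `record_success()` or `record_failure()`.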

Service-Specific Examples

📊 Database Integration

# Multi-database support
tool: database
action: query
parameters:
  connection_string: "{{ env.DATABASE_URL }}"
  query: "SELECT * FROM research WHERE topic = ?"
  parameters: ["{{ topic }}"]
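
The query above passes `topic` as a bound parameter rather than interpolating it into the SQL string. The sketch below shows the same idea with Python's built-in `sqlite3` module; the `research` table layout and the `find_research` helper are assumptions made for the example. The `?` placeholder is SQLite's style, and other drivers differ (psycopg for PostgreSQL uses `%s`, for instance).

```python
import sqlite3


def find_research(conn, topic):
    """Return rows for `topic` using a bound parameter, never string formatting."""
    cursor = conn.execute(
        "SELECT id, topic, title FROM research WHERE topic = ?",
        (topic,),  # parameters are passed separately from the SQL text
    )
    return cursor.fetchall()


# Hypothetical in-memory table so the example runs without a database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE research (id INTEGER PRIMARY KEY, topic TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO research (topic, title) VALUES (?, ?)",
    [("ai", "Trends in AI"), ("markets", "Market volatility")],
)
```

Because the value is bound rather than concatenated, an input such as `ai' OR '1'='1` is matched as a literal topic string and returns no rows.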

📈 Analytics Integration

# Analytics service integration
tool: analytics
action: track_event
parameters:
  event: "pipeline_execution"
  properties:
    pipeline_id: "{{ pipeline.id }}"
    duration: "{{ execution_time }}"

📧 Notification Integration

# Multi-channel notifications
tool: notifications
action: send
parameters:
  channels: ["email", "slack", "webhook"]
  message: "Pipeline completed: {{ pipeline.name }}"
  metadata: "{{ outputs }}"

Requirements by Example

MCP Tools Example

  • MCP Server: Orchestrator MCP server running
  • Tool Availability: filesystem, code-analyzer, git, memory, terminal, browser tools
  • Permissions: File system read/write access
  • Dependencies: Git repository (optional)

External APIs Example

  • API Keys: News API, Academic API, Social API, Financial API keys
  • Network Access: Outbound HTTP/HTTPS requests
  • Rate Limits: Enough API quota for the requested number of results per source
  • Data Storage: Database for result persistence (optional)

Cloud Services Example

  • Cloud Credentials: AWS, Azure, and/or GCP authentication
  • Service Permissions: Storage, compute, AI service access
  • Network Access: Cloud service endpoints
  • Resource Quotas: Sufficient cloud service limits

Configuration Management

Environment Variables

# API Keys
export NEWS_API_KEY="your_news_api_key"
export ANTHROPIC_API_KEY="your_anthropic_key"
export OPENAI_API_KEY="your_openai_key"

# Cloud Credentials  
export AWS_ACCESS_KEY_ID="your_aws_key"
export AWS_SECRET_ACCESS_KEY="your_aws_secret"
export AZURE_CLIENT_ID="your_azure_client_id"
export GOOGLE_APPLICATION_CREDENTIALS="path/to/gcp-key.json"

# Database
export DATABASE_URL="postgresql://user:pass@localhost/db"

Configuration Files

# config/integrations.yaml
apis:
  rate_limits:
    default: 100  # requests per hour
    premium: 1000
  timeouts:
    default: 30   # seconds
    file_upload: 300

cloud:
  preferred_regions:
    aws: "us-east-1"
    azure: "East US"  
    gcp: "us-central1"
  cost_optimization: true
  auto_scaling: true

Best Practices

🎯 Design Principles

  • Fail Fast: Validate external dependencies early
  • Graceful Degradation: Continue processing when non-critical services fail
  • Idempotency: Ensure operations can be safely retried
  • Monitoring: Track external service performance and reliability
  • Security: Never expose credentials in pipeline definitions
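
As an illustration of the fail-fast principle, a pipeline entry point could check its required environment variables before any external call is made. `require_env_vars` and `missing_env_vars` are hypothetical helpers, and the variable names in the usage line are taken from the configuration examples in this README.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


def require_env_vars(required, environ=None):
    """Raise before any external call is made if configuration is incomplete."""
    missing = missing_env_vars(required, environ)
    if missing:
        # Report names only; never echo credential values into logs.
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
```

Typical use would be a single call such as `require_env_vars(["NEWS_API_KEY", "DATABASE_URL"])` at startup.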

🔧 Implementation Guidelines

  • Rate Limiting: Respect external service limits
  • Caching: Cache expensive API calls when appropriate
  • Timeout Handling: Set reasonable timeouts for all external calls
  • Error Classification: Distinguish between retryable and permanent errors
  • Documentation: Document external service dependencies clearly
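
Error classification is the guideline that most directly shapes retry logic. A minimal sketch for HTTP APIs follows; the set of retryable codes is a common convention rather than a rule taken from this project, and `classify_status` is a hypothetical helper.

```python
# Status codes that usually signal a transient condition. This set is a
# common convention, not something mandated by this project.
RETRYABLE_STATUS_CODES = frozenset({408, 429, 500, 502, 503, 504})


def classify_status(status_code):
    """Classify an HTTP status as 'success', 'retryable', or 'permanent'."""
    if 200 <= status_code < 300:
        return "success"
    if status_code in RETRYABLE_STATUS_CODES:
        return "retryable"
    # Everything else (bad request, auth failure, not found, ...) will not
    # succeed if the same request is simply sent again.
    return "permanent"
```

Only responses classified as `retryable` should be fed into a backoff loop; retrying a 401 or 404 wastes quota without changing the outcome.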

📊 Monitoring and Observability

  • Success Rates: Track API call success/failure rates
  • Response Times: Monitor external service performance
  • Cost Tracking: Monitor cloud service usage and costs
  • Alert Thresholds: Set alerts for service degradation
  • Dependency Health: Monitor external service status
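
Success rates and response times can be tracked with very little code. The class below is an in-memory sketch only; `ServiceStats` is a hypothetical name, and a real deployment would more likely export these numbers to a metrics system.

```python
class ServiceStats:
    """Accumulate per-service call outcomes and durations in memory."""

    def __init__(self):
        self._stats = {}

    def record(self, service, succeeded, seconds):
        entry = self._stats.setdefault(
            service, {"ok": 0, "failed": 0, "seconds": 0.0}
        )
        entry["ok" if succeeded else "failed"] += 1
        entry["seconds"] += seconds

    def success_rate(self, service):
        """Fraction of successful calls, or None if nothing was recorded."""
        entry = self._stats.get(service)
        if entry is None:
            return None
        return entry["ok"] / (entry["ok"] + entry["failed"])

    def mean_response_time(self, service):
        """Average duration in seconds, or None if nothing was recorded."""
        entry = self._stats.get(service)
        if entry is None:
            return None
        return entry["seconds"] / (entry["ok"] + entry["failed"])
```

An alert threshold then becomes a simple comparison, for example flagging a service when `success_rate` drops below a chosen value.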

Troubleshooting

Common Integration Issues

Authentication Failures:

  • Check API keys are correctly configured
  • Verify cloud credentials have sufficient permissions
  • Check for expired tokens or certificates

Rate Limiting:

  • Reduce concurrent request limits
  • Implement exponential backoff
  • Consider upgrading API plans for higher limits

Network Issues:

  • Check firewall and proxy settings
  • Verify DNS resolution for service endpoints
  • Test connectivity with simple curl commands

Service Unavailability:

  • Implement circuit breaker patterns
  • Use health check endpoints when available
  • Have fallback strategies for critical dependencies

Performance Optimization

Slow API Responses:

  • Implement timeout handling
  • Use parallel processing where appropriate
  • Cache frequently accessed data
  • Choose service regions geographically close to where the pipeline runs

High Costs:

  • Monitor cloud service usage
  • Use spot instances and reserved capacity
  • Implement auto-scaling policies
  • Review costs regularly for optimization opportunities

Security Considerations

🔒 Authentication Security

  • Store credentials securely (environment variables, key vaults)
  • Use least-privilege access principles
  • Rotate credentials regularly
  • Monitor for credential compromise

🛡️ Data Security

  • Encrypt data in transit and at rest
  • Implement proper access controls
  • Log and monitor data access
  • Follow data residency requirements

🚨 Monitoring Security

  • Track unusual API usage patterns
  • Monitor for credential stuffing attacks
  • Set up alerts for security events
  • Audit integrations for security issues regularly

Next Steps

After mastering integration examples, explore:

Contributing

When creating new integration examples:

  • Include comprehensive error handling
  • Document all external dependencies
  • Provide configuration templates
  • Add monitoring and observability features
  • Include security best practices
  • Test across different environments