
POC of AI-Assisted Testing for MSF OpenMRS → DHIS2 Integration #185

@AishaHassen

Description

Background

The MSF project involves mapping a high volume of forms and data elements from OpenMRS to DHIS2 using a dynamically read metadata file. Testing is currently time-consuming, particularly for custom mappings that require transformations, value splitting, and conditional logic (e.g., mapping multi-pick lists to booleans). With 50-60+ new forms being added continuously, we need to automate testing to improve efficiency and reduce errors.

Problem Statement

Testing the integration between OpenMRS and DHIS2 is challenging because:

  • Standard mappings are mostly one-to-one but still need verification
  • Custom mappings require complex transformations and conditional logic that are error-prone
  • Manual generation of test data and expected outputs is tedious
  • Adequate coverage requires testing many scenarios, including null values and all option combinations
  • Custom logic specifications are sometimes written in difficult-to-parse formats
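As a concrete illustration of the custom-mapping problem, a single multi-pick answer in OpenMRS may need to fan out into several boolean data elements in DHIS2. A minimal sketch (the option codes below are hypothetical, not from the MSF metadata):

```javascript
// Hypothetical example: fan a multi-pick OpenMRS answer out into boolean flags.
// The option list is illustrative only; the real codes come from the metadata file.
const OPTIONS = ['fever', 'cough', 'fatigue'];

function multiPickToBooleans(selected) {
  // selected: array of picked option codes from the OpenMRS observation
  const flags = {};
  for (const option of OPTIONS) {
    flags[option] = selected.includes(option); // one boolean per option
  }
  return flags;
}

// multiPickToBooleans(['fever', 'fatigue'])
//   → { fever: true, cough: false, fatigue: true }
```

Each such transformation is a small piece of logic, but with dozens per form and 50-60+ forms, hand-verifying them all is exactly the burden this POC aims to remove.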

Proposed Solution

Use an LLM (Claude) to automate test generation and validation through the following approach:

Phase 1: Proof of Concept (Single Form)

  1. Input Generation: LLM generates various input data scenarios to ensure sufficient coverage, including edge cases, null values, and all option combinations
  2. Expected Output Generation: LLM uses only the metadata file to generate expected DHIS2 event payloads (not the project's existing YAML code, to avoid biasing tests)
  3. Test Execution: OpenFn workflow reads inputs from Google Sheet, executes MSF workflow, captures actual output, and writes back to the same sheet
  4. Assertion: An Apps Script function in the Google Sheet compares actual vs. expected outputs

Architecture Overview

Google Sheet (Test Cases)
    ↓ (read test inputs)
OpenFn Test Workflow
    ↓ (execute)
MSF Production Workflow
    ↓ (capture actual output)
OpenFn Test Workflow
    ↓ (write actual output)
Google Sheet (Test Cases + Actual Outputs)
    ↓ (run assertion script)
Test Results (Pass/Fail + Differences)

Key Design Decisions

  • Expected output = the events mapping JSON body sent to DHIS2 (not the DHIS2 response after event creation)
  • LLM should work from metadata file only to generate expected outputs
  • Final assertion should be deterministic JavaScript code in Google Sheet
  • Google Sheet serves as both input source and results repository
  • Start with manual POC before building automation workflow
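For reference, the "events mapping JSON body" follows the DHIS2 events API shape. A hypothetical expected_output value might look like the following (all UIDs, dates, and values are invented for illustration):

```javascript
// Hypothetical expected_output: a DHIS2 events payload.
// All UIDs and values below are invented; real ones come from the metadata file.
const expectedOutput = {
  events: [
    {
      program: 'kLqPkZnrxWX',   // DHIS2 program UID (invented)
      orgUnit: 'DiszpKrYNg8',   // org unit UID (invented)
      eventDate: '2024-01-15',
      status: 'COMPLETED',
      dataValues: [
        { dataElement: 'aBcDeFgHiJk', value: 'true' }, // e.g. a mapped boolean
        { dataElement: 'lMnOpQrStUv', value: '37.5' },
      ],
    },
  ],
};
```

Because this is the body sent to DHIS2 rather than the DHIS2 response, tests stay deterministic and never depend on server-assigned IDs.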

Deliverables

1. Google Sheet Template

Create a Google Sheet with the following structure (@jfeldman-openfn I think you already had the columns set, does this align with what you had?):

Required Columns:

  • test_id - Unique identifier for each test case
  • form_name - Name of the form being tested
  • test_scenario - Description of what's being tested (e.g., "null values", "multi-pick to boolean")
  • input_payload - OpenMRS input data (JSON string or structured columns)
  • expected_output - DHIS2 event payload expected (JSON string) - LLM generated from metadata
  • actual_output - DHIS2 event payload produced by workflow (JSON string) - populated by OpenFn workflow
  • test_status - Pass/Fail - computed by assertion script
  • differences - Details of mismatches if test fails - computed by assertion script
  • last_run - Timestamp of last test execution
  • notes - Additional context or observations
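A single populated row might look like the following (all values hypothetical, keyed by the column names above):

```javascript
// Hypothetical test case row, keyed by the column names above.
const sampleRow = {
  test_id: 'TC-001',
  form_name: 'Outpatient Consultation', // invented form name
  test_scenario: 'multi-pick to boolean',
  input_payload: '{"obs":[{"concept":"symptoms","value":["fever"]}]}', // JSON string
  expected_output: '', // filled in by the LLM from the metadata file
  actual_output: '',   // populated by the OpenFn test workflow
  test_status: '',     // Pass/Fail, computed by the assertion script
  differences: '',     // computed by the assertion script
  last_run: '',
  notes: '',
};
```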

Sheet Features:

  • Set up Google Sheet with designed column structure
  • Add data validation and formatting where appropriate
  • Include sample test cases for initial POC
  • Create JavaScript assertion script (Apps Script) that:
    • Compares expected_output vs actual_output for each row
    • Normalizes JSON before comparison (handle whitespace, property order)
    • Populates test_status (Pass/Fail)
    • Populates differences with detailed mismatch information
    • Can be triggered manually or automatically after workflow execution
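The normalization and deep-compare steps above can be sketched in plain JavaScript (usable in Apps Script or Node). This is a minimal sketch, not a final implementation — the path notation and message format are placeholders:

```javascript
// Normalize a parsed JSON value so property order doesn't affect comparison:
// recursively sort object keys, then compare canonical forms.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === 'object') {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key]);
    }
    return sorted;
  }
  return value;
}

// Deep-compare two parsed payloads; returns a list of human-readable differences.
function deepDiff(expected, actual, path = '$') {
  const e = canonicalize(expected);
  const a = canonicalize(actual);
  const es = JSON.stringify(e);
  const as = JSON.stringify(a);
  if (es === as) return [];
  if (typeof e !== 'object' || e === null || typeof a !== 'object' || a === null) {
    return [`${path}: expected ${es} but got ${as}`];
  }
  const keys = new Set([...Object.keys(e), ...Object.keys(a)]);
  const diffs = [];
  for (const key of keys) {
    diffs.push(...deepDiff(e[key], a[key], `${path}.${key}`));
  }
  return diffs;
}
```

Because JSON strings are parsed and canonicalized before comparison, whitespace and property-order differences never produce false failures.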

2. OpenFn Test Execution Workflow

Build an OpenFn workflow that automates test execution:

Workflow Requirements:

Step 1: Read Test Cases from Google Sheet

  • Use @openfn/language-googlesheets adaptor
  • Read all rows from the test cases sheet where actual_output is empty or needs re-testing
  • Parse input_payload column into proper JSON format
  • Handle pagination if there are many test cases
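Independent of the adaptor call itself, the row-filtering and parsing logic for this step might look like the following (column layout is assumed here and must be adjusted to the final sheet):

```javascript
// Turn raw sheet rows (first row = header) into runnable test cases.
// Keeps only rows whose actual_output cell is empty, i.e. not yet executed.
function selectTestCases(rows) {
  const [header, ...data] = rows;
  const col = (name) => header.indexOf(name);
  return data
    .filter((row) => !row[col('actual_output')])
    .map((row) => ({
      testId: row[col('test_id')],
      input: JSON.parse(row[col('input_payload')]), // parse JSON string into an object
    }));
}
```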

Step 2: Execute MSF Workflow for Each Test Case

  • For each test case, invoke the MSF production workflow with the input_payload
  • This may require:
    • Setting up the workflow to accept test inputs programmatically
    • Ensuring test execution doesn't affect production data (use test DHIS2 instance or mock endpoints)
    • Capturing the transformed payload before it's sent to DHIS2 (the events mapping JSON)
  • Handle errors gracefully (workflow failures should be logged, not crash entire test run)

Step 3: Capture Actual Output

  • Extract the DHIS2 event payload that would be sent (before actual DHIS2 API call)
  • This is the critical output needed for comparison
  • Convert output to JSON string format for storage

Step 4: Write Results Back to Google Sheet

  • Update the same row in Google Sheet with:
    • actual_output - the captured DHIS2 event payload
    • last_run - current timestamp
  • Ensure atomic updates (don't lose data if workflow fails mid-execution)
  • Handle Google Sheets API rate limits appropriately
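One common way to respect Sheets API rate limits is exponential backoff on retryable failures. A sketch (the delay parameters are arbitrary starting points, not tuned values):

```javascript
// Exponential backoff with a cap: the delay doubles each attempt (attempt starts at 0).
function backoffDelayMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Retry an async operation on failure, sleeping between attempts.
async function withRetry(operation, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt === maxAttempts - 1) throw err; // out of retries, surface the error
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

Batching row updates (one write per run rather than per test case) also reduces pressure on the quota.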

Workflow Configuration:

  • Workflow should accept parameters:
    • sheet_id - Google Sheet ID
    • sheet_name - Tab name (default: "Test Cases")
    • test_mode - Boolean to ensure safe execution
    • test_ids - Optional array to run specific tests only
  • Include comprehensive error handling and logging
  • Add retry logic for transient failures
  • Log execution summary (total tests run, successes, failures)
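Applying defaults to the workflow parameters could be as simple as the following (parameter names taken from the list above; defaults are suggestions):

```javascript
// Merge caller-supplied parameters with defaults for the test workflow.
function resolveParams(params = {}) {
  return {
    sheet_id: params.sheet_id,                      // required: Google Sheet ID
    sheet_name: params.sheet_name ?? 'Test Cases',  // default tab name
    test_mode: params.test_mode ?? true,            // default to safe execution
    test_ids: params.test_ids ?? null,              // null = run all tests
  };
}
```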

Workflow Considerations:

  • Should the workflow run all tests sequentially or support parallel execution?
  • How to handle workflows that depend on state or previous data?
  • Should test execution be isolated (separate state/credentials)?
  • How to prevent test data from reaching production DHIS2?

3. Integration & Testing

  • Verify workflow correctly reads from Google Sheet
  • Confirm workflow successfully executes MSF transformation logic
  • Validate actual outputs are written back to correct rows
  • Test assertion script correctly identifies matches and mismatches
  • Document any edge cases or limitations discovered

Implementation Tasks

Phase 1: Setup & POC

  • Select representative form for POC (standard + custom mappings)
  • Prepare metadata file and context for LLM
  • Design Google Sheet structure with Justin
  • Create Google Sheet template with assertion script
  • Use Claude to generate test inputs covering various scenarios
  • Use Claude to generate expected outputs from metadata file
  • Test the process, assess validity and time savings, and decide whether to proceed with Phase 2
  • Share outcomes with the MSF team for their input (and potential budget sign-off)

Phase 2: Workflow Development

  • Build OpenFn test workflow per requirements above
  • Set up test environment/credentials (non-production DHIS2 if needed)
  • Implement Google Sheets read/write operations
  • Implement MSF workflow invocation and output capture
  • Add error handling and logging
  • Test workflow with POC form

Phase 3: Validation

  • Run workflow with POC test cases
  • Execute assertion script to compare outputs
  • Document what contextual information LLM needs
  • Evaluate LLM performance on complex conditional logic
  • Assess time savings vs. manual testing

Technical Specifications

OpenFn Workflow Structure

job-1: Read test cases from Google Sheet
  ├─ Get all rows from sheet
  ├─ Filter for tests to run
  └─ Parse input payloads

job-2: Execute MSF workflow (for each test case)
  ├─ Invoke MSF transformation logic
  ├─ Capture DHIS2 event payload (actual output)
  └─ Store in state

job-3: Write results to Google Sheet
  ├─ Update rows with actual outputs
  ├─ Add timestamps
  └─ Log execution summary

Google Sheet Assertion Script (Apps Script Sketch)

function runAssertions() {
  // Sketch only: column indexes are illustrative and must match the final sheet layout.
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Test Cases');
  const rows = sheet.getDataRange().getValues();
  for (let i = 1; i < rows.length; i++) {  // skip the header row
    const expectedRaw = rows[i][4];        // expected_output column
    const actualRaw = rows[i][5];          // actual_output column
    if (!expectedRaw || !actualRaw) continue;
    try {
      // deepDiff: deep-compare helper (normalizes key order, returns a list of mismatches)
      const diffs = deepDiff(JSON.parse(expectedRaw), JSON.parse(actualRaw));
      sheet.getRange(i + 1, 7).setValue(diffs.length ? 'Fail' : 'Pass'); // test_status
      sheet.getRange(i + 1, 8).setValue(diffs.join('; '));               // differences
    } catch (e) {
      sheet.getRange(i + 1, 7).setValue('Fail');
      sheet.getRange(i + 1, 8).setValue('Invalid JSON: ' + e.message);
    }
  }
}

Acceptance Criteria

  • Google Sheet properly structured and assertion script functional
  • OpenFn workflow successfully reads inputs, executes MSF logic, and writes outputs
  • LLM successfully generates comprehensive input test data covering edge cases
  • LLM accurately produces expected outputs based solely on metadata file
  • Assertion script correctly identifies mismatches with detailed diff information
  • Process significantly reduces manual testing time compared to current approach
  • Approach is repeatable for additional forms

Notes

  • Assertion logic should be version controlled (export Apps Script to repo)
