Background
The MSF project involves mapping a high volume of forms and data elements from OpenMRS to DHIS2 using a dynamically read metadata file. Testing is currently time-consuming, particularly for custom mappings that require transformations, value splitting, and conditional logic (e.g., mapping multi-pick lists to booleans). With 50-60+ new forms being added continuously, we need to automate testing to improve efficiency and reduce errors.
Problem Statement
Testing the integration between OpenMRS and DHIS2 is challenging because:
- Standard mappings are mostly one-to-one but still need verification
- Custom mappings require complex transformations and conditional logic that are error-prone
- Manual generation of test data and expected outputs is tedious
- Adequate coverage requires testing many scenarios, including null values and all option combinations
- Custom logic specifications are sometimes written in difficult-to-parse formats
Proposed Solution
Use an LLM (Claude) to automate test generation and validation through the following approach:
Phase 1: Proof of Concept (Single Form)
- Input Generation: LLM generates various input data scenarios to ensure sufficient coverage, including edge cases, null values, and all option combinations
- Expected Output Generation: LLM uses only the metadata file to generate expected DHIS2 event payloads (not the project's existing YAML code, to avoid biasing the tests); see the example payload below
- Test Execution: OpenFn workflow reads inputs from Google Sheet, executes MSF workflow, captures actual output, and writes back to the same sheet
- Assertion: JavaScript script in Google Sheet compares actual vs. expected outputs
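For concreteness, the expected output for a single test case would be a DHIS2 events payload along the lines of the sketch below. The UIDs are hypothetical placeholders, not values from the MSF metadata; note also that the older DHIS2 events API names the date field eventDate, while the newer tracker API uses occurredAt.

```json
{
  "events": [
    {
      "program": "PROGRAM_UID",
      "orgUnit": "ORGUNIT_UID",
      "occurredAt": "2024-01-15",
      "status": "COMPLETED",
      "dataValues": [
        { "dataElement": "DE_UID_1", "value": "true" },
        { "dataElement": "DE_UID_2", "value": "42" }
      ]
    }
  ]
}
```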
Architecture Overview
```
Google Sheet (Test Cases)
    ↓ (read test inputs)
OpenFn Test Workflow
    ↓ (execute)
MSF Production Workflow
    ↓ (capture actual output)
OpenFn Test Workflow
    ↓ (write actual output)
Google Sheet (Test Cases + Actual Outputs)
    ↓ (run assertion script)
Test Results (Pass/Fail + Differences)
```
Key Design Decisions
- Expected output = the events mapping JSON body sent to DHIS2 (not the DHIS2 response after event creation)
- LLM should work from metadata file only to generate expected outputs
- Final assertion should be deterministic JavaScript code in Google Sheet
- Google Sheet serves as both input source and results repository
- Start with manual POC before building automation workflow
Deliverables
1. Google Sheet Template
Create a Google Sheet with the following structure (@jfeldman-openfn I think you already had the columns set, does this align with what you had?):
Required Columns:
- `test_id` - Unique identifier for each test case
- `form_name` - Name of the form being tested
- `test_scenario` - Description of what's being tested (e.g., "null values", "multi-pick to boolean")
- `input_payload` - OpenMRS input data (JSON string or structured columns)
- `expected_output` - Expected DHIS2 event payload (JSON string) - LLM-generated from metadata
- `actual_output` - DHIS2 event payload produced by the workflow (JSON string) - populated by the OpenFn workflow
- `test_status` - Pass/Fail - computed by assertion script
- `differences` - Details of mismatches if the test fails - computed by assertion script
- `last_run` - Timestamp of last test execution
- `notes` - Additional context or observations
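A hypothetical example row, with illustrative values only, to make the column semantics concrete (the payload strings are elided):

```json
{
  "test_id": "TC-001",
  "form_name": "Outpatient Consultation",
  "test_scenario": "multi-pick to boolean",
  "input_payload": "{ \"obs\": [ ... ] }",
  "expected_output": "{ \"events\": [ ... ] }",
  "actual_output": "",
  "test_status": "",
  "differences": "",
  "last_run": "",
  "notes": "inputs generated by Claude from metadata file"
}
```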
Sheet Features:
- Set up Google Sheet with designed column structure
- Add data validation and formatting where appropriate
- Include sample test cases for initial POC
- Create a JavaScript assertion script (Apps Script) that:
  - Compares `expected_output` vs `actual_output` for each row
  - Normalizes JSON before comparison (handles whitespace and property order)
  - Populates `test_status` (Pass/Fail)
  - Populates `differences` with detailed mismatch information
  - Can be triggered manually or automatically after workflow execution
2. OpenFn Test Execution Workflow
Build an OpenFn workflow that automates test execution:
Workflow Requirements:
Step 1: Read Test Cases from Google Sheet
- Use the `@openfn/language-googlesheets` adaptor (see the sketch after this list)
- Read all rows from the test cases sheet where `actual_output` is empty or needs re-testing
- Parse the `input_payload` column into proper JSON format
- Handle pagination if there are many test cases
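A minimal sketch of this step, assuming the adaptor's getValues operation, the column layout from the sheet template above, and a sheet_id provided on state (the names here are assumptions, not confirmed project code):

```javascript
// Read the whole test-cases tab (columns A-J per the template above),
// then keep only rows that still need a run.
getValues($.sheet_id, 'Test Cases!A:J');

fn((state) => {
  const [header, ...rows] = state.data.values;
  const tests = rows
    .map((cells, i) => {
      const row = Object.fromEntries(header.map((h, j) => [h, cells[j]]));
      return { ...row, rowIndex: i + 2 }; // +2 = 1-based rows plus the header row
    })
    .filter((t) => !t.actual_output) // empty actual_output => not yet tested
    .map((t) => ({ ...t, input_payload: JSON.parse(t.input_payload) }));
  console.log(`Loaded ${tests.length} pending test case(s)`);
  return { ...state, tests };
});
```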
Step 2: Execute MSF Workflow for Each Test Case
- For each test case, invoke the MSF production workflow with the `input_payload`
- This may require:
  - Setting up the workflow to accept test inputs programmatically
  - Ensuring test execution doesn't affect production data (use a test DHIS2 instance or mock endpoints)
  - Capturing the transformed payload before it's sent to DHIS2 (the events mapping JSON)
- Handle errors gracefully (workflow failures should be logged, not crash the entire test run)
Step 3: Capture Actual Output
- Extract the DHIS2 event payload that would be sent, before the actual DHIS2 API call (one approach is sketched after this list)
- This is the critical output needed for comparison
- Convert output to JSON string format for storage
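One way to do this, sketched below, is to guard the MSF workflow's final DHIS2 step on a test flag so the mapped events body is recorded instead of posted. The state keys (test_mode, eventsPayload) are assumed names for illustration, not the MSF workflow's actual variables:

```javascript
// Hypothetical final step of the workflow under test: in test mode, capture
// the mapped events body as a JSON string and skip the real DHIS2 call.
fn((state) => {
  if (state.test_mode) {
    return { ...state, actual_output: JSON.stringify(state.eventsPayload) };
  }
  return state; // normal runs fall through to the real DHIS2 post that follows
});
```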
Step 4: Write Results Back to Google Sheet
- Update the same row in the Google Sheet (see the sketch after this list) with:
  - `actual_output` - the captured DHIS2 event payload
  - `last_run` - current timestamp
- Ensure atomic updates (don't lose data if the workflow fails mid-execution)
- Handle Google Sheets API rate limits appropriately
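A sketch of the write-back, assuming batchUpdateValues from the googlesheets adaptor and the template's column positions (actual_output in column F, last_run in column I). The Sheets API skips null cells in a values array, which leaves test_status (G) and differences (H) untouched for the assertion script:

```javascript
// For each executed test, write actual_output and last_run back to its row.
each(
  '$.tests[*]',
  batchUpdateValues({
    spreadsheetId: (state) => state.sheet_id,
    range: (state) => `Test Cases!F${state.data.rowIndex}:I${state.data.rowIndex}`,
    values: (state) => [
      [state.data.actual_output, null, null, new Date().toISOString()],
    ],
  })
);
```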
Workflow Configuration:
- Workflow should accept parameters (an example input state is shown below):
  - `sheet_id` - Google Sheet ID
  - `sheet_name` - Tab name (default: "Test Cases")
  - `test_mode` - Boolean to ensure safe execution
  - `test_ids` - Optional array to run specific tests only
- Include comprehensive error handling and logging
- Add retry logic for transient failures
- Log execution summary (total tests run, successes, failures)
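For example, a run triggered with the following input state (illustrative values) would safely re-run just two specific tests:

```json
{
  "sheet_id": "1AbC...xyz",
  "sheet_name": "Test Cases",
  "test_mode": true,
  "test_ids": ["TC-001", "TC-007"]
}
```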
Workflow Considerations:
- Should the workflow run all tests sequentially or support parallel execution?
- How to handle workflows that depend on state or previous data?
- Should test execution be isolated (separate state/credentials)?
- How to prevent test data from reaching production DHIS2?
3. Integration & Testing
- Verify workflow correctly reads from Google Sheet
- Confirm workflow successfully executes MSF transformation logic
- Validate actual outputs are written back to correct rows
- Test assertion script correctly identifies matches and mismatches
- Document any edge cases or limitations discovered
Implementation Tasks
Phase 1: Setup & POC
- Select representative form for POC (standard + custom mappings)
- Prepare metadata file and context for LLM
- Design Google Sheet structure with Justin
- Create Google Sheet template with assertion script
- Use Claude to generate test inputs covering various scenarios
- Use Claude to generate expected outputs from metadata file
- Test the process, determine its validity and time savings, and decide whether to proceed with Phase 2
- Share outcomes with the MSF team for their input (and potential budget sign-off)
Phase 2: Workflow Development
- Build OpenFn test workflow per requirements above
- Set up test environment/credentials (non-production DHIS2 if needed)
- Implement Google Sheets read/write operations
- Implement MSF workflow invocation and output capture
- Add error handling and logging
- Test workflow with POC form
Phase 3: Validation
- Run workflow with POC test cases
- Execute assertion script to compare outputs
- Document what contextual information the LLM needs
- Evaluate LLM performance on complex conditional logic
- Assess time savings vs manual testing
Technical Specifications
OpenFn Workflow Structure
```
job-1: Read test cases from Google Sheet
  ├─ Get all rows from sheet
  ├─ Filter for tests to run
  └─ Parse input payloads
job-2: Execute MSF workflow (for each test case)
  ├─ Invoke MSF transformation logic
  ├─ Capture DHIS2 event payload (actual output)
  └─ Store in state
job-3: Write results to Google Sheet
  ├─ Update rows with actual outputs
  ├─ Add timestamps
  └─ Log execution summary
```
Google Sheet Assertion Script (Apps Script)
A minimal runnable version of the assertion logic; column names follow the sheet template above:
```javascript
function runAssertions() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName('Test Cases');
  var data = sheet.getDataRange().getValues();
  var col = function (name) { return data[0].indexOf(name); };
  // Recursively sort object keys so whitespace and property order never cause false failures
  var sortKeys = function (k, v) {
    return v && typeof v === 'object' && !Array.isArray(v)
      ? Object.keys(v).sort().reduce(function (o, key) { o[key] = v[key]; return o; }, {})
      : v;
  };
  for (var r = 1; r < data.length; r++) {
    var expected = data[r][col('expected_output')];
    var actual = data[r][col('actual_output')];
    if (!expected || !actual) continue; // only compare rows with both outputs
    var exp = JSON.stringify(JSON.parse(expected), sortKeys);
    var act = JSON.stringify(JSON.parse(actual), sortKeys);
    sheet.getRange(r + 1, col('test_status') + 1).setValue(exp === act ? 'Pass' : 'Fail');
    sheet.getRange(r + 1, col('differences') + 1).setValue(exp === act ? '' : 'expected: ' + exp + ' | actual: ' + act);
  }
}
```
Acceptance Criteria
- Google Sheet properly structured and assertion script functional
- OpenFn workflow successfully reads inputs, executes MSF logic, and writes outputs
- LLM successfully generates comprehensive input test data covering edge cases
- LLM accurately produces expected outputs based solely on metadata file
- Assertion script correctly identifies mismatches with detailed diff information
- Process significantly reduces manual testing time compared to current approach
- Approach is repeatable for additional forms
Notes
- Assertion logic should be version controlled (export Apps Script to repo)