From 91cfaf39464e9d393fed24d109f65fbf54228d31 Mon Sep 17 00:00:00 2001 From: Bhavini20-CODER Date: Fri, 22 May 2026 13:50:46 +0530 Subject: [PATCH 1/4] docs: add core workflow mapping and test plan for issue #19 --- TEST_PLAN.md | 163 +++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 163 insertions(+) create mode 100644 TEST_PLAN.md diff --git a/TEST_PLAN.md b/TEST_PLAN.md new file mode 100644 index 0000000..8700100 --- /dev/null +++ b/TEST_PLAN.md @@ -0,0 +1,163 @@ +# ObservaKit Test Plan + +This is my test plan for the ObservaKit data observability project assignment. I have analyzed the codebase, run the tests, and documented how the core features work. + +--- + +## 1. What is included and what is not + +### What is working / in-scope: +* **Core Monitoring Features**: Checking data freshness, tracking row counts (volume checks) with backfill detection, running quality checks (Soda Core & Custom SQL queries), detecting schema drift, validating data contracts, and recording pipeline runs. +* **Alerting**: Managing alert channels (Slack, Teams, PagerDuty, Email, Discord, Webhooks), setting up mute windows (suppressions), and automatic alert deduplication (noise reduction). +* **API endpoints**: Standard endpoints to trigger and query results. +* **Local environment**: Runs on SQLite for local development and PostgreSQL for Docker. + +### What is not implemented / out-of-scope: +* **Great Expectations**: The code has a function for Great Expectations quality checks, but it is just a placeholder and returns a fake failure. It is not implemented. +* **Roadmap features**: Future items like the dashboard UI editor, incident timelines, and column lineage are not in the current codebase. + +--- + +## 2. The 7 core flows I checked + +### F1: Booting the project and checking API health +* **What I checked**: Setting up the docker containers, running migrations, and hit the health endpoints. +* **Files and folders involved**: + * `docker-compose.yml` + * `backend/main.py` (FastAPI app setup) + * `backend/models.py` (Database tables definition) + * `backend/scheduler.py` (Runs background checks) + * `alembic/` (Database migrations folder) +* **Endpoints**: + * `GET /healthz` (Check if DB connection is active) + * `GET /status` (Summary of table health in last 24 hours) +* **Inputs needed**: + * Env variables like `OBSERVAKIT_API_KEY` and DB credentials. +* **Expected result**: + * Running `docker compose up` starts everything and automatically runs database migrations. + * `/healthz` returns `{"status": "ok"}` when the database is connected. +* **Status**: `✅ appears to work` + +--- + +### F2: Running a data freshness check +* **What I checked**: How the system monitors if tables are receiving fresh data based on SLA rules. +* **Files and folders involved**: + * `backend/routers/freshness.py` + * `connectors/base.py` (Used to query target databases) + * `config/kit.yml` (Where we configure freshness SLAs) +* **Endpoints**: + * `POST /freshness/poll` (Runs the freshness checks) +* **Inputs needed**: + * Configuration in `config/kit.yml` indicating the table name, `timestamp_column`, and SLA thresholds (`warn_after`, `fail_after`). +* **Expected result**: + * The code queries the target database for the newest timestamp in the column. + * It calculates lag (`now - max_timestamp`) and records a `FreshnessRecord` in the DB. + * If the lag is too high, it sends an alert to the configured channel. +* **Status**: `✅ appears to work` + +--- + +### F3: Running data quality checks +* **What I checked**: Checking data quality rules using Soda Core, custom SQL queries, consistency comparisons, and row count anomalies. +* **Files and folders involved**: + * `backend/routers/checks.py` + * `checks/` (Where Soda Core YAML files are kept) + * `backend/models.py` (`CheckResult`, `VolumeRecord`, `BackfillEvent` tables) +* **Endpoints**: + * `POST /checks/run` (Runs quality checks) + * `POST /checks/volume` (Runs volume checks) +* **Inputs needed**: + * Soda Core YAML configurations, custom SQL queries in `kit.yml`, and historical row counts to detect volume anomalies. +* **Expected result**: + * **Soda Core**: The code starts a subprocess to run `soda scan`, reads the JSON output, and logs the results. + * **Custom SQL**: Executes SQL and checks if assertions pass (e.g. `result == 0`). + * **Volume checks**: Compares row count to 7-day average. If there's a big change, it checks if it's a "backfill" (historical data imports) to avoid false alert alarms. +* **Status**: + * **Soda Core Scan**: `⚠️ unclear` (Requires Soda CLI to be installed on the machine; if missing, the scan fails). + * **Great Expectations**: `❌ broken/missing` (Not implemented in code; returns a dummy failure). + * **Custom SQL, Consistency & Volume**: `✅ appears to work` + +--- + +### F4: Detecting schema drift +* **What I checked**: Finding out if someone modified columns in the warehouse tables. +* **Files and folders involved**: + * `backend/routers/schema_diff.py` + * `connectors/base.py` (To fetch table metadata) + * `backend/models.py` (`SchemaSnapshot` and `SchemaDiff` tables) +* **Endpoints**: + * `POST /schema/snapshot` (Runs comparison) +* **Inputs needed**: + * List of tables to watch in `config/kit.yml` under `schema_drift:`. +* **Expected result**: + * Saves the current list of columns and types as a `SchemaSnapshot`. + * Compares it to the previous snapshot. If columns are added, removed, or type is changed, it logs the change into `SchemaDiff` and alerts the user. +* **Status**: `✅ appears to work` + +--- + +### F5: Data contract validation +* **What I checked**: Validating tables against a YAML schema contract defined by developers. +* **Files and folders involved**: + * `backend/routers/contracts.py` + * `config/contracts/` (YAML files detailing the contracts) +* **Endpoints**: + * `POST /contracts/validate` (Validates contract files) +* **Inputs needed**: + * Contract YAML definitions stating column types, null limits, allowed values, min rows, and custom rules. +* **Expected result**: + * Queries database to verify if all columns exist and have the correct types. + * Performs counts on nulls, duplicate values, value ranges, and custom business rules. + * Logs the validation results. If any check fails, triggers a contract violation alert. +* **Status**: `✅ appears to work` + +--- + +### F6: Alert suppression and noise reduction +* **What I checked**: How the system avoids sending too many alerts during maintenance or when a check fails repeatedly. +* **Files and folders involved**: + * `backend/routers/suppressions.py` (Mute endpoints) + * `backend/routers/alert_noise.py` (Noise tracking endpoints) + * `alerts/base.py` (Dispatches alerts and checks deduplication) +* **Endpoints**: + * `POST /suppress/` (To mute a table's alerts) + * `GET /alerts/noise/summary` (See noisy alerts) +* **Inputs needed**: + * Suppression details (table name, mute time, reason) and historical alert counts. +* **Expected result**: + * Before sending an alert, the system checks if the table is muted or if a similar alert was sent recently (using an adaptive time window based on a calculated noise score). + * If yes, the alert is skipped. If sent, it increments the noise counters. +* **Status**: `✅ appears to work` + +--- + +### F7: Pipeline webhook integration +* **What I checked**: Receiving callback requests from Airflow or Prefect to log pipeline runs. +* **Files and folders involved**: + * `backend/routers/webhooks.py` + * `backend/models.py` (`PipelineRun` table) +* **Endpoints**: + * `POST /webhooks/airflow` + * `POST /webhooks/prefect` +* **Inputs needed**: + * JSON payloads from Airflow or Prefect with DAG details and states. +* **Expected result**: + * Parses the payload, normalizes status values, and inserts a `PipelineRun` record. +* **Status**: `⚠️ unclear` (I noticed the endpoints do not perform payload checks or authentication keys, so anyone could call these endpoints and insert dummy data). + +--- + +## 3. Major gaps and things that might break + +* **Great Expectations is missing**: If someone configures `engine: great_expectations` in the config, it will fail because the runner is not coded. +* **SQLite vs PostgreSQL locking**: In PostgreSQL, the scheduler locks tasks so multiple servers don't run them at the same time. On SQLite, it doesn't do anything, which might cause double runs during testing. +* **No alert retry logic**: If Slack or Teams is down, the alert fails. It records it in `AlertLog`, but it won't try to send it again later. +* **Soda Core CLI dependency**: The quality check depends on having the `soda` command line tool installed on the server. If it isn't, the tests fail. + +--- + +## 4. Things I couldn't test or verify + +* **Database performance**: I couldn't test how fast the schema diff fetches column data if the database has thousands of tables. +* **Third-party API rate limits**: I am not sure if Slack or Teams will block us if we send hundreds of alerts before the noise throttling turns on. From 64b7fa96958d442eac3aa7fe4543fbf6a2b3a9e7 Mon Sep 17 00:00:00 2001 From: bhavini-verma Date: Fri, 22 May 2026 18:20:27 +0530 Subject: [PATCH 2/4] Clarify status of test case in TEST_PLAN.md Updated the status of the test case to indicate uncertainty regarding payload checks and authentication. --- TEST_PLAN.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/TEST_PLAN.md b/TEST_PLAN.md index 8700100..4743c93 100644 --- a/TEST_PLAN.md +++ b/TEST_PLAN.md @@ -144,7 +144,7 @@ This is my test plan for the ObservaKit data observability project assignment. I * JSON payloads from Airflow or Prefect with DAG details and states. * **Expected result**: * Parses the payload, normalizes status values, and inserts a `PipelineRun` record. -* **Status**: `⚠️ unclear` (I noticed the endpoints do not perform payload checks or authentication keys, so anyone could call these endpoints and insert dummy data). +* **Status**: `⚠️ unclear` --- From d7f42599f2875c4f52064455e357e88bc62af226 Mon Sep 17 00:00:00 2001 From: bhavini-verma Date: Sat, 23 May 2026 16:54:08 +0530 Subject: [PATCH 3/4] Add future testing considerations to TEST_PLAN.md --- TEST_PLAN.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/TEST_PLAN.md b/TEST_PLAN.md index 4743c93..f5c77f3 100644 --- a/TEST_PLAN.md +++ b/TEST_PLAN.md @@ -161,3 +161,13 @@ This is my test plan for the ObservaKit data observability project assignment. I * **Database performance**: I couldn't test how fast the schema diff fetches column data if the database has thousands of tables. * **Third-party API rate limits**: I am not sure if Slack or Teams will block us if we send hundreds of alerts before the noise throttling turns on. + +--- + +## 5. Future Testing Considerations / Improvements + +* **Edge Case Coverage**: Expand test scenarios to cover failure modes and edge cases, such as: + * Database connection failures (bad DB connection). + * Malformed or invalid configurations (bad config). + * Schema type mismatches and wrong data types. + * Other potential system failure modes. From dec3eed9ed9f4642a61b84ceb39280a7f3c0b603 Mon Sep 17 00:00:00 2001 From: bhavini-verma Date: Sat, 23 May 2026 23:04:59 +0530 Subject: [PATCH 4/4] Enhance TEST_PLAN with integration testing matrix Added a detailed integration testing matrix for various tasks and their expected outcomes. --- TEST_PLAN.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/TEST_PLAN.md b/TEST_PLAN.md index f5c77f3..acc8f00 100644 --- a/TEST_PLAN.md +++ b/TEST_PLAN.md @@ -171,3 +171,17 @@ This is my test plan for the ObservaKit data observability project assignment. I * Malformed or invalid configurations (bad config). * Schema type mismatches and wrong data types. * Other potential system failure modes. + +### Detailed Integration Testing Matrix + +| Step | Function | Detailed Description & Expected Outcome | File (click) | Line range | +|---|---|---|---|---| +| **Task 1 – Ingestion & Freshness** | `poll_freshness` | Executes freshness checks for every table defined in `kit.yml`. For each table it creates a `FreshnessRecord` with `status: ok` when the data is fresh. **No alert is generated** in this happy‑path scenario. | [freshness.py](file:///c:/Users/bhavi/ObservaKit/backend/routers/freshness.py) | 76‑84 | +| **Task 2 – Schema Drift (Initial Snapshot)** | `take_snapshot` | Captures the current schema of each configured table and persists a `SchemaSnapshot`. Since this is the first run, **no `SchemaDiff` records** are produced. | [schema_diff.py](file:///c:/Users/bhavi/ObservaKit/backend/routers/schema_diff.py) | 21‑27 | +| **Task 3 – Schema Change & Drift Alert** | `take_snapshot` (re‑run) + `_trigger_schema_alert` | After a deliberate schema modification (e.g., adding a column), a second `take_snapshot` detects the differences, creates one or more `SchemaDiff` entries and **dispatches a drift alert** via `_trigger_schema_alert`. | [schema_diff.py](file:///c:/Users/bhavi/ObservaKit/backend/routers/schema_diff.py) & [base.py](file:///c:/Users/bhavi/ObservaKit/alerts/base.py) | 21‑27, 326‑336 | +| **Task 4 – Data Contract Validation** | `validate_contracts` | Loads contract YAML files, validates column presence, data types, nullability, uniqueness, allowed values and any custom SQL rules. Persists a `ContractValidationResult`; if any rule fails, a **contract‑failure alert** is emitted. | [contracts.py](file:///c:/Users/bhavi/ObservaKit/backend/routers/contracts.py) | 83‑90 | +| **Task 5 – Alert Suppression & Noise Throttling** | `dispatch_alert` (uses `is_alert_suppressed` & `is_alert_deduped`) | Before routing an alert, checks for a manual suppression window (`is_alert_suppressed`) and applies adaptive deduplication (`is_alert_deduped`) based on the current noise score. If allowed, the alert is sent and the noise record is refreshed. | [base.py](file:///c:/Users/bhavi/ObservaKit/alerts/base.py) | 328‑340 | +| **Supporting Helpers** | `is_alert_suppressed` | Queries the `CheckSuppression` table to determine whether alerts for a given table are currently muted. | [base.py](file:///c:/Users/bhavi/ObservaKit/alerts/base.py) | 226‑236 | +| | `is_alert_deduped` | Performs adaptive deduplication using the noise‑aware window, preventing duplicate alerts within the calculated timeframe. | [base.py](file:///c:/Users/bhavi/ObservaKit/alerts/base.py) | 254‑266 | +| | `_refresh_noise_record` | After a successful alert dispatch, recomputes the noise score and updates `AlertNoiseRecord` for future deduplication calculations. | [base.py](file:///c:/Users/bhavi/ObservaKit/alerts/base.py) | 132‑213 | +