GoXStream is an open-source, Flink-inspired stream processing engine written in Go, with a React dashboard for visual pipeline design, job submission, and job history. Build and run real-time data pipelines with file, database, or Kafka sources and sinks.
- Modular pipeline engine (in Go): compose pipelines from map, filter, reduce, window, time-window, and more
- Dynamic REST API: submit pipelines and configure sources, sinks, and operators via JSON
- Pluggable sources/sinks: file, Postgres, Kafka (more coming)
- Windowing: tumbling, sliding, time-based, with watermark and late-event support
- Stateful operators and checkpointing (coming soon!)
- React dashboard: visual DAG pipeline builder (drag/drop), job submission, job history, JSON preview
- Persistent job history (localStorage; backend coming soon)
 
- Go 1.19 or higher

```sh
git clone https://github.com/YOUR_GITHUB_USERNAME/goxstream.git
cd goxstream
go mod tidy
```

Place an example input.csv in the project root:
```csv
id,name,city
1,Alice,London
2,Bob,Berlin
3,Charlie,Paris
4,David,Berlin
5,Eve,Paris
6,Frank,Paris
7,Grace,Berlin
```

Run the engine:

```sh
go run ./cmd/goxstream/main.go
```

In another terminal, start the dashboard:

```sh
cd goxstream-dashboard
npm install
npm start
```

Simple Map:
```json
{
  "source": { "type": "file", "path": "input.csv" },
  "operators": [
    { "type": "map", "params": { "col": "processed", "val": "yes" } }
  ],
  "sink": { "type": "file", "path": "output.csv" }
}
```

Windowed Reduce:
```json
{
  "source": { "type": "file", "path": "input.csv" },
  "operators": [
    {
      "type": "time_window",
      "params": {
        "duration": "10s",
        "inner": {
          "type": "reduce",
          "params": { "key": "city", "agg": "count" }
        }
      }
    }
  ],
  "sink": { "type": "file", "path": "output.csv" }
}
```

With Watermark/Late Event Support:
```json
{
  "source": { "type": "file", "path": "input.csv" },
  "operators": [
    {
      "type": "time_window_watermark",
      "params": {
        "duration": "10s",
        "allowed_lateness": "5s",
        "inner": {
          "type": "reduce",
          "params": { "key": "city", "agg": "count" }
        }
      }
    }
  ],
  "sink": { "type": "file", "path": "output.csv" }
}
```
- GoXStream's UI lets you visually build pipelines (drag/drop), edit operator parameters, and export the result as JSON to run jobs.
- Submitted jobs and their configs are saved in a persistent job history.
```sh
curl -X POST http://localhost:8080/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "file", "path": "input.csv" },
    "operators": [{ "type": "map", "params": { "col": "processed", "val": "yes" } }],
    "sink": { "type": "file", "path": "output.csv" }
  }'
```
Use curl or Postman to submit a windowed pipeline. Example input.csv for both the time_window and time_window_watermark jobs below:
```csv
id,name,city,score,timestamp
1,Alice,London,10,2024-07-04T15:00:00Z
2,Bob,Berlin,15,2024-07-04T15:00:05Z
3,Charlie,Paris,12,2024-07-04T15:00:08Z
4,David,Berlin,8,2024-07-04T15:00:12Z
5,Eve,Paris,14,2024-07-04T15:00:16Z
6,Frank,London,11,2024-07-04T15:00:22Z
7,Grace,Berlin,9,2024-07-04T15:00:29Z
8,Harry,Paris,13,2024-07-04T15:00:35Z
9,Ivy,London,17,2024-07-04T15:00:41Z
10,Jack,Berlin,16,2024-07-04T15:00:43Z
```

a. Regular time window with reduce:
```sh
curl -X POST http://localhost:8080/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "file", "path": "input.csv" },
    "operators": [
      {
        "type": "time_window",
        "params": {
          "duration": "10s",
          "inner": {
            "type": "reduce",
            "params": { "key": "city", "agg": "count" }
          }
        }
      }
    ],
    "sink": { "type": "file", "path": "output.csv" }
  }'
```

b. Time window with watermark/late event support:
```sh
curl -X POST http://localhost:8080/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "source": { "type": "file", "path": "input.csv" },
    "operators": [
      {
        "type": "time_window_watermark",
        "params": {
          "duration": "10s",
          "allowed_lateness": "5s",
          "inner": {
            "type": "reduce",
            "params": { "key": "city", "agg": "count" }
          }
        }
      }
    ],
    "sink": { "type": "file", "path": "output.csv" }
  }'
```

Your results will appear in output.csv with a window_id column:
```csv
city,count,window_end,window_id
London,1,2024-07-04T15:00:10Z,1
Berlin,1,2024-07-04T15:00:10Z,1
Paris,1,2024-07-04T15:00:10Z,1
...
```

```
[Source] --> [Map] --> [Filter] --> [Window/Reduce] --> [Sink]
   |            |         |           |                   |
 [File]   [Add/Transform] [Select] [Tumbling/Sliding]  [File]
```
Operators: Implemented as Go interfaces, dynamically composed at runtime.
REST API: Accepts JSON job specs, launches pipeline as background Go routines.
A pipeline is defined by a simple JSON:
```json
{
  "source": {"type": "file", "path": "input.csv"},
  "operators": [
    {"type": "map", "params": {"col": "processed", "val": "yes"}},
    {"type": "filter", "params": {"field": "city", "eq": "Berlin"}},
    {
      "type": "sliding_window",
      "params": {
        "size": 3,
        "step": 1,
        "inner": {
          "type": "reduce",
          "params": {"key": "city", "agg": "count"}
        }
      }
    }
  ],
  "sink": {"type": "file", "path": "output.csv"}
}
```

| Type             | Description              | Example Params                        |
| ---------------- | ------------------------ | ------------------------------------- |
| map              | Add or transform columns | `col`, `val`                          |
| filter           | Filter rows by condition | `field`, `eq`                         |
| reduce           | Aggregate/group by field | `key`, `agg` (`count`, future: `sum`) |
| tumbling\_window | Non-overlapping windows  | `size`, `inner`                       |
| sliding\_window  | Overlapping windows      | `size`, `step`, `inner`               |

Add New Operators: Implement the Operator interface and add a factory to the operator registry.
Support New Sources/Sinks: Implement a Source or Sink interface in internal/source or internal/sink.
React UI Integration: Planned for interactive pipeline creation and monitoring.
- Dynamic operator/source/sink registry
- File/DB/Kafka connectors
- REST API & React UI for pipeline design and monitoring
- Visual DAG editor with drag/drop (reactflow)
- Windowing and watermark support
- Persistent job history (localStorage)
- Checkpointing and fault tolerance (coming next)
- Backend job monitoring/status APIs
- Multi-job/cluster execution
- More analytics and ML operators
PRs, issues, and ideas are welcome! Fork and submit improvements or new features. Let's build a great Go stream engine together!
