SciDER: Scientific Data-centric End-to-end Researcher

Installation

You can install the project using pip:

# from git
pip install git+https://github.com/leonardodalinky/SciDER
# locally
pip install -e .

Example Usage:

from scider.default.models import register_gemini_medium_high_models
from scider.workflows import run_full_workflow

# 1. Register the models you want to use
register_gemini_medium_high_models()
# 2. Run the full workflow
wf = run_full_workflow(
    data_path="/path/to/data/",
    workspace_path="/path/to/workspace/",
    user_query="Discover insights about RAG",
)
# 3. Print the final summary once the workflow has finished
print(wf.final_summary)

Workflows

SciDER provides six workflows in scider.workflows:

| Workflow | Description |
| --- | --- |
| `IdeationWorkflow` | Generate research ideas from literature search. |
| `DataWorkflow` | Analyze a dataset and produce a structured summary. |
| `HypoDataWorkflow` | Generate synthetic data from a feature description, then analyze it. |
| `ExperimentWorkflow` | Implement and run an experiment given a data summary. |
| `FullWorkflow` | Data analysis -> experiment execution. |
| `FullWorkflowWithIdeation` | Ideation -> (optional) data analysis -> (optional) experiment. Each phase can be skipped via flags. |

Each workflow has a class form (FooWorkflow) and a convenience function (run_foo_workflow).
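
As a rough illustration of the naming pattern, the sketch below derives a `run_*` name from each class name. Only `run_full_workflow` is confirmed by the example above; the other derived names are assumptions based on the stated pattern, and `convenience_function_name` is a hypothetical helper, not part of SciDER.

```python
import re

# Workflow class names as listed in the table above.
WORKFLOW_CLASSES = [
    "IdeationWorkflow",
    "DataWorkflow",
    "HypoDataWorkflow",
    "ExperimentWorkflow",
    "FullWorkflow",
    "FullWorkflowWithIdeation",
]


def convenience_function_name(class_name: str) -> str:
    """Hypothetical helper: CamelCase class name -> run_snake_case function name."""
    # Insert "_" before every capital letter except the first, then lowercase.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()
    return f"run_{snake}"


if __name__ == "__main__":
    for name in WORKFLOW_CLASSES:
        # Derived names are assumptions; check scider.workflows for the real ones.
        print(f"{name} -> {convenience_function_name(name)}")
```

Check the derived names against `scider.workflows` before relying on them.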

Configuration

SciDER is configured through environment variables. Set them in a .env file at the project root; a template, .env.template, is provided for reference.

Alternatively, export the variables directly in your shell session.
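
To make the two configuration routes concrete, here is a minimal sketch of how a .env file and shell variables might combine. It is not SciDER's actual loader. It assumes the common convention that variables already set in the shell take precedence over the .env file, and `parse_env_text` and `apply_env` are hypothetical names.

```python
import os
from typing import Dict, MutableMapping, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def apply_env(
    values: Dict[str, str],
    environ: Optional[MutableMapping[str, str]] = None,
) -> MutableMapping[str, str]:
    """Copy values into the environment without overriding existing keys.

    Assumption: shell-set variables win over the .env file.
    """
    target = os.environ if environ is None else environ
    for key, value in values.items():
        target.setdefault(key, value)
    return target


if __name__ == "__main__":
    sample = "# backend choice\nCODING_AGENT_VERSION=native\n"
    print(apply_env(parse_env_text(sample), environ={}))
```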

Web UI

The web UI is a Streamlit application. Deploy it using the Dockerfile at the project root.

Create a .env file at the project root (copy from .env.template) and fill in your API keys.

Build the image:

docker build -t scider:latest .

Run the container:

docker run -d \
  --name scider \
  -p 7860:7860 \
  --env-file .env \
  scider:latest

Access the UI at http://localhost:7860.

UI Example (screenshots): selecting a workflow type and getting started; case study selection and the full workflow.

Coding Backend

The experiment agent delegates code implementation to a coding subagent. Three backends are available, selectable via the CODING_AGENT_VERSION environment variable:

| Backend | Value | Description |
| --- | --- | --- |
| Claude Agent SDK (default) | `claude_sdk` | Delegates to Claude Agent SDK. Requires `pip install claude-agent-sdk` and `ANTHROPIC_API_KEY`. |
| Native | `native` | SciDER's built-in coding agent. Uses the `experiment_coding` model role with any LiteLLM-supported provider. No external dependencies. Pick this if you want a non-Claude provider (Gemini, GPT, etc.). |
| OpenHands | `openhands` | Delegates to OpenHands sandbox. Requires `SCIDER_ENABLE_OPENHANDS=1`. |

Set CODING_AGENT_VERSION in .env to switch backends.
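
The selection rules in the table can be sketched as a small pre-flight check. This is illustrative only: `resolve_coding_backend` is a hypothetical helper, and how SciDER itself reacts to an unknown value or a missing requirement is not documented here, so the exceptions below are assumptions.

```python
import os
from typing import Mapping, Optional

VALID_BACKENDS = ("claude_sdk", "native", "openhands")
DEFAULT_BACKEND = "claude_sdk"  # documented default


def resolve_coding_backend(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the selected backend after checking its documented requirements."""
    env = os.environ if environ is None else environ
    backend = env.get("CODING_AGENT_VERSION", DEFAULT_BACKEND)
    if backend not in VALID_BACKENDS:
        raise ValueError(
            f"CODING_AGENT_VERSION must be one of {VALID_BACKENDS}, got {backend!r}"
        )
    # Requirements taken from the table above; raising here is an assumption.
    if backend == "claude_sdk" and not env.get("ANTHROPIC_API_KEY"):
        raise RuntimeError("claude_sdk requires ANTHROPIC_API_KEY")
    if backend == "openhands" and env.get("SCIDER_ENABLE_OPENHANDS") != "1":
        raise RuntimeError("openhands requires SCIDER_ENABLE_OPENHANDS=1")
    return backend


if __name__ == "__main__":
    print(resolve_coding_backend({"CODING_AGENT_VERSION": "native"}))
```

Running a check like this before starting a long workflow surfaces a missing key early rather than midway through an experiment.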

Skills

Skills are markdown files with YAML frontmatter that inject domain-specific guidance into an agent's system prompt. Modeled after Claude Code, they can be either preloaded (full content injected) or on-demand (listed by name, loaded via the Skill tool when needed).

Discovery

On startup, SciDER walks up from the workspace directory to the filesystem root, and also checks the home directory (~), scanning .scider/skills/ at each level. When two skills share a name, the one in the directory closer to the workspace overrides the one from a parent. Supported layouts:

.scider/skills/
├── my-skill/
│   ├── SKILL.md              # directory format — can bundle reference files
│   └── references/
│       └── usage.md
└── another.md                # single-file format
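
The discovery order described above can be sketched as follows. This is a simplified model, not SciDER's implementation. It assumes a skill's name is its directory or file name (the real lookup may use the frontmatter `name` instead), and it treats the home directory as the farthest location. `discover_skills` is a hypothetical name.

```python
from pathlib import Path
from typing import Dict, Optional


def discover_skills(workspace: Path, home: Optional[Path] = None) -> Dict[str, Path]:
    """Map skill name -> path, letting closer directories win on name clashes."""
    workspace = workspace.resolve()
    # Visit the workspace first, then each parent up to the filesystem root.
    search_roots = [workspace, *workspace.parents]
    # Assumption: the home directory is consulted last (lowest priority).
    if home is not None and home.resolve() not in search_roots:
        search_roots.append(home.resolve())

    found: Dict[str, Path] = {}
    for root in search_roots:
        skills_dir = root / ".scider" / "skills"
        if not skills_dir.is_dir():
            continue
        for entry in sorted(skills_dir.iterdir()):
            if entry.is_dir() and (entry / "SKILL.md").is_file():
                name = entry.name  # directory format
            elif entry.is_file() and entry.suffix == ".md":
                name = entry.stem  # single-file format
            else:
                continue
            # Closer roots were visited earlier, so keep the first hit.
            found.setdefault(name, entry)
    return found


if __name__ == "__main__":
    for skill_name, skill_path in discover_skills(Path.cwd(), Path.home()).items():
        print(skill_name, "->", skill_path)
```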

Frontmatter fields:

---
name: my-skill
description: One-line summary shown in the on-demand listing.
allowed_agents: [data, experiment]   # omit → available to all agents
preload_for: [data]                  # omit → on-demand only (must be called via Skill tool)
---
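
To show how these fields separate from the skill body, here is a minimal frontmatter splitter. It handles only the flat `key: value` and inline-list subset shown above; a real loader would more likely use a full YAML parser. `split_frontmatter` is a hypothetical name, not SciDER's API.

```python
from typing import Dict, List, Tuple, Union

FrontmatterValue = Union[str, List[str]]


def split_frontmatter(text: str) -> Tuple[Dict[str, FrontmatterValue], str]:
    """Return (metadata, body) for a markdown file with a leading --- block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # no closing delimiter: treat everything as body

    meta: Dict[str, FrontmatterValue] = {}
    for raw_line in lines[1:end]:
        # Drop trailing "# ..." comments, then surrounding whitespace.
        line = raw_line.split(" #", 1)[0].strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [data, experiment]
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    return meta, "\n".join(lines[end + 1:])


if __name__ == "__main__":
    sample = "---\nname: my-skill\npreload_for: [data]   # preloaded for data\n---\nUse pandas.\n"
    print(split_frontmatter(sample))
```

In this sketch an omitted field is simply absent from the returned dictionary, which corresponds to the "omit" behavior noted in the comments above.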

For directory-format skills, SciDER automatically injects Base directory for this skill: <absolute path> at the top of the content so the model can resolve relative file references (e.g. references/usage.md) via the Read tool.
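
The base-directory injection can be pictured with the sketch below. It is an approximation under stated assumptions: the header wording comes from the sentence above, but the exact spacing, and whether frontmatter is stripped before injection, are not documented here. `load_skill_content` is a hypothetical name.

```python
from pathlib import Path


def load_skill_content(skill_path: Path) -> str:
    """Read a skill, prefixing directory-format skills with their base directory."""
    if skill_path.is_dir():
        content = (skill_path / "SKILL.md").read_text(encoding="utf-8")
        # The absolute path lets the model resolve references/usage.md and similar.
        header = f"Base directory for this skill: {skill_path.resolve()}"
        return f"{header}\n\n{content}"
    # Single-file skills have no bundled files, so nothing is injected.
    return skill_path.read_text(encoding="utf-8")
```

Single-file skills are returned unchanged in this sketch.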

Dynamic Registration

You can also register skills programmatically, overriding frontmatter fields:

from scider.core.skills import SkillRegistry

# Single directory
SkillRegistry.instance().register_skill_dirs(
    "path/to/my-skill",
    allow=["experiment", "native_coding"],
    preload_for=["experiment"],
)

# Multiple directories at once
SkillRegistry.instance().register_skill_dirs(
    ["path/to/skill-a", "path/to/skill-b"],
    allow=["data"],
)

allow restricts which agents see the skill; preload_for controls which agents get the full content in their system prompt. Both accept a Literal of the valid agent names (ideation, data, experiment, experiment_coding, native_coding, critic, paper_search) for static type checking. Passing None for either keeps the value from the SKILL.md frontmatter.
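
The override semantics can be modeled in a few lines. This is a sketch of the rules as described, not SciDER's code: `AgentName` uses the agent names listed above, while `resolve_agents`, `is_visible`, and `is_preloaded` are hypothetical helpers.

```python
from typing import List, Literal, Optional

AgentName = Literal[
    "ideation", "data", "experiment", "experiment_coding",
    "native_coding", "critic", "paper_search",
]


def resolve_agents(
    override: Optional[List[AgentName]],
    from_frontmatter: Optional[List[AgentName]],
) -> Optional[List[AgentName]]:
    """Passing None keeps the frontmatter value; anything else replaces it."""
    return from_frontmatter if override is None else override


def is_visible(agent: AgentName, allow: Optional[List[AgentName]]) -> bool:
    """An omitted allow-list means the skill is available to all agents."""
    return allow is None or agent in allow


def is_preloaded(agent: AgentName, preload_for: Optional[List[AgentName]]) -> bool:
    """An omitted preload list means the skill is on-demand only."""
    return preload_for is not None and agent in preload_for


if __name__ == "__main__":
    allow = resolve_agents(["experiment", "native_coding"], ["data", "experiment"])
    preload = resolve_agents(None, ["data"])
    print(is_visible("data", allow), is_preloaded("data", preload))
```

With a `Literal` alias like this, a static type checker can flag a misspelled agent name before the code runs.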

Development Guide

First, install pre-commit:

pip install pre-commit

Next, install the Git hooks so that pre-commit formats code on each commit:

pre-commit install

Then, copy .env.template to .env and fill in the necessary values.

Finally, sync dependencies with the extra that matches your setup:

# for cpu
uv sync --extra cpu

# for mac
uv sync --extra mac

# for gpu
uv sync --extra cu128

# streamlit client
uv sync --extra streamlit

Run tests with:

uv run pytest tests/

Benchmarks

See BENCHMARKS for details on the benchmarks we have conducted to evaluate SciDER's performance.

Feedback and Contributions

We welcome contributions to improve SciDER. Please open an issue or submit a pull request on our GitHub repository.

Feedback on the project is also greatly appreciated. You can fill out the feedback form to rate the app and help improve the project.
