
Healthcare ML Training Pipeline

Serverless GPU training infrastructure for healthcare NLP models. Training jobs run on RunPod serverless GPUs, and trained models are stored in S3.

Overview

This project provides production-ready ML pipelines for training healthcare classification models:

  • Drug-Drug Interaction (DDI) - Severity classification from DrugBank (176K samples)
  • Adverse Drug Events (ADE) - Binary detection from ADE Corpus V2 (30K samples)
  • Medical Triage - Urgency level classification
  • Symptom-to-Disease - Diagnosis prediction (41 disease classes)

All models use Bio_ClinicalBERT as the base and are fine-tuned on domain-specific datasets.

Training Results

| Task | Dataset | Samples | Accuracy | F1 Score |
|------|---------|---------|----------|----------|
| DDI Classification | DrugBank | 176K | 100% | 100% |
| ADE Detection | ADE Corpus V2 | 9K | 93.5% | 95.3% |
| Symptom-Disease | Disease Symptoms | 4.4K | 100% | 100% |

Quick Start

Run Training

```bash
curl -X POST "https://api.runpod.ai/v2/YOUR_ENDPOINT/run" \
  -H "Authorization: Bearer $RUNPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "task": "ddi",
      "model_name": "emilyalsentzer/Bio_ClinicalBERT",
      "max_samples": 10000,
      "epochs": 3,
      "batch_size": 16,
      "s3_bucket": "your-bucket",
      "aws_access_key_id": "...",
      "aws_secret_access_key": "...",
      "aws_session_token": "..."
    }
  }'
```

Available tasks: `ddi`, `ade`, `triage`, `symptom_disease`
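The same request can be scripted in Python using only the standard library. This is a minimal sketch, not code from this repo: the helper names (`build_payload`, `submit`) are illustrative, and the payload fields mirror the curl example above.

```python
import json
import os
import urllib.request

RUNPOD_API = "https://api.runpod.ai/v2/{endpoint}/run"
VALID_TASKS = {"ddi", "ade", "triage", "symptom_disease"}

def build_payload(task, s3_bucket, model_name="emilyalsentzer/Bio_ClinicalBERT",
                  max_samples=10000, epochs=3, batch_size=16):
    """Assemble the request body expected by the serverless handler."""
    if task not in VALID_TASKS:
        raise ValueError(f"unknown task: {task}")
    return {
        "input": {
            "task": task,
            "model_name": model_name,
            "max_samples": max_samples,
            "epochs": epochs,
            "batch_size": batch_size,
            "s3_bucket": s3_bucket,
            # Credentials are read from the environment, never hard-coded.
            "aws_access_key_id": os.environ.get("AWS_ACCESS_KEY_ID", ""),
            "aws_secret_access_key": os.environ.get("AWS_SECRET_ACCESS_KEY", ""),
            "aws_session_token": os.environ.get("AWS_SESSION_TOKEN", ""),
        }
    }

def submit(endpoint_id, api_key, payload):
    """POST the job to the RunPod /run endpoint and return the JSON response."""
    req = urllib.request.Request(
        RUNPOD_API.format(endpoint=endpoint_id),
        data=json.dumps(payload).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```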

Download Trained Model

```bash
aws s3 cp s3://your-bucket/model.tar.gz .
tar -xzf model.tar.gz
```
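The unpack step can also be done from Python with the stdlib `tarfile` module. This sketch (the `extract_model` helper is illustrative, not part of this repo) adds a path-traversal check before extracting:

```python
import tarfile
from pathlib import Path

def extract_model(archive_path, dest="."):
    """Unpack model.tar.gz, refusing entries that escape the target dir."""
    dest = Path(dest).resolve()
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            target = (dest / member.name).resolve()
            # Reject absolute paths and ".." components in archive entries.
            if not str(target).startswith(str(dest)):
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return dest
```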

Project Structure

```
├── components/
│   └── runpod_trainer/
│       ├── Dockerfile
│       ├── handler.py          # Multi-task training logic
│       ├── requirements.txt
│       └── data/               # DrugBank DDI dataset
├── pipelines/
│   ├── healthcare_training.py  # Kubeflow pipeline definitions
│   ├── ddi_training_runpod.py
│   └── ddi_data_prep.py
├── .github/workflows/
│   └── build-trainer.yaml      # CI/CD
└── manifests/
    └── argocd-app.yaml
```

Configuration

All configuration is via environment variables. Copy .env.example to .env and fill in your values:

```bash
cp .env.example .env
# Edit .env with your credentials
```

Environment Variables

| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| RUNPOD_API_KEY | Yes | - | RunPod API key |
| RUNPOD_ENDPOINT | Yes | - | RunPod serverless endpoint ID |
| AWS_ACCESS_KEY_ID | Yes | - | AWS credentials for S3 |
| AWS_SECRET_ACCESS_KEY | Yes | - | AWS credentials for S3 |
| AWS_SESSION_TOKEN | No | - | For assumed role sessions |
| AWS_REGION | No | us-east-1 | AWS region |
| S3_BUCKET | Yes | - | Bucket for model artifacts |
| BASE_MODEL | No | Bio_ClinicalBERT | HuggingFace model ID |
| MAX_SAMPLES | No | 10000 | Training samples |
| EPOCHS | No | 3 | Training epochs |
| BATCH_SIZE | No | 16 | Batch size |
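A small loader can enforce this table at startup: required variables fail fast, optional ones fall back to the defaults above. The `load_config` helper is a sketch, not code from this repo:

```python
import os

# Defaults mirror the environment-variable table above.
DEFAULTS = {
    "AWS_REGION": "us-east-1",
    "BASE_MODEL": "emilyalsentzer/Bio_ClinicalBERT",
    "MAX_SAMPLES": "10000",
    "EPOCHS": "3",
    "BATCH_SIZE": "16",
}
REQUIRED = ["RUNPOD_API_KEY", "RUNPOD_ENDPOINT",
            "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "S3_BUCKET"]

def load_config(env=None):
    """Merge defaults with the environment; fail fast on missing keys."""
    env = os.environ if env is None else env
    missing = [k for k in REQUIRED if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    cfg = {k: env.get(k, v) for k, v in DEFAULTS.items()}
    cfg.update({k: env[k] for k in REQUIRED})
    # Numeric settings arrive as strings; coerce them once, centrally.
    for key in ("MAX_SAMPLES", "EPOCHS", "BATCH_SIZE"):
        cfg[key] = int(cfg[key])
    return cfg
```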

Kubernetes Secrets (Recommended)

For production, use Kubernetes secrets instead of environment variables:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: ml-pipeline-secrets
type: Opaque
stringData:
  RUNPOD_API_KEY: "your-key"
  AWS_ACCESS_KEY_ID: "your-key"
  AWS_SECRET_ACCESS_KEY: "your-secret"
```
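A pod can then inject every key of the secret as environment variables via `envFrom`. This is an illustrative pod-spec fragment (the container name is a placeholder), not a manifest from this repo:

```yaml
# Pod spec fragment: expose all keys of ml-pipeline-secrets as env vars
spec:
  containers:
    - name: trainer            # placeholder container name
      envFrom:
        - secretRef:
            name: ml-pipeline-secrets
```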

Supported Models

| Model | Type | Use Case |
|-------|------|----------|
| emilyalsentzer/Bio_ClinicalBERT | BERT | Classification tasks |
| meta-llama/Llama-3.1-8B-Instruct | LLM | Text generation (LoRA) |
| google/gemma-3-4b-it | LLM | Lightweight inference |

Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| task | ddi | Training task |
| model_name | Bio_ClinicalBERT | HuggingFace model ID |
| max_samples | 10000 | Training samples |
| epochs | 3 | Training epochs |
| batch_size | 16 | Batch size |
| eval_split | 0.1 | Validation split |
| s3_bucket | - | S3 bucket for output |
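Applying defaults and sanity-checking a parameter dict before submission can catch mistakes client-side. The defaults below mirror the table; the validation rules themselves and the `validate_params` name are assumptions, not taken from the handler:

```python
VALID_TASKS = {"ddi", "ade", "triage", "symptom_disease"}

def validate_params(params):
    """Apply table defaults and sanity-check a training parameter dict."""
    p = {"task": "ddi", "model_name": "emilyalsentzer/Bio_ClinicalBERT",
         "max_samples": 10000, "epochs": 3, "batch_size": 16,
         "eval_split": 0.1, **params}
    if p["task"] not in VALID_TASKS:
        raise ValueError(f"task must be one of {sorted(VALID_TASKS)}")
    if not 0.0 < p["eval_split"] < 1.0:
        raise ValueError("eval_split must be in (0, 1)")
    if not p.get("s3_bucket"):
        raise ValueError("s3_bucket is required")
    if min(p["max_samples"], p["epochs"], p["batch_size"]) < 1:
        raise ValueError("max_samples, epochs, batch_size must be >= 1")
    return p
```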

Development

```bash
# Build container
cd components/runpod_trainer
docker build -t healthcare-trainer .

# Trigger CI build
gh workflow run build-trainer.yaml
```

License

MIT
