FHIR-Former

FHIR-Former is a transformer-based model for processing and analyzing FHIR (Fast Healthcare Interoperability Resources) data. It provides tools for pretraining models on FHIR resources and clinical documents, and for training them on downstream tasks such as ICD coding, image analysis, readmission prediction, and mortality prediction.

Features

Pretraining on FHIR resources
Pretraining on clinical documents
Combined pretraining on FHIR and documents
Downstream tasks:
- ICD coding
- Medical image analysis
- Readmission prediction
- Mortality prediction
- Main ICD prediction
Live inference capabilities for FHIR server integration

Installation

Using Poetry (Recommended)

# Clone the repository
git clone https://github.com/UMEssen/fhirformer.git
cd fhirformer

# Install with Poetry
poetry install

Using Pip

# Clone the repository
git clone https://github.com/UMEssen/fhirformer.git
cd fhirformer

# Install with pip
pip install -e .

Usage

Command Line Interface

# Using Poetry
poetry run fhirformer --task [task_name] [options]

# If installed with pip
fhirformer --task [task_name] [options]

Available tasks:

pretrain_fhir: Pretrain on FHIR resources
pretrain_documents: Pretrain on clinical documents
pretrain_fhir_documents: Pretrain on both FHIR and documents
ds_icd: Downstream task for ICD coding
ds_image: Downstream task for image analysis
ds_readmission: Downstream task for readmission prediction
ds_mortality: Downstream task for mortality prediction
ds_main_icd: Downstream task for main ICD prediction

Common options:

--root_dir: Specify the root directory for data and outputs
--wandb: Enable Weights & Biases logging
--model_checkpoint: Path to a trained model or a Hugging Face model name
--debug: Run in debug mode
--step: Specify steps to run (data, sampling, train, test, all)
--max_train_samples: Maximum number of training samples
--run_name: Custom name for the run
--live_inference: Enable live inference mode
--use_*: Toggle specific FHIR resources (e.g., --use_imaging_study, --use_condition)
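
The options above can be combined into a single invocation. The following is a minimal sketch of a helper that assembles such a command line, for example to launch runs from a script with subprocess. The function build_command and its parameters are hypothetical and not part of FHIR-Former; it uses only flags listed above, and it assumes that --wandb and --debug are plain on/off switches.

```python
import shlex
from typing import List, Optional


def build_command(
    task: str,
    root_dir: Optional[str] = None,
    step: Optional[str] = None,
    max_train_samples: Optional[int] = None,
    wandb: bool = False,
    debug: bool = False,
) -> List[str]:
    """Assemble an argv list for the fhirformer CLI (hypothetical helper)."""
    cmd = ["fhirformer", "--task", task]
    if root_dir is not None:
        cmd += ["--root_dir", root_dir]
    if step is not None:
        cmd += ["--step", step]
    if max_train_samples is not None:
        cmd += ["--max_train_samples", str(max_train_samples)]
    # Assumption: --wandb and --debug take no value.
    if wandb:
        cmd.append("--wandb")
    if debug:
        cmd.append("--debug")
    return cmd


cmd = build_command(
    "ds_readmission", root_dir="/data/fhirformer", step="train", debug=True
)
# Print the command as a single shell-quoted string.
print(shlex.join(cmd))
```

The resulting list can be passed directly to subprocess.run, which avoids shell quoting issues in paths.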

Live Inference

FHIR-Former supports live inference from FHIR servers. When using --live_inference, the model will:

Download ongoing encounters from the FHIR server
Generate "live" samples
Make predictions
Push predictions to the FHIR server as RiskAssessment resources
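
To illustrate the last step, here is a minimal sketch of what a RiskAssessment payload could look like, using standard FHIR R4 field names. The exact resource that FHIR-Former builds is not documented here, so the helper build_risk_assessment, the identifiers, and the choice of fields are assumptions for illustration only.

```python
import json
from typing import Any, Dict


def build_risk_assessment(
    patient_id: str, encounter_id: str, outcome: str, probability: float
) -> Dict[str, Any]:
    """Build a minimal FHIR R4 RiskAssessment payload (illustrative only)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return {
        "resourceType": "RiskAssessment",
        "status": "final",
        "subject": {"reference": f"Patient/{patient_id}"},
        "encounter": {"reference": f"Encounter/{encounter_id}"},
        "prediction": [
            {
                # Free-text outcome; a coded concept could be used instead.
                "outcome": {"text": outcome},
                "probabilityDecimal": probability,
            }
        ],
    }


resource = build_risk_assessment(
    "example-patient", "example-encounter", "readmission", 0.27
)
print(json.dumps(resource, indent=2))
```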

Example for the image prediction task:

python -m fhirformer \
    --live_inference \
    --task ds_image \
    --use_imaging_study=True \
    --use_episode_of_care=True \
    --wandb_artifact="ship-ai-autopilot/fhirformer_ds_v2/model-o1u3iat3:v1"

This command will:

Enable live inference mode
Use the image analysis task
Process imaging studies and episode of care data
Load the specified model from Weights & Biases artifacts
Make predictions and send them to the FHIR server

Important: The --wandb_artifact parameter is required for live inference. It specifies which trained model to use for predictions. The artifact is cached after the first download.

Docker

# Run with specific GPUs
GPUS=0,1,2 docker compose run trainer bash

# Inside the docker container
python -m fhirformer --task [task_name]

Configuration

Configuration files are stored in the fhirformer/config directory. You can modify these files to customize the behavior of the models and training processes.

The main configuration file is config_training.yaml which contains:

Data configurations
Model parameters
Training settings
Task-specific configurations

Development

Setup Development Environment

# Install development dependencies
poetry install --with dev

# Set up pre-commit hooks
pre-commit install

Creating New Downstream Tasks

To create a new downstream task:

Add your task configuration in fhirformer/config/config_training.yaml:

data_id: {
    # ... existing tasks ...
    "ds_your_task": "V1"  # Add your task here
}

resources_for_task: {
    "ds_your_task": [
        # List required FHIR resources for your task
        "condition",
        "procedure",
        # Add other needed resources
    ]
}

Create a new task builder class that inherits from EncounterDatasetBuilder:

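from typing import Dict, List

from fhirformer.data_preprocessing.encounter_dataset_builder import EncounterDatasetBuilder

# Note: DataStore must also be imported from the module that defines it in
# the fhirformer package (the path is not shown here). The annotation below
# is quoted so that this snippet does not fail without that import.


class YourTaskBuilder(EncounterDatasetBuilder):
    def process_patient(self, patient_id: str, datastore: "DataStore") -> List[Dict]:
        # Implement your task-specific patient processing logic.
        # Must return a list of dictionaries containing:
        # - patient_id: str
        # - text: str (input text)
        # - labels: Any (task labels)
        raise NotImplementedError

    def global_multiprocessing(self):
        # Implement multiprocessing logic if needed.
        # Usually the parent class implementation can be reused.
        return super().global_multiprocessing()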

Register your task in the CLI:

# In fhirformer/cli.py
pipelines = {
    # ... existing tasks ...
    "ds_your_task": {
        "generate": your_task_generator.main,
        "train": ds_single_label.main,  # or ds_multi_label.main
    }
}

Run your task:

poetry run fhirformer --task ds_your_task
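
As a rough illustration of why the registry entry needs both a "generate" and a "train" callable, the sketch below dispatches a task through a dictionary of the same shape. The stand-in functions, the run_task helper, and the fixed stage order are assumptions; the actual dispatch logic in fhirformer/cli.py is not shown here and may differ, for example in how it honors --step or passes configuration.

```python
from typing import Callable, Dict, List

calls: List[str] = []  # records which stages ran, for demonstration


def generate_stub() -> None:
    # Stand-in for your_task_generator.main
    calls.append("generate")


def train_stub() -> None:
    # Stand-in for ds_single_label.main or ds_multi_label.main
    calls.append("train")


# Same shape as the pipelines registry, with stand-in callables.
pipelines: Dict[str, Dict[str, Callable[[], None]]] = {
    "ds_your_task": {"generate": generate_stub, "train": train_stub},
}


def run_task(task: str) -> None:
    """Run the stages registered for a task in order (assumed dispatch logic)."""
    if task not in pipelines:
        raise ValueError(f"Unknown task: {task}")
    for stage in ("generate", "train"):
        pipelines[task][stage]()


run_task("ds_your_task")
print(calls)
```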

Key considerations when creating a task:

Define required FHIR resources in config_training.yaml
Implement data processing logic in process_patient()
Structure output as {patient_id, text, labels}
Choose appropriate training pipeline (single_label or multi_label)
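
The expected output structure can be checked before training. The following is a minimal sketch of such a check; validate_samples and REQUIRED_KEYS are hypothetical names, not part of FHIR-Former, and the sample content is placeholder data.

```python
from typing import Any, Dict, List

REQUIRED_KEYS = {"patient_id", "text", "labels"}


def validate_samples(samples: List[Dict[str, Any]]) -> None:
    """Raise ValueError if a sample lacks a required key or has a wrong type."""
    for index, sample in enumerate(samples):
        missing = REQUIRED_KEYS - sample.keys()
        if missing:
            raise ValueError(f"Sample {index} is missing keys: {sorted(missing)}")
        if not isinstance(sample["patient_id"], str):
            raise ValueError(f"Sample {index}: patient_id must be a string")
        if not isinstance(sample["text"], str):
            raise ValueError(f"Sample {index}: text must be a string")


samples = [
    {"patient_id": "example-1", "text": "Placeholder input text", "labels": 1},
]
validate_samples(samples)  # passes silently for well-formed samples
```

Calling such a check at the end of process_patient() would surface malformed records early rather than during training.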

Code Quality Tools

Black for code formatting
Flake8 for linting
MyPy for type checking

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use FHIR-Former in your research, please cite:

@software{fhirformer2024,
  author = {Engelke, Merlin and Baldini, Giulia and Kleesiek, Jens and Nensa, Felix and Dada, Amin},
  title = {Improving Clinical Decision Making with FHIR and Large Language Models},
  year = {2024},
  publisher = {University Hospital Essen},
  url = {https://github.com/UMEssen/fhirformer}
}

Contributors

Merlin Engelke (Merlin.Engelke@uk-essen.de)
Giulia Baldini (Giulia.Baldini@uk-essen.de)
