The preprocessing pipeline requires proper configuration of several parameters
in the config.yaml file. This guide explains how to configure your pipeline.
!!! note "v0.2.0 Breaking Change"
    The configuration system has migrated from `settings.sh` (Bash) to `config.yaml` (YAML). See the [migration guide](#migration-from-settingssh-v01x-to-v020) section below if upgrading from an earlier version.
The main configuration file is `config.yaml`, which is created by copying
`config.template.yaml`:

```bash
cp config.template.yaml config.yaml
```

The configuration is loaded by sourcing `load_config.sh`, which parses the
YAML file and exports environment variables for use in pipeline scripts:

```bash
source ./load_config.sh
```

Set up your directory structure in the `directories` section:
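To make the parsing step concrete, here is a minimal sketch of how a shell-based loader can turn one YAML scalar into an exported variable. This is illustrative only; the sample file path, key, and `sed` approach are assumptions, not the actual implementation of `load_config.sh`.

```shell
# Illustrative sketch: extract one scalar from simple single-quoted YAML.
# The demo file and variable name are hypothetical.
cat > /tmp/demo-config.yaml <<'EOF'
directories:
  base_dir: '/path/to/your/study'
EOF

# Strip the key name and surrounding quotes, keep the value
BASE_DIR=$(sed -n "s/^ *base_dir: '\(.*\)'/\1/p" /tmp/demo-config.yaml)
echo "$BASE_DIR"   # prints /path/to/your/study
export BASE_DIR
```

A real loader must also handle nesting, lists, and unquoted values, which is why the pipeline ships its own `load_config.sh` rather than asking you to parse the file by hand.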
```yaml
directories:
  base_dir: '/path/to/your/study'
  scripts_dir: '/path/to/your/study/code'
  raw_dir: '/path/to/your/study/sourcedata'
  trim_dir: '/path/to/your/study'
  workflow_log_dir: '/path/to/your/study/logs'
  templateflow_host_home: '~/.cache/templateflow'
  fmriprep_host_cache: '~/.cache/fmriprep'
  freesurfer_license: '~/freesurfer.txt'
```

Path Descriptions:
- `base_dir`: Root directory for the study
- `scripts_dir`: Path of the cloned fmriprep-workbench repository
- `raw_dir`: Raw BIDS-compliant data location (sourcedata)
- `trim_dir`: Destination for processed data
- `workflow_log_dir`: Directory for workflow logs
- `templateflow_host_home`: Host cache directory for TemplateFlow templates
- `fmriprep_host_cache`: fMRIPrep-specific cache directory
- `freesurfer_license`: Path to your FreeSurfer license file
Configure user-specific settings:

```yaml
user:
  email: 'johndoe@stanford.edu'
  username: 'johndoe'
  fw_group_id: 'pi'
  fw_project_id: 'amass'
```

Configure your task-specific settings in the `scan` section:
```yaml
scan:
  fw_cli_api_key_file: '~/flywheel_api_key.txt'
  fw_url: 'cni.flywheel.io'
  config_file: 'scan-config.json'
  experiment_type: 'advanced'
  task_id: 'OriginalTaskName'
  new_task_id: 'cleanname'
  n_dummy: 5
  run_numbers:
    - '01'
    - '02'
    - '03'
    - '04'
    - '05'
    - '06'
    - '07'
    - '08'
```

Parameter Descriptions:
- `task_id`: Original task name in BIDS format
- `new_task_id`: New task name (if renaming is needed); otherwise set it to the same value as `task_id`
- `n_dummy`: Number of dummy TRs to remove from the beginning of each run
- `run_numbers`: List of all task BOLD run numbers (as strings with zero-padding)
Set expected volume counts for validation in the `validation` section:

```yaml
validation:
  expected_fmap_vols: 12
  expected_bold_vols: 220
  expected_bold_vols_after_trimming: 215
```

These values are used by the QC steps (`04-qc-metadata` and `05-qc-volumes`) to verify that your scans have the expected number of volumes.
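The trimmed count should always equal the raw BOLD count minus the dummy TRs configured in `scan.n_dummy`. A quick sketch of that consistency check (variable names mirror the config keys; the pipeline itself may check this differently):

```shell
# Sanity check (sketch): trimmed volumes = raw volumes - dummy TRs
n_dummy=5
expected_bold_vols=220
expected_bold_vols_after_trimming=215

if [ $(( expected_bold_vols - n_dummy )) -eq "$expected_bold_vols_after_trimming" ]; then
  echo "volume counts are consistent"
else
  echo "WARNING: expected_bold_vols_after_trimming should be $(( expected_bold_vols - n_dummy ))"
fi
```

With the values above, 220 - 5 = 215, so the check passes.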
Map fieldmaps to BOLD runs in the `fmap_mapping` section:

```yaml
fmap_mapping:
  '01': '01'  # task BOLD run 01 uses fmap 01
  '02': '01'  # task BOLD run 02 uses fmap 01
  '03': '02'  # task BOLD run 03 uses fmap 02
  '04': '02'  # task BOLD run 04 uses fmap 02
  '05': '03'  # task BOLD run 05 uses fmap 03
  '06': '03'  # task BOLD run 06 uses fmap 03
  '07': '04'  # task BOLD run 07 uses fmap 04
  '08': '04'  # task BOLD run 08 uses fmap 04
```

Each key is a BOLD run number, and its value is the fieldmap number that covers that run. This mapping determines which fieldmap is used for susceptibility distortion correction for each BOLD run.
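Conceptually this is a simple key-to-value lookup. A hedged sketch in Bash (requires bash 4+ for associative arrays; the variable names are illustrative, not the pipeline's internals):

```shell
# Sketch of the run -> fieldmap lookup the pipeline conceptually performs
declare -A fmap_mapping=( ["01"]="01" ["02"]="01" ["03"]="02" ["04"]="02" )

run="03"
echo "run ${run} is corrected with fmap ${fmap_mapping[$run]}"
# prints: run 03 is corrected with fmap 02
```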
Basic subject list in `all-subjects.txt`:

```text
# This is a comment - these lines are automatically filtered
# Blank lines are also ignored
101
102
103
```
!!! note "v0.2.0 Enhancement"
    Comment lines (starting with `#`) and blank lines are now automatically filtered when counting subjects for SLURM array jobs. This makes it easier to document your subject lists.
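One way such filtering can be done when sizing a SLURM array is a single `grep` that drops comments and blank lines before counting. This is a sketch of the idea, not necessarily the pipeline's exact implementation:

```shell
# Sketch: count usable subjects by filtering comments and blank lines
cat > /tmp/all-subjects.txt <<'EOF'
# This is a comment - these lines are automatically filtered

101
102
103
EOF

n_subjects=$(grep -cvE '^[[:space:]]*(#|$)' /tmp/all-subjects.txt)
echo "$n_subjects"   # prints 3
```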
You can use suffix modifiers for per-subject control:

```text
101             # Standard subject, runs all steps
102:step4       # Only run step 4 for this subject
103:step4:step5 # Only run steps 4 and 5
104:force       # Force rerun all steps
105:step5:force # Only run step 5, force rerun
106:skip        # Skip this subject
```
Available Modifiers:

- `step1` to `step7`: Run specific steps only
- `force`: Force rerun even if already processed
- `skip`: Skip this subject entirely
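Splitting an entry into its subject ID and modifier list needs only shell parameter expansion. A sketch of that parsing (the entry format matches the examples above; the pipeline's actual parsing may differ):

```shell
# Sketch: split "ID:mod1:mod2" into subject and modifiers
entry="105:step5:force"

subject="${entry%%:*}"      # text before the first colon -> 105
modifiers="${entry#*:}"     # text after the first colon  -> step5:force
                            # (for an entry with no colon, handle separately)

echo "subject=${subject} modifiers=${modifiers}"
# prints: subject=105 modifiers=step5:force
```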
By default, subjects are pulled from `all-subjects.txt`. You can optionally
specify different subject lists per pipeline step in `config.yaml`:

```yaml
subjects_mapping:
  '01-fw2server': '01-subjects.txt'
  '02-dcm2niix': '02-subjects.txt'
  '06-run-fmriprep': '06-subjects.txt'
```

Set file and directory permissions:
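The per-step lookup with a default fallback can be sketched in Bash like this (associative array and the step name `03-qc` are illustrative assumptions, not pipeline internals):

```shell
# Sketch: per-step subject list with fallback to all-subjects.txt
declare -A subjects_mapping=(
  ["01-fw2server"]="01-subjects.txt"
  ["02-dcm2niix"]="02-subjects.txt"
  ["06-run-fmriprep"]="06-subjects.txt"
)

step="03-qc"   # hypothetical step with no mapping entry
list="${subjects_mapping[$step]:-all-subjects.txt}"
echo "$list"   # prints all-subjects.txt
```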
```yaml
permissions:
  dir_permissions: '775'
  file_permissions: '775'
```

Configure SLURM job parameters for general tasks:
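Applied recursively, these modes can be set with `find` so that directories and files are handled separately. A sketch (the temp directory and file names are illustrative; `775` on files makes them group-writable and executable, which suits shared script trees but may be broader than you want for data files):

```shell
# Sketch: apply configured modes to a tree, directories and files separately
dir_permissions='775'
file_permissions='775'
target=$(mktemp -d)

mkdir -p "$target/sub-101"
touch "$target/sub-101/run-01.nii.gz"

find "$target" -type d -exec chmod "$dir_permissions" {} +
find "$target" -type f -exec chmod "$file_permissions" {} +

stat -c '%a' "$target/sub-101"   # prints 775 (GNU stat)
```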
```yaml
slurm:
  email: 'your.email@institution.edu'
  time: '2:00:00'
  dcmniix_time: '12:00:00'
  mem: '4G'
  cpus: 8
  array_throttle: 10
  log_dir: '/path/to/your/study/logs'
  partition: 'partition1,partition2'
```

!!! note "v0.2.0 Change"
    SLURM job names now use a unified `fmriprep-workbench-{N}` naming pattern (e.g., `fmriprep-workbench-3` for step 3). The `STEP_NAME` variable is used for directory organization, while `JOB_NAME` is used for SLURM display.
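As a sketch, these settings might translate into `sbatch` flags roughly as follows; the flag assembly below is illustrative (no job is actually submitted, and `n_subjects` is a made-up count):

```shell
# Sketch: assemble sbatch flags from the slurm config values
time='2:00:00'
mem='4G'
cpus=8
array_throttle=10
n_subjects=25   # hypothetical subject count

flags="--time=${time} --mem=${mem} --cpus-per-task=${cpus}"
flags="${flags} --array=0-$(( n_subjects - 1 ))%${array_throttle}"
echo "$flags"
# --time=2:00:00 --mem=4G --cpus-per-task=8 --array=0-24%10
```

The `%10` suffix is SLURM's array throttle syntax: at most `array_throttle` tasks run concurrently.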
Configure container and derivatives paths:

```yaml
pipeline:
  fmriprep_version: '24.0.1'
  derivs_dir: '/path/to/your/study/derivatives/fmriprep-24.0.1'
  singularity_image_dir: '/path/to/your/study/containers'
  singularity_image: 'fmriprep-24.0.1.simg'
  heudiconv_image: 'heudiconv_latest.sif'
```

Configure SLURM parameters specifically for fMRIPrep jobs (steps 6 and 9):
```yaml
fmriprep_slurm:
  job_name: 'fmriprep_yourproject'
  time: '48:00:00'
  cpus_per_task: 16
  mem_per_cpu: '4G'
```

Configure fMRIPrep-specific parameters:
```yaml
fmriprep:
  omp_threads: 8
  nthreads: 12
  mem_mb: 30000
  fd_spike_threshold: 0.9
  dvars_spike_threshold: 3.0
  output_spaces: 'MNI152NLin2009cAsym:res-2 anat fsnative fsaverage5'
```

Configure default values for the FreeSurfer manual editing workflow (Steps 7 and 8):
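These keys correspond to fMRIPrep command-line options. A sketch of how the mapping might look when the argument string is assembled (the actual run scripts wrap the call in a Singularity invocation and may differ in detail):

```shell
# Sketch: map fmriprep config values onto fMRIPrep CLI flags
omp_threads=8
nthreads=12
mem_mb=30000

args="--nthreads ${nthreads} --omp-nthreads ${omp_threads} --mem-mb ${mem_mb}"
args="${args} --fd-spike-threshold 0.9 --dvars-spike-threshold 3.0"
args="${args} --output-spaces MNI152NLin2009cAsym:res-2 anat fsnative fsaverage5"
echo "$args"
```

Note that `nthreads` caps total threads for the whole process while `omp_threads` caps threads per single task, so `omp_threads` should not exceed `nthreads`.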
```yaml
freesurfer_editing:
  remote_server: ''                           # Remote server hostname
  remote_user: ''                             # Remote username
  remote_base_dir: ''                         # Remote base directory path
  local_freesurfer_dir: '~/freesurfer_edits'  # Local directory for edits
  subjects_list: ''                           # Default subjects list
  download_all: false                         # Download all subjects by default
  upload_all: false                           # Upload all subjects by default
  backup_originals: true                      # Create backups when uploading
```

Parameter Descriptions:
- `remote_server`: Remote server hostname (e.g., `login.sherlock.stanford.edu`)
- `remote_user`: Remote username for SSH connection (e.g., SUNet ID)
- `remote_base_dir`: Remote base directory containing FreeSurfer outputs (absolute path to `BASE_DIR` on the server)
- `local_freesurfer_dir`: Local directory for downloading/uploading edited FreeSurfer outputs (default: `~/freesurfer_edits`)
- `subjects_list`: Default subjects list file or comma-separated subject IDs
- `download_all`: Download all subjects by default when using `download_freesurfer.sh`
- `upload_all`: Upload all subjects by default when using `upload_freesurfer.sh`
- `backup_originals`: Create timestamped backups of original FreeSurfer outputs before uploading edits (highly recommended)
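To show how these values fit together, here is a sketch of the kind of `rsync` command a download helper might construct for one subject. The remote directory layout (`derivatives/freesurfer/`) and the subject label are assumptions for illustration; no network transfer happens here, the command string is only printed:

```shell
# Sketch: build (but do not run) a per-subject download command
remote_user='johndoe'
remote_server='login.sherlock.stanford.edu'
remote_base_dir='/path/to/your/study'
local_freesurfer_dir="$HOME/freesurfer_edits"
subject='sub-101'   # hypothetical subject label

# 'derivatives/freesurfer' is an assumed layout, not confirmed by the docs
cmd="rsync -av ${remote_user}@${remote_server}:${remote_base_dir}/derivatives/freesurfer/${subject}/ ${local_freesurfer_dir}/${subject}/"
echo "$cmd"
```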
!!! tip "Configuration Convenience"
    Setting these values in `config.yaml` allows you to run the FreeSurfer editing scripts
    without specifying common parameters on the command line each time. Command-line arguments
    always override config defaults.
!!! warning "Backup Safety"
    Keep `backup_originals: true` to prevent accidental data loss. Backups are created
    as `{subject}.backup.{timestamp}` on the server before uploading edited surfaces.
Configure miscellaneous settings:

```yaml
misc:
  debug: 0  # Debug mode (0=off, 1=on)
```

When `debug` is set to 1, the pipeline runs with only a single subject
(array index 0) for testing purposes.
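A sketch of how debug mode can collapse a SLURM array to a single index (illustrative logic and variable names; the pipeline's actual implementation may differ):

```shell
# Sketch: debug mode restricts the array to index 0 only
debug=1
n_subjects=25      # hypothetical subject count
array_throttle=10

if [ "$debug" -eq 1 ]; then
  array_spec="0"   # only the first subject
else
  array_spec="0-$(( n_subjects - 1 ))%${array_throttle}"
fi
echo "$array_spec"   # prints 0
```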
If you are upgrading from a version that used `settings.sh`, follow these steps:

```bash
cp config.template.yaml config.yaml
```

Map your old Bash variables to the new YAML structure:
| Old (settings.sh) | New (config.yaml) |
|---|---|
| `BASE_DIR="/path/to/study"` | `directories.base_dir: '/path/to/study'` |
| `task_id="TaskName"` | `scan.task_id: 'TaskName'` |
| `n_dummy=5` | `scan.n_dummy: 5` |
| `run_numbers=("01" "02")` | `scan.run_numbers: ['01', '02']` |
| `declare -A fmap_mapping=(["01"]="01")` | `fmap_mapping: {'01': '01'}` |
| `EXPECTED_FMAP_VOLS=12` | `validation.expected_fmap_vols: 12` |
| `SLURM_EMAIL="email@edu"` | `slurm.email: 'email@edu'` |
| `FMRIPREP_VERSION="24.0.1"` | `pipeline.fmriprep_version: '24.0.1'` |
Note the expanded 14-step workflow:

- Steps 1-5: FlyWheel download, DICOM conversion, prep, and QC (unchanged)
- Step 6: fMRIPrep anatomical-only workflows (`06-run-fmriprep`, optional for manual FreeSurfer editing)
- Step 7: Download FreeSurfer outputs (`toolbox/download_freesurfer.sh`, optional)
- Step 8: Upload edited FreeSurfer outputs (`toolbox/upload_freesurfer.sh`, optional)
- Step 9: fMRIPrep full workflows (`09-run-fmriprep`, previously step 7)
- Step 10: FSL GLM model setup (`10-fsl-glm/setup_glm.sh`, new)
- Step 11: FSL Level 1 analysis (`08-run.sbatch`, new)
- Step 12: FSL Level 2 analysis (`09-run.sbatch`, new)
- Step 13: FSL Level 3 analysis (`10-run.sbatch`, new)
- Step 14: Tarball utility (`toolbox/tarball_sourcedata.sh`, new)
Once migrated, you can remove the old `settings.sh` file, as it is no longer used.
Before running the pipeline:

- Verify all paths exist and are accessible
- Confirm volume counts match your acquisition protocol
- Test configuration loading:

    ```bash
    source ./load_config.sh
    ```

- Test on a single subject before batch processing
- Review logs for configuration warnings
YAML Syntax Errors
: Ensure proper YAML formatting. Use a YAML validator if needed. Common issues include incorrect indentation and missing quotes around strings.

Path Issues
: Double-check that all path specifications are absolute and accessible. Paths with tildes (`~`) are expanded automatically.

Volume Mismatches
: Verify that `validation.expected_fmap_vols` and `validation.expected_bold_vols` match your acquisition protocol.

Fieldmap Mapping
: Ensure each BOLD run has a corresponding fieldmap entry in `fmap_mapping`. Keys and values should be quoted strings (e.g., `'01': '01'`).

Permission Problems
: Check that `permissions.dir_permissions` and `permissions.file_permissions` are appropriate for your cluster environment.
After configuration, see the Usage guide to learn how to run the pipeline.