BU-BMSIP / Perissi Lab (Boston University Chobanian & Avedisian School of Medicine)
Reproducible multi-omics analysis framework for ChIP-seq and RNA–chromatin interaction (iMARGI) data, focusing on GPS2-mediated mitochondrial retrograde signaling and its interaction with ATF4, CTCF, and related transcription factors.
Mitochondria depend on nuclear-encoded proteins, requiring tight coordination of gene expression via anterograde (nucleus→mitochondria) and retrograde (mitochondria→nucleus) signaling.
GPS2 (G-protein Pathway Suppressor 2) is a key retrograde mediator that translocates to chromatin under mitochondrial stress, potentially cooperating with stress-response TFs such as ATF4/ATF5 and the chromatin organizer CTCF.
This repository contains:
- Snakemake-based workflows for scalable, reproducible processing of:
  - GPS2, ATF4, CTCF ChIP-seq datasets (in-house + public)
  - iMARGI RNA–DNA interaction data (including mtRNA–DNA contacts)
- Integrated analyses:
  - Peak calling, intersection, and annotation
  - Motif discovery and promoter topology analysis
  - Signal profiling and heatmaps (deepTools)
  - Functional enrichment (GO/KEGG, GSEA)
  - Mapping of mtRNA–DNA interactions and overlap with TF peaks
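The peak-intersection and mtRNA–DNA overlap analyses listed above come down to testing whether two genomic intervals on the same chromosome overlap. The following is a minimal plain-Python sketch of that test, similar in spirit to `bedtools intersect -u`; it is not the repository's implementation, and the function name and toy coordinates are hypothetical.

```python
from collections import defaultdict


def overlapping_peaks(query, subject):
    """Return query intervals that overlap at least one subject interval.

    Intervals are (chrom, start, end) tuples in BED-style half-open coordinates.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in subject:
        by_chrom[chrom].append((start, end))

    hits = []
    for chrom, start, end in query:
        # Two half-open intervals overlap when each starts before the other ends.
        if any(s < end and start < e for s, e in by_chrom[chrom]):
            hits.append((chrom, start, end))
    return hits


# Toy coordinates for illustration only (not real peak calls).
gps2 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
ctcf = [("chr1", 150, 250), ("chr2", 80, 120)]
shared = overlapping_peaks(gps2, ctcf)
print(shared)
```

The linear scan per chromosome is fine for a quick check on small inputs; for genome-scale peak sets, sorted intervals or bedtools itself would be the more practical choice.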
├── docs/ # Documentation, diagrams, protocol notes
├── envs/ # Conda environment YAMLs (bedtools, deeptools, macs3, homer, etc.)
├── notebooks/ # Jupyter/R notebooks (QC, annotation, enrichment, motif)
├── profile/ # Snakemake profiles (threads, paths, runtime configs)
├── scripts/ # Python/R/Bash utilities & Snakemake workflow files
│ ├── ChIP_CTCF_mouse_cleaned.smk
│ ├── iMargi.smk
│ └── ...
├── adapters_and_annotations/ # Reference FASTA/BED, genome annotations, blacklists
├── Peak_Calls_with_Control.csv
├── Peak_Calls_without_Control.csv
├── Sample_Data_for_CTCF.csv
├── Updated_Sample_Sheet.csv
└── README.md
Requirements:
- Access to BU Shared Computing Cluster (SCC) or equivalent HPC
- Conda (≥4.10)
- Snakemake (≥7.0)
- Required tools: bedtools, deeptools, macs3, homer, samtools, pairtools, bwa
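A quick way to confirm the tools above are visible in the active environment is to look each one up on `PATH`. This is a hedged sketch, not part of the repository: the mapping from package name to a representative executable (for example `bamCoverage` for deepTools and `findMotifsGenome.pl` for HOMER) is an assumption and may need adjusting for your installation.

```python
import shutil

# Package name -> one executable it is assumed to install (illustrative mapping).
EXECUTABLES = {
    "bedtools": "bedtools",
    "deeptools": "bamCoverage",
    "macs3": "macs3",
    "homer": "findMotifsGenome.pl",
    "samtools": "samtools",
    "pairtools": "pairtools",
    "bwa": "bwa",
}


def missing_tools(executables, which=shutil.which):
    """Return the package names whose executable is not found on PATH."""
    return [pkg for pkg, exe in executables.items() if which(exe) is None]


missing = missing_tools(EXECUTABLES)
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("All required tools found.")
```

Run it after `conda activate chip_seq` so the lookup reflects the environment the workflows will actually use.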
Activate the environment before running:
    conda activate chip_seq
Dry run first (recommended):
    snakemake -s scripts/ChIP_CTCF_mouse_cleaned.smk --profile profile -np
Actual run:
    snakemake -s scripts/ChIP_CTCF_mouse_cleaned.smk --profile profile
For iMARGI workflow:
    snakemake -s scripts/iMargi.smk --profile iMargi
To add new ChIP-seq samples:
- Place FASTQ files:
  - Raw (unmerged): CTCF_3T3L1/raw_samples/
  - Pre-merged: CTCF_3T3L1/samples/
- Or add FTP links + SRR IDs to Sample_Data_for_CTCF.csv for auto-download & merge
- Register the sample in Updated_Sample_Sheet.csv
- Update peak calling configs:
  - With control → Peak_Calls_with_Control.csv
  - Without control → Peak_Calls_without_Control.csv
To add new iMARGI data:
- Edit scripts/iMargi.smk → update "Global config" with new SRA IDs
- Follow iMARGI preprocessing steps (cleaning, mapping, parsing)
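For the ChIP-seq inputs, it can help to confirm that FASTQ files actually landed in the two folders named in the steps above before launching a run. The sketch below is illustrative only: the accepted file extensions and the helper name `list_fastqs` are assumptions, not something the workflow defines.

```python
from pathlib import Path

# Assumed FASTQ extensions; adjust to match how your files are named.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def list_fastqs(project_dir="CTCF_3T3L1"):
    """Map each input folder (raw_samples, samples) to the FASTQ files in it."""
    found = {}
    for sub in ("raw_samples", "samples"):
        folder = Path(project_dir) / sub
        if folder.is_dir():
            names = [p.name for p in folder.iterdir() if p.name.endswith(FASTQ_SUFFIXES)]
        else:
            names = []  # a missing folder is reported as empty
        found[sub] = sorted(names)
    return found


for sub, files in list_fastqs().items():
    print(f"{sub}: {len(files)} FASTQ file(s)")
```

An empty count for both folders usually means the files are in the wrong location or that the samples are meant to be fetched through Sample_Data_for_CTCF.csv instead.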
Outputs:
- Alignment & QC: *.bam, multiqc_report.html
- Coverage tracks: *.bw
- Peak calls: *_peaks.narrowPeak
- Annotation: results/annotation/*.bed & gene lists
- Motifs: motifs/**/knownResults.txt
- Signal profiling: matrix/*.gz, plots/*.png
- Enrichment analysis: results/Enrichment/*.tsv
- iMARGI outputs: output/final_*.pairs.gz + promoter overlap stats
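As a quick sanity check on the peak-call outputs, a `*_peaks.narrowPeak` file can be summarized with a few lines of Python. This sketch assumes the standard ENCODE narrowPeak layout (tab-separated, 10 columns, with `signalValue` in column 7); the function name and the two example records are made up for illustration.

```python
import io


def summarize_narrowpeak(handle):
    """Count peaks, average their width, and report the strongest peak by signalValue."""
    n_peaks = 0
    total_width = 0
    top_peak = None
    for line in handle:
        # Skip blank lines and optional header/track lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        signal = float(fields[6])  # column 7: signalValue
        n_peaks += 1
        total_width += end - start
        if top_peak is None or signal > top_peak[3]:
            top_peak = (chrom, start, end, signal)
    mean_width = total_width / n_peaks if n_peaks else 0.0
    return {"n_peaks": n_peaks, "mean_width": mean_width, "top_peak": top_peak}


# Two made-up records standing in for a real peak file.
example = io.StringIO(
    "chr1\t100\t300\tpeak_1\t250\t.\t12.5\t30.1\t27.8\t90\n"
    "chr2\t1000\t1400\tpeak_2\t800\t.\t45.0\t88.2\t84.9\t210\n"
)
summary = summarize_narrowpeak(example)
print(summary)
```

For a real file, pass an open file handle instead of the `io.StringIO` example, for instance `open("sample_peaks.narrowPeak")` with a path of your own.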
- GPS2–ATF4 co-binding during adipocyte differentiation
- CTCF motif enrichment flanking GPS2 peaks
- Condition-specific GPS2 occupancy shifts (day 0 vs day 6)
- mtRNA-binding sites overlapping GPS2/NCOR peaks (T263 endothelial cells)
- Cardamone et al., Mol Cell, 2018 — GPS2 retrograde signaling mechanism
- Chen et al., Cell Biol Toxicol, 2022 — ATF4–CTCF cooperation in adipogenesis
- iMARGI Pipeline — Yan Lab Documentation