GeoFetcheroo is a command‑line utility that bridges GEO/dbGaP and SRA: given a study accession, it reports the matching SRA experiments (SRX), runs (SRR), BioSample IDs and sample titles.
- One‑step workflow: From a GEO Series (GSE) to SRR runs and human‑readable sample names in a single script.
- No manual scraping: Automates FTP, HTML or CSV parsing by using R’s GEOquery and NCBI’s EDirect tools.
- Tab‑delimited output: Easy to import into spreadsheets, R/Python data frames or downstream pipelines.
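GeoFetcheroo's internals are not shown here, so the following is only a sketch of how an SRX-to-SRR lookup could be done with EDirect. It assumes the `runinfo` CSV that `efetch` returns for the `sra` database, with `Run`, `Experiment` and `BioSample` columns and no quoted commas inside fields. The helper name `runinfo_to_tsv` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read an SRA "runinfo" CSV on stdin and print
# Experiment (SRX), Run (SRR) and BioSample as tab-separated columns.
# Columns are located by header name; assumes no quoted commas in fields.
runinfo_to_tsv() {
  awk -F',' '
    BEGIN { OFS = "\t" }
    # First line is the header: remember the position of each column name.
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    # Skip blank lines and any repeated header lines.
    $col["Run"] != "" && $col["Run"] != "Run" {
      print $col["Experiment"], $col["Run"], $col["BioSample"]
    }
  '
}

# Example against NCBI (requires network access and EDirect on PATH):
#   esearch -db sra -query SRX13652506 | efetch -format runinfo | runinfo_to_tsv
```

Looking columns up by name rather than by position keeps the sketch working if NCBI reorders the runinfo fields.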
Requirements:
- R (≥ 4.0)
- esearch, part of NCBI’s EDirect; install with:
sh -c "$(curl -fsSL https://ftp.ncbi.nlm.nih.gov/entrez/entrezdirect/install-edirect.sh)"
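- GEOquery R package (distributed through Bioconductor, not CRAN); install from an R session with:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("GEOquery")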
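Before running the script, it can help to confirm that the command-line requirements are actually on your PATH. The snippet below is a minimal sketch, not part of GeoFetcheroo; the function name `missing_tools` is hypothetical, and checking for `Rscript` assumes the script calls R that way.

```shell
#!/usr/bin/env bash
# Print the name of every tool given as an argument that is not on PATH.
# Prints nothing if all tools are found.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Assumed tool list: Rscript (ships with R) and esearch (from EDirect).
missing=$(missing_tools Rscript esearch)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing >&2
else
  echo "All required tools found."
fi
```

This only checks the executables; whether the GEOquery package is installed still has to be verified from within R.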
Clone the repo:
git clone https://github.com/jssprrt/GeoFetcheroo.git
cd GeoFetcheroo
chmod +x GeoFetcheroo.sh
Execute:
./GeoFetcheroo.sh GSE193201
Output:
Found GSM IDs: GSM5776310 GSM5776311 GSM5776312 GSM5776313 GSM5776314 GSM5776315 GSM5776322 GSM5776323 GSM5776324 GSM5776325 GSM5776326 GSM5776327 GSM5776328 GSM5776329 GSM5776330 GSM5776331 GSM5776332 GSM5776333 GSM5942920 GSM5942921 GSM5942922
Derived SRX IDs: SRX13652506 SRX13652507 SRX13652508 SRX13652509 SRX13652510 SRX13652511 SRX13652518 SRX13652519 SRX13652520 SRX13652521 SRX13652522 SRX13652523 SRX13652524 SRX13652525 SRX13652526 SRX13652527 SRX13652528 SRX13652529 SRX14421485 SRX14421486 SRX14421487
GSE SRX SRR BioSample BioSampleTitle
GSE193201 SRX13652506 SRR17482016 SAMN24719625 CFG2295 PM18_D28_5F_RNAseq
GSE193201 SRX13652507 SRR17482015 SAMN24719624 CFG2296 PM18_D28_5F-O_RNAseq
GSE193201 SRX13652508 SRR17482014 SAMN24719623 CFG2334 PM18_D28_5F-K_RNAseq
GSE193201 SRX13652509 SRR17482013 SAMN24719622 CFG2336 PM10_D28_5F_RNAseq
GSE193201 SRX13652510 SRR17482012 SAMN24719621 CFG2337 PM10_D28_5F-K_RNAseq
GSE193201 SRX13652511 SRR17482011 SAMN24719620 CFG2338 PM10_D28_5F-O_RNAseq
GSE193201 SRX13652518 SRR17482004 SAMN24719613 CFG2351 PM18_D28_5F_PEF_RRBS
GSE193201 SRX13652519 SRR17482003 SAMN24719612 CFG2352 PM18_D28_5F-O_PEF_RRBS
GSE193201 SRX13652520 SRR17482002 SAMN24719611 CFG2353 PM18_D28_5F-K_PEF_RRBS
GSE193201 SRX13652521 SRR17482001 SAMN24719610 CFG2354 PM10_D28_5F_PEF_RRBS
GSE193201 SRX13652522 SRR17482000 SAMN24719609 CFG2355 PM10_D28_5F-O_PEF_RRBS
GSE193201 SRX13652523 SRR17481999 SAMN24719608 CFG2356 PM10_D28_5F-K_PEF_RRBS
GSE193201 SRX13652524 SRR17481998 SAMN24719607 CFG2351-seq2 PM18_D28_5F_PEF_RRBS
GSE193201 SRX13652525 SRR17481997 SAMN24719606 CFG2352-seq2 PM18_D28_5F-O_PEF_RRBS
GSE193201 SRX13652526 SRR17481996 SAMN24719605 CFG2353-seq2 PM18_D28_5F-K_PEF_RRBS
GSE193201 SRX13652527 SRR17481995 SAMN24719604 CFG2354-seq2 PM10_D28_5F_PEF_RRBS
GSE193201 SRX13652528 SRR17481994 SAMN24719603 CFG2355-seq2 PM10_D28_5F-O_PEF_RRBS
GSE193201 SRX13652529 SRR17481993 SAMN24719602 CFG2356-seq2 PM10_D28_5F-K_PEF_RRBS
GSE193201 SRX14421485 SRR18283673 SAMN26545672 PM9_D28_5F_RNAseq
GSE193201 SRX14421486 SRR18283672 SAMN26545671 PM9_D28_5F-O_RNAseq
GSE193201 SRX14421487 SRR18283671 SAMN26545670 PM9_D28_5F-K_RNAseq
Running for dbGaP:
./GeoFetcheroo.sh --dbgap phs001431
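Because the result table is tab-delimited, the run accessions can be pulled out with standard tools for use in a downstream pipeline. The sketch below assumes the fields are separated by tabs and that the SRR accession is the third column, as in the GSE193201 example output; the helper name `srr_ids` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SRR column from GeoFetcheroo-style output on stdin.
# Assumes tab-separated fields with the run accession in column 3. Status lines
# and the header are skipped because their third field is not an SRR accession.
srr_ids() {
  awk -F'\t' '$3 ~ /^SRR[0-9]+$/ { print $3 }'
}

# Header plus two rows taken from the GSE193201 example, joined with tabs.
printf '%s\t%s\t%s\t%s\t%s\n' \
  GSE SRX SRR BioSample BioSampleTitle \
  GSE193201 SRX14421485 SRR18283673 SAMN26545672 PM9_D28_5F_RNAseq \
  GSE193201 SRX14421486 SRR18283672 SAMN26545671 PM9_D28_5F-O_RNAseq |
  srr_ids
# prints:
# SRR18283673
# SRR18283672
```

In practice the same function could be fed directly from the script, for example `./GeoFetcheroo.sh GSE193201 | srr_ids`, giving one accession per line.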