Bioinformatics & protein-engineering engineer. I build reproducible pipelines for sequencing data and design loops for protein discovery: Nextflow on the wet-lab side, ESM-family models on the design side.
🧬 nf-rnaseq-quant
Bulk RNA-seq transcript quantification in Nextflow DSL2: FastQC + fastp + Salmon + MultiQC. Docker / Singularity / Conda / Slurm profiles, end-to-end CI on a yeast test dataset.
🧪 nf-variant-calling
Germline short-variant calling in Nextflow DSL2: BWA-MEM2 → samtools markdup → bcftools mpileup/call → MultiQC. Tunable ploidy (diploid default, haploid for bacterial/viral), per-process resource tiers, end-to-end CI on a SARS-CoV-2 mini-dataset.
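As a rough illustration of the tunable-ploidy step, the sketch below builds a `bcftools mpileup | bcftools call` command pair in Python. The function names, file names, and exact flag set are assumptions for illustration, not necessarily what nf-variant-calling runs; `--ploidy 1` and `--ploidy 2` are, as far as I know, predefined bcftools aliases, and `bcftools call --ploidy list` should confirm them on your install.

```python
import shlex
from typing import List


def bcftools_commands(ref: str, bam: str, out_vcf: str, ploidy: int = 2) -> List[List[str]]:
    """Build argument lists for mpileup and call (hypothetical helper).

    ploidy=2 is the diploid default; ploidy=1 suits haploid bacterial/viral genomes.
    """
    if ploidy not in (1, 2):
        raise ValueError("this sketch only handles ploidy 1 or 2")
    # Pileup against the reference, streamed as uncompressed BCF.
    mpileup = ["bcftools", "mpileup", "-f", ref, "-Ou", bam]
    # Multiallelic caller, variant sites only, bgzipped VCF output.
    call = ["bcftools", "call", "-m", "-v", "--ploidy", str(ploidy), "-Oz", "-o", out_vcf]
    return [mpileup, call]


def render_pipeline(commands: List[List[str]]) -> str:
    """Join the commands into a shell pipeline string for logging or a script block."""
    return " | ".join(shlex.join(cmd) for cmd in commands)


if __name__ == "__main__":
    # Haploid call, e.g. for a viral sample; the paths are placeholders.
    print(render_pipeline(bcftools_commands("ref.fa", "sample.bam", "sample.vcf.gz", ploidy=1)))
```

The helpers only build and render the commands, so they can be checked without bcftools installed.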
🔬 esm-design
Protein sequence design loop in Python: single-point variant generation, ESMC pseudo-log-likelihood scoring, ESMFold structure prediction + pLDDT ranking. Library + esm-design CLI; CI across Python 3.10–3.12.
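The first two stages of a loop like this can be sketched in plain Python. This is a minimal illustration, not the esm-design API: `single_point_variants`, `pseudo_log_likelihood`, `rank_variants`, and the `MaskedLogProbs` callback are hypothetical names, and a uniform toy model stands in for ESMC so the example runs without any model weights. Pseudo-log-likelihood here means masking each position in turn and summing the log-probability the model assigns to the true residue.

```python
import math
from typing import Callable, Dict, Iterator, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues

# Hypothetical scorer interface: (sequence, masked position) -> log-prob per residue.
MaskedLogProbs = Callable[[str, int], Dict[str, float]]


def single_point_variants(seq: str, alphabet: str = AMINO_ACIDS) -> Iterator[Tuple[str, str]]:
    """Yield (label, variant) for every single substitution, e.g. ("M1A", "AKT")."""
    for i, wild_type in enumerate(seq):
        for aa in alphabet:
            if aa != wild_type:
                yield f"{wild_type}{i + 1}{aa}", seq[:i] + aa + seq[i + 1:]


def pseudo_log_likelihood(seq: str, masked_log_probs: MaskedLogProbs) -> float:
    """Sum over positions of log p(true residue | sequence with that position masked)."""
    return sum(masked_log_probs(seq, i)[seq[i]] for i in range(len(seq)))


def rank_variants(seq: str, masked_log_probs: MaskedLogProbs, top_k: int = 5) -> List[Tuple[str, str, float]]:
    """Score every single-point variant and return the top_k by pseudo-log-likelihood."""
    scored = [
        (label, variant, pseudo_log_likelihood(variant, masked_log_probs))
        for label, variant in single_point_variants(seq)
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_k]


def uniform_model(seq: str, pos: int) -> Dict[str, float]:
    """Toy stand-in for a protein language model: every residue is equally likely."""
    log_p = -math.log(len(AMINO_ACIDS))
    return {aa: log_p for aa in AMINO_ACIDS}


if __name__ == "__main__":
    for label, variant, score in rank_variants("MKT", uniform_model, top_k=3):
        print(label, variant, round(score, 3))
```

Swapping `uniform_model` for a function that queries a real masked language model is the only change needed to get meaningful rankings; structure prediction and pLDDT filtering would follow as a separate stage.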
🧫 seq2func
De novo transcriptome assembly & protein annotation platform: a Nextflow pipeline backing a Next.js web app for browsing assemblies, ORFs, and functional annotations.
⚡ RSO
Modal app for protein-binder sequence design using the RSO method on top of ColabDesign: give it a target PDB, get back designed binder sequences.
Pipelines: Nextflow (DSL2), Docker, Singularity, Slurm, Modal, Conda
Bioinformatics: BWA-MEM2, samtools, bcftools, Salmon, fastp, FastQC, MultiQC, BLAST, TransDecoder
Protein / ML: ESMC, ESMFold, ColabDesign, BindCraft, PyTorch, transformers
Apps: Python (typed, tested), Next.js / TypeScript, REST + GraphQL
github.com/coreyhowe999 · open to bioinformatics / ML-for-bio roles
