  • Northeastern University

xX-its-amit-Xx/README.md

Amit Shenoy

Mission: Solve as many puzzles as possible.

Computational biologist working the seams between bench biology, structural biophysics, and machine learning at scale. B.S. in Computational Bioengineering from Northeastern (Dec 2025); currently pursuing a thesis-track M.S. in Bioengineering (expected May 2027).

Boston, MA • ashenoycompany@gmail.com • LinkedIn


What I'm building right now

🧬 Single-cell & multi-omics pipelines

  • scRNA-Flow — end-to-end scRNA-seq pipeline (Scanpy/AnnData): filtering, normalization, doublet detection, integration, clustering, marker gene analysis.
  • ATAC-QC — chromatin accessibility QC + peak analysis with TSS enrichment, fragment-size distributions, nucleosome banding diagnostics.
  • OmicsQC — sequencing QC across bulk and single-cell modalities; detects low-quality cells, ambient RNA contamination, batch effects.

💊 ML for drug discovery

  • OpenADMET-PXR — solo entry in the Hugging Face OpenADMET competition (top 100). 35+ modeling approaches for pEC50 prediction on Pregnane X Receptor, including ChemBERTa fine-tunes, ECFP+LightGBM, and hybrid transformer architectures fusing protein and compound embeddings. Currently exploring activity-cliff-aware pretraining to sensitize representations to small structural changes with large activity shifts.
  • ARES + BayesBio — Bayesian inference + ML pipelines for compound prioritization with calibrated uncertainty.

🎮 Outside the lab

  • OrbitSwap (Just Another Studios) — gravity-puzzle game shipped on Android. Kotlin, original mechanic design.

Where I've worked

| Where | When | What |
| --- | --- | --- |
| UCB Biosciences (Computational Data Science Co-op) | 2025 | Multimodal drug-discovery data; UMAP + SHAP-driven artifact detection; ML compound prioritization. |
| COMBINE Lab, Northeastern (Undergrad Researcher, Dr. Mona Minkara) | 2023–2025 | Reproducible HPC pipelines on SLURM; molecular modeling; 15+ technical talks; 2 ongoing manuscripts. |
| Arbor Biotechnologies (Gene Therapy Co-op) | Jan–Jun 2024 | DOE-driven AAV production optimization; >250% productivity improvement. |

What I'm interested in

My emerging research thesis: knowledge graphs are the right substrate for integrating multi-omics. Clinical, molecular, phenomic, metabolomic, pathway, and simulation data each carry the experimental context that produced them. AI agents traversing those graphs can surface connections that no single human can hold in their head — which matters most in domains like immunology, where heterogeneity, tissue context, and patient variability make the problem combinatorial.

If you're working on anything in that direction, I'd love to hear from you.


Conference presentations

  • AAAS 2025 (American Association for the Advancement of Science) — presenter
  • RISE Northeastern — 2024, 2025
  • MBN Biophysics Conference — 2024
  • Source Fair, Northeastern — presenter

Tech stack

Python (pandas, numpy, scipy, scikit-learn, Scanpy, AnnData, PyTorch) • R • SQL • Bash/SLURM • C++ • Kotlin • Docker • Conda • Jupyter • Cursor • Claude Code • Schrödinger Maestro/Glide/Prime • RDKit • Benchling


"The interesting questions in biology live in the gaps between disciplines."

Popular repositories

  1. amit-sh (Public)

    ~/amit.sh — personal hub. built with care, caffeine, and an unreasonable number of terminal prompts.

    JavaScript 1

  2. OmicsQC (Public)

    OmicsQC processes FASTQ and FASTA files and generates basic QC summaries, plots, and a clean HTML or Markdown report.

    Python 1

  3. scRNA-flow (Public)

    Analyze a public single-cell RNA-seq dataset from raw or preprocessed count matrices, perform quality control, normalization, dimensionality reduction, clustering, marker gene detection, and genera…

    Python 1

  4. ATAC-QC (Public)

    Processes ATAC-seq analysis outputs and generates interpretable QC metrics, plots, and reports. A clear educational and practical toolkit that wraps common analysis steps and explains QC.

    Python 1

  5. BayesBio (Public)

    Infers biological model parameters from noisy experimental data using Bayesian statistics, Markov Chain Monte Carlo, and clear visual diagnostics.

    Python 1

  6. pxr-effector-uncoupling (Public)

    PXR is a high-value target that cannot be drugged systemically: agonists are therapeutic in IBD and MASH but toxic in the liver via CYP3A4 induction. This project quantifies, across human cell types, which c…

    HTML 1