Glittr - a large searchable community database of git repositories with bioinformatics training material
DReSA - Digital Research Skills Australasia - an active database of training material and national workshop events
In the wet lab, you might need to learn to pipette, use a centrifuge, or maybe run some gel electrophoresis before you can get useful results. In the dry lab, you need to learn to use a computer to automate tasks and analyse data before you can get useful results - often this means telling the computer what to do using plain text.
- https://linuxjourney.com/
- https://sandbox.bio/ - interactive commandline tutorials for bioinformatics
- https://www.edx.org/course/introduction-linux-linuxfoundationx-lfs101x-1
- http://andrewjrobinson.github.io/training_docs/tutorials/unix/
- http://andrewjrobinson.github.io/training_docs/tutorials/hpc/
- https://learngitbranching.js.org/ - a visual, interactive tutorial to help you understand Git
- Introduction to R is our recommended starting point.
- Programming and Tidy Data analysis in R covers automating tasks such as loading a large set of files, as well as data wrangling and how "tidy" data streamlines visualization and data analysis.
- Linear models in R covers many common statistical tasks in a unified way using "linear models". This will also be useful background knowledge for RNA-Seq analysis, especially with complex experimental designs.
- Introduction to R Shiny covers presentation of your data interactively.
- Working with DNA sequences and features in R with Bioconductor
- Introduction to R - Tidyverse, a workshop developed at WEHI.
- The R for Data Science book is a popular book covering the tidy approach.
- Posit Cloud to try R and RStudio online.
- R programming - Coursera: https://www.coursera.org/course/rprog
- Data Carpentry R lessons: https://datacarpentry.org/R-ecology-lesson/ and https://datacarpentry.org/genomics-r-intro/
- It's often useful to generate reports including code and outputs, either using RMarkdown or the newer Quarto system.
- StatsTest (Wayback Machine archive) - which statistical test should you use? (Paul's comment: Don't be overwhelmed, learn to use linear models. Linear models provide a systematic way of thinking about the factors and variables in your experiment, and how they translate into a statistical model and tests. Then, if the assumptions for a linear model aren't quite met, consider one of the specialized methods listed on this page.)
- End-to-end visualisation using ggplot2: https://rviews.rstudio.com/2017/08/14/end-to-end-visualization-using-ggplot2/
- HarvardX biomedical data science MOOC: http://genomicsclass.github.io/book/ (Chapter 5 has good examples of linear model design and contrasts - with diagrams)
- Monash Data Fluency Introduction to Python Workshop material
- ... uses some material from: Data Analysis and Visualization in Python for Ecologists
- The 'official' Python tutorial: https://docs.python.org/3/tutorial/
- http://rosalind.info/problems/list-view/?location=python-village
- http://andrewjrobinson.github.io/training_docs/tutorials/python_overview/python_overview/ - more of a quickstart for those comfortable with programming
- Intro to Python: http://introtopython.org/
- Introduction to Data Processing with Python: http://opentechschool.github.io/python-data-intro/
- Python for Everyone (Basic introductory material, through to object oriented programming, interaction with web services, databases, plotting) https://www.py4e.com/lessons
- BE/Bi 103 a: Introduction to Data Analysis in the Biological Sciences (Caltech) - Data Science for Biology, with statistics and visualisation in Python
- Programming in the Biological Sciences Bootcamp notes (Caltech) - a comprehensive introduction to Python programming, with some Pandas, Numpy, Scipy and git thrown in for good measure.
- Magic methods, context managers (`__enter__`, `__exit__` for 'with' and more!): https://web.archive.org/web/20161024123835/http://www.rafekettler.com/magicmethods.html
- @property decorators, Descriptors: https://web.archive.org/web/20150407105027/http://intermediatepythonista.com/classes-and-objects-ii-descriptors
- @staticmethod, @classmethod and @abc.abstractmethod: https://julien.danjou.info/blog/2013/guide-python-static-class-abstract-methods
- Metaclasses: http://eli.thegreenplace.net/2011/08/14/python-metaclasses-by-example and https://stackoverflow.com/questions/100003/what-is-a-metaclass-in-python
- Understanding scope, closures: https://www.farside.org.uk/201307/understanding_python_scope
- Introductory interactive visualization using Altair (University of Washington): https://uwdata.github.io/visualization-curriculum/intro.html
- PCA for Data Science: https://pca4ds.github.io/
- EMBL-ABR training videos: https://www.youtube.com/channel/UC5WlFNBSfmt3e8Js8o2fFqQ/videos
- Bioinformatics Workbook: https://bioinformaticsworkbook.org/#gsc.tab=0
- SequenceEng - a resource covering most sequencing applications and analysis pipelines: http://education.knoweng.org/sequenceng/
- Rosalind - problem solving exercises in computational biology to learn the fundamentals
- JHU Computational Genomics notebooks - in depth code examples to help understand how genome short read alignment and assembly works. Burrows-Wheeler Transforms and de Bruijn graphs.
- Data Carpentry Genomics Workshop - a great starting point for learning genomics on the commandline.
- Computational Genomics Tutorial - based on the Massey University Genome Science course taught by Sebastian Schmeier. A very clear tutorial series that covers installing and running tools for doing NGS read quality control, genome assembly and mapping, annotation, variant calling and interpretation. Light on theory, but a good starting point for working through the mechanics of genomics on the commandline.
- Sanger Pathogen Informatics Training - commandline tutorials covering various analyses of microbial pathogens. Structured as a series of notebooks to follow, starting here.
Some of these tutorials and guides start with raw FASTQ reads and work through to differential expression analysis. Others begin with the counts matrix.
- Sydney Informatics Hub RNAseq tutorial 2023 - starting from raw reads with nf-core/rnaseq, through to differential expression analysis with R & DESeq2, and functional enrichment.
- Introduction to differential gene expression analysis using RNA-seq (Dündar, Skrabanek, Zumbo @ Cornell) - a very nice RNA-seq overview and tutorial, from RNA extraction and experimental design to differential gene expression analysis, with Unix commandline and R exercises.
- RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR - Law et al, 2016 - a good practical tutorial for edgeR and limma, starting from a counts matrix.
- A guide to creating design matrices for gene expression experiments
- A big list of RNASeq links - nicely organized into sections like 'normalization' and 'batch effects': https://github.com/crazyhottommy/RNA-seq-analysis
- http://master.bioconductor.org/help/course-materials/2015/Uruguay2015/V6-RNASeq.html
- https://diytranscriptomics.com/ - largely video-based tutorial series for learning RNA-seq analysis using R.
- Harvard Chan core RNA-seq beginner and Salmon+DESeq2 courses
- Case study: using a Bioconductor R pipeline to analyze RNA-seq data
- RNA-seq tutorial from the Griffith Lab: https://github.com/griffithlab/rnaseq_tutorial/wiki
- http://www.ngscourse.org/Course_Materials/alignment/tutorial/example.html
- http://www.rnaseqforthenextgeneration.org/protocols/index.htm
- Orchestrating Single-Cell Analysis with Bioconductor covers the Bioconductor way of doing single-cell analysis, though many people prefer to use Seurat instead. Seurat also has many useful vignettes.
- Data processing and visualization for metagenomics - a Carpentries workshop in incubation. May have some rough edges, but it's already looking quite good.
- Orchestrating Microbiome Analysis covers the Bioconductor way of doing microbiome analysis.
Papers that could be of interest for functional enrichment analysis.
- Null hypothesis in GSEA: https://www.frontiersin.org/articles/10.3389/fgene.2020.00654/full
- Survey of ORA and FCS (including recommendations): http://ziemann-lab.net/public/kaumadi/manuscript.html
- Univariate and Multivariate FCS: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1790-4
- Multi-contrast and multi-omics FCS - Mitch R package: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-06856-9