README.Rmd
+11 -9 (11 additions & 9 deletions)
@@ -1,5 +1,5 @@
1
1
---
2
-
title: scglmmr *S*ample-level Single-*C*ell *G*eneralized *L*inear *M*ultilevel *M*odels in *R*
2
+
title: Sample-level Single Cell GLMMs in R
3
3
output: github_document
4
4
---
5
5
@@ -13,7 +13,9 @@ knitr::opts_chunk$set(
13
13
)
14
14
```
15
15
16
-
An R package for implementing mixed effects models on single cell data with complex experiment designs. The package is flexible and can accommodate many experiment designs. It was developed for analysis of multimodal single cell data from many individuals assayed pre- and post-perturbation, such as drug treatment, where each individual is nested within one or more response groups. The methods herein allow one to compare the difference in perturbation response effects between groups while modeling variation in donor expression. It also has many wrappers for downstream enrichment testing and visualization.
16
+
**This package is under active development**
17
+
18
+
An R package for implementing mixed effects modeling methods on single cell data that can accommodate many different complex experiment designs. The package is built around [lme4](https://www.jstatsoft.org/article/view/v067i01) and was originally made for analysis of single cell data collected from many individuals who are assayed pre- and post-perturbation, such as drug treatment, and who are nested within one or more response groups. The methods herein allow one to compare the difference in perturbation response effects between groups while modeling variation in donor expression. It also has many wrappers for downstream enrichment testing and visualization.
17
19
18
20
Please see vignettes.
19
21
@@ -26,13 +28,11 @@ library(scglmmr)
26
28
<img src="man/figures/scglmmr.overview.png" />
27
29
28
30
29
-
**With this type of experiment design, we can't just color umap plots and try to find the effects.** We need statistical models.
30
-
31
31
## Single cell within cluster perturbation response differential expression
32
32
33
-
The purpose of this software is to analyze single cell genomics data with pre and post perturbation measurements from the same individuals, including complex designs where individuals with repeated measurements are nested within multiple response groups. The focus is on implementing flexible generalized linear multilevel models to derive group (e.g. good or poor clinical outcome, high or low drug response) and treatment associated effects *within cell types* defined either by protein (e.g. with CITE-seq data) or transcriptome based clustering followed by downstream enrichment testing and visualization.
33
+
The purpose of this software is to analyze single cell genomics data with pre- and post-perturbation measurements from the same individuals, including complex designs with many subjects, where each subject has repeated measurements pre- and post-perturbation and is nested within one of several groups, such as different end point response correlates, e.g. high and low responders. The focus is on implementing flexible generalized linear multilevel models to derive group (e.g. good or poor clinical outcome, high or low drug response) and treatment associated effects *within cell types* defined either by protein (e.g. with CITE-seq data) or transcriptome based clustering, followed by downstream enrichment testing and visualization.
34
34
35
-
By default, the effect of treatment/perturbation across all subjects, the baseline differences between outcome groups, and the difference in the treatment effect between the outcome groups are tested. Any number of model covariates can be specified and by default the package uses a random intercept model to accommodate the non-independence of expression within each subject.
35
+
Any number of model covariates can be specified. The vignettes provide methods where a random intercept term for the donor ID of each cell or aggregated library is included in the model. These methods thus model the variation around the baseline expression across individuals, accommodating non-independence of expression for repeated timepoints from each subject.
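
As a rough illustration of the random intercept idea described above, the sketch below fits a model with lme4 directly on simulated data. It is not the scglmmr interface: the column names (`subject`, `group_time`, `expr`), the simulated effect sizes, and the Gaussian response are all assumptions made for this example.

```r
library(lme4)

set.seed(1)

# Hypothetical design: 12 donors, 6 per response group, each measured pre and post
subjects <- paste0("s", 1:12)
d <- expand.grid(subject = subjects, time = c("pre", "post"))
d$group <- ifelse(d$subject %in% subjects[1:6], "low", "high")
d$group_time <- factor(
  paste(d$group, d$time, sep = "_"),
  levels = c("low_pre", "low_post", "high_pre", "high_post")
)

# Simulated expression: donor-specific baseline shift plus a treatment effect
# that is larger in the "high" group (values are made up for illustration)
donor_shift <- setNames(rnorm(12, sd = 1), subjects)
post_effect <- ifelse(d$time == "post", ifelse(d$group == "high", 2, 1), 0)
d$expr <- 5 + donor_shift[as.character(d$subject)] + post_effect +
  rnorm(nrow(d), sd = 0.3)

# Random intercept per donor models variation in baseline expression
fit <- lmer(expr ~ 0 + group_time + (1 | subject), data = d)

# Contrast for the difference in treatment effect between groups:
# (high_post - high_pre) - (low_post - low_pre)
L <- c(low_pre = 1, low_post = -1, high_pre = -1, high_post = 1)
est <- sum(L * fixef(fit))
se <- sqrt(as.numeric(t(L) %*% as.matrix(vcov(fit)) %*% L))
c(estimate = est, std_error = se)
```

Coding the combined group and timepoint factor without an intercept gives one coefficient per group/timepoint cell, so the difference in treatment effect is a simple linear combination of the fixed effects.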
36
36
37
37
An overview of methods provided:
38
38
@@ -46,13 +46,15 @@ Test perturbation effect using a gene level Poisson mixed model.
46
46
Test perturbation effects and differences in perturbation responses between groups at the gene module level.
47
47
48
48
### 4. Downstream enrichment testing and visualization
49
-
Wrappers around methods from the [fast gene set enrichment analysis (fgsea)](https://www.biorxiv.org/content/10.1101/060012v2#disqus_thread) and [clusterProfiler](https://www.bioconductor.org/packages/release/bioc/html/clusterProfiler.html) R packages.
49
+
There are wrapper functions around multiple gene set enrichment methods, with emphasis on the [fast gene set enrichment analysis (fgsea) package](https://www.biorxiv.org/content/10.1101/060012v2#disqus_thread). The results from fgsea can then be further interrogated by methods for contrasting information content in genes driving enrichments within and between cell types. Multiple visualization wrappers are also provided.
50
50
51
51
### Philosophy
52
-
The scglmmr package considers each cluster/ cell type as a separate 'experiment' similar to fitting separate models to different FACS sorted leukocyte subsets followed by RNAseq. Fitting models separately to each subset provides maximum flexibility and avoids issues with e.g. modeling mean variance trends or count distributions for cell type specific genes in subsets that do not express the gene while still enabling comparison of, for example, coherent perturbation effects for the same gene across individuals between different cell clusters. This approach is particularly well suited for CITE-seq data with cells clustered based on normalized protein expression levels. Typically our workflow consists of denoising ADT data using our method [dsb](https://github.com/niaid/dsb) followed by modeling the group level perturbation response effects using scglmmr.
52
+
This package models expression within each cluster / cell type independently in order to capture perturbation effects of cell type specific genes as well as genes that are expressed by multiple cell types. Using a normal distribution on count data requires first modeling the mean variance trend [(see Law et al)](https://genomebiology.biomedcentral.com/articles/10.1186/gb-2014-15-2-r29), which in turn requires filtering features (genes) that are not expressed by a given cell type. These cell type specific transcripts are therefore tested for perturbation effects only within the cell types that express them, instead of across all cell types. Genes that are shared across cell types can be compared for coherent perturbation effects across all subjects or between different groups of subjects using contrast coding.
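
The per cell type feature filtering mentioned above could look something like the following base R sketch. The count matrix is simulated, and the thresholds (at least 5 counts in at least 4 samples) are arbitrary assumptions for illustration, not the package's defaults.

```r
set.seed(2)

# Hypothetical aggregated counts for ONE cell type: 5 genes x 8 samples.
# Genes 1 and 4 are simulated as essentially not expressed in this cell type.
gene_means <- c(0.1, 20, 50, 0.2, 10)
counts <- matrix(
  rpois(5 * 8, lambda = rep(gene_means, times = 8)),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("sample", 1:8))
)

# Keep genes with at least `min_count` counts in at least `min_samples` samples
min_count <- 5
min_samples <- 4
keep <- rowSums(counts >= min_count) >= min_samples
filtered <- counts[keep, , drop = FALSE]

rownames(filtered)
```

Running the filter separately for each cell type means a gene dropped in one cell type can still be tested in another cell type that expresses it.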
53
+
54
+
This approach is particularly well suited for multimodal single cell data where cells are clustered based on information that is independent of the perturbation effects. For example, we have used methods in this package for CITE-seq data, where we first denoise ADT data using our method [dsb](https://github.com/niaid/dsb) and then model the transcriptome, not for differences between cell types, but for the group level perturbation response effects using this package.
53
55
54
56
**Experiment designs (within each cluster / celltype) supported by scglmmr**
55
-
Below is a 2 group repeated measures experiment. This data can be accommodated by scglmmr. Simpler experiment designs are also supported, for example data with 2 groups but not repeated pre/post treatment measurements.
57
+
Below is a 2 group repeated measures experiment. This data can be accommodated by scglmmr. Simpler experiment designs are also supported, for example 2 groups with one timepoint, as are more complex experiments, for example 3 timepoints.