Commit 9081089

Update version to 2.1.1 and enhance scale_abundance function for large datasets
- Bumped package version to 2.1.1.
- Introduced memory-efficient chunked processing in scale_abundance with the new chunk_sample_size parameter.
- Enabled parallel computation support via BiocParallel, improving performance for large-scale analyses.
- Ensured identical results between chunked and non-chunked processing for reproducibility.
- Updated documentation with usage examples and key improvements for better user guidance.
1 parent 6619e69 commit 9081089

1 file changed, +85 -62 lines changed

inst/NEWS.rd

Lines changed: 85 additions & 62 deletions
@@ -1,71 +1,30 @@
 \name{NEWS}
 \title{News for Package \pkg{tidybulk}}

-\section{Changes in version 1.2.0, Bioconductor 3.12 Release}{
-\itemize{
-\item Make gene filtering functionality `identify_abundance` explicit; a warning is given if this has not been performed before the majority of workflow steps (e.g. `test_differential_abundance`).
-\item Add automatic bibliography `get_bibliography`.
-\item Add DESeq2 and limma-voom to the methods for `test_differential_abundance` (method="DESeq2").
-\item Add prefix to test_differential_abundance for multi-methods analyses.
-\item Add other cell-type signature to `deconvolve_cellularity`.
-\item Add differential cellularity analyses `test_differential_cellularity`.
-\item Add gene description annotation utility `describe_transcript`.
-\item Add `nest` functionality for functional-coding transcriptomic analyses.
-\item Add gene overrepresentation functionality `test_gene_overrepresentation`.
-\item Add GitHub website.
-\item Speed up data frame validation.
-\item Several bug fixes.
-}}
-
-\section{Changes in version 1.3.2, Bioconductor 3.13 Release}{
+\section{Changes in version 2.1.1, Bioconductor 3.23 Devel}{
 \itemize{
-\item Tidybulk now operates natively with the SummarizedExperiment data container, in a seamless way thanks to tidySummarizedExperiment 10.18129/B9.bioc.tidySummarizedExperiment
-\item Added robust edgeR as it outperforms many other methods as shown here doi.org/10.1093/nargab/lqab005
-\item Added test stratification cellularity, to easily calculate Kaplan-Meier curves
-\item Production of SummarizedExperiment from BAM or SAM files
-\item Added treat method to edgeR and voom differential transcription tests doi.org/10.1093/bioinformatics/btp053
-\item Added the method as_SummarizedExperiment
-\item Vastly improved test_gene_enrichment
-\item Added test_gene_rank, based on GSEA
-\item Several bug fixes.
+\item \strong{Major enhancement to scale_abundance for large-scale and HDF5-backed datasets:} Added memory-efficient chunked processing with parallel computation support via the new \code{chunk_sample_size} parameter. This enables TMM normalization of massive datasets (millions of cells, thousands of samples) that previously exceeded memory limits.
+
+\item \strong{Key improvements:}
+\itemize{
+\item \strong{Memory efficiency:} Process datasets in sample chunks to dramatically reduce memory footprint, enabling analysis of HDF5-backed SummarizedExperiment objects without loading entire matrices into RAM
+\item \strong{Parallel processing:} Leverage BiocParallel for multi-core chunk processing with automatic progress tracking and informative messages about parallelization status
+\item \strong{Identical results:} Chunked and non-chunked processing produce mathematically identical scaled values when using the same reference sample, ensuring reproducibility
+\item \strong{DelayedArray preservation:} Automatically detects and preserves DelayedArray format using efficient sweep operations, maintaining memory benefits throughout the pipeline
+\item \strong{Backward compatible:} Default behavior unchanged (\code{chunk_sample_size = Inf}); existing code continues to work without modification
+}
+
+\item \strong{Usage examples:}
+\itemize{
+\item Standard usage (no chunking): \code{se |> scale_abundance()}
+\item Memory-efficient chunking: \code{se |> scale_abundance(chunk_sample_size = 50)}
+\item With parallel processing: \code{BiocParallel::register(BiocParallel::MulticoreParam(workers = 8)); se |> scale_abundance(chunk_sample_size = 200)}
+}
+
+\item \strong{Performance potential:} Enables analysis of previously intractable datasets, with linear memory scaling and near-linear speedup with additional CPU cores. Particularly beneficial for single-cell pseudobulk analyses, large cohort studies, and cloud computing environments with memory constraints.
 }}

-\section{Changes in version 1.5.5, Bioconductor 3.14 Release}{
-\itemize{
-\item Added user-defined gene set for gene rank test
-\item Sped up aggregate_transcripts for large scale tibbles or SummarizedExperiment objects
-\item Allow passing additional arguments to DESeq2 method in test_differential_abundance
-\item Allow scale_abundance to run with a user-defined subset of genes (e.g. housekeeping genes)
-\item Add UMAP to reduce_dimensions()
-\item Several minor fixes, optimisations and documentation improvements
-}}
-
-\section{Changes in version 1.7.3, Bioconductor 3.15 Release}{
-\itemize{
-\item Improve imputation and other features for sparse counts
-\item Cibersort deconvolution, check 0 counts
-\item Improve missing abundance with force scaling
-\item Other small fixes and messaging
-}}
-
-\section{Changes in version 1.7.4, Bioconductor 3.16 Dev}{
-\itemize{
-\item Improved deconvolution robustness for SummarizedExperiment, edge cases
-\item Allow mapping of tidybulk_SAM_BAM to non-human genomes
-\item Adopt the vocabulary .feature, .sample, for conversion between SummarizedExperiment and tibble, similarly to tidySummarizedExperiment
-\item Deprecate .contrasts argument in favour of contrasts (with no dot)
-\item Make aggregate_duplicates more robust for tibble and SummarizedExperiment inputs
-\item Deprecate log_transform argument for all methods for a more generic transform argument that accepts arbitrary functions
-}}
-
61-
\section{Changes in version 1.9.2, Bioconductor 3.16 Dev}{
62-
\itemize{
63-
\item Improve aggregate_duplicates for tibble and SummarizedExperiment
64-
\item Fix epic deconvolution when using DelayedMatrix
65-
\item Allow as_SummarizedExperiment with multiple columns identifiers for .sample and .feature
66-
}}
67-
-\section{Changes in version 2.0.0, Bioconductor 3.19 Release}{
+\section{Changes in version 2.0.0, Bioconductor 3.22 Release}{
 \itemize{
 \item Major refactoring to improve code maintainability and performance. This included the removal of all tbl methods in favor of SummarizedExperiment-based approaches.
 \item Replace deprecated pipe operator \%>\% with native |> operator for improved readability
@@ -103,3 +62,67 @@
 \item Remove deprecated warnings and redundant messages
 \item Several bug fixes and optimizations
 }}
+
+\section{Changes in version 1.9.2, Bioconductor 3.16 Dev}{
+\itemize{
+\item Improve aggregate_duplicates for tibble and SummarizedExperiment
+\item Fix epic deconvolution when using DelayedMatrix
+\item Allow as_SummarizedExperiment with multiple column identifiers for .sample and .feature
+}}
+
+\section{Changes in version 1.7.4, Bioconductor 3.16 Dev}{
+\itemize{
+\item Improved deconvolution robustness for SummarizedExperiment, edge cases
+\item Allow mapping of tidybulk_SAM_BAM to non-human genomes
+\item Adopt the vocabulary .feature, .sample, for conversion between SummarizedExperiment and tibble, similarly to tidySummarizedExperiment
+\item Deprecate .contrasts argument in favour of contrasts (with no dot)
+\item Make aggregate_duplicates more robust for tibble and SummarizedExperiment inputs
+\item Deprecate log_transform argument for all methods for a more generic transform argument that accepts arbitrary functions
+}}
+
+\section{Changes in version 1.7.3, Bioconductor 3.15 Release}{
+\itemize{
+\item Improve imputation and other features for sparse counts
+\item Cibersort deconvolution, check 0 counts
+\item Improve missing abundance with force scaling
+\item Other small fixes and messaging
+}}
+
+\section{Changes in version 1.5.5, Bioconductor 3.14 Release}{
+\itemize{
+\item Added user-defined gene set for gene rank test
+\item Sped up aggregate_transcripts for large scale tibbles or SummarizedExperiment objects
+\item Allow passing additional arguments to DESeq2 method in test_differential_abundance
+\item Allow scale_abundance to run with a user-defined subset of genes (e.g. housekeeping genes)
+\item Add UMAP to reduce_dimensions()
+\item Several minor fixes, optimisations and documentation improvements
+}}
+
+\section{Changes in version 1.3.2, Bioconductor 3.13 Release}{
+\itemize{
+\item Tidybulk now operates natively with the SummarizedExperiment data container, in a seamless way thanks to tidySummarizedExperiment 10.18129/B9.bioc.tidySummarizedExperiment
+\item Added robust edgeR as it outperforms many other methods as shown here doi.org/10.1093/nargab/lqab005
+\item Added test stratification cellularity, to easily calculate Kaplan-Meier curves
+\item Production of SummarizedExperiment from BAM or SAM files
+\item Added treat method to edgeR and voom differential transcription tests doi.org/10.1093/bioinformatics/btp053
+\item Added the method as_SummarizedExperiment
+\item Vastly improved test_gene_enrichment
+\item Added test_gene_rank, based on GSEA
+\item Several bug fixes.
+}}
+
+\section{Changes in version 1.2.0, Bioconductor 3.12 Release}{
+\itemize{
+\item Make gene filtering functionality `identify_abundance` explicit; a warning is given if this has not been performed before the majority of workflow steps (e.g. `test_differential_abundance`).
+\item Add automatic bibliography `get_bibliography`.
+\item Add DESeq2 and limma-voom to the methods for `test_differential_abundance` (method="DESeq2").
+\item Add prefix to test_differential_abundance for multi-methods analyses.
+\item Add other cell-type signature to `deconvolve_cellularity`.
+\item Add differential cellularity analyses `test_differential_cellularity`.
+\item Add gene description annotation utility `describe_transcript`.
+\item Add `nest` functionality for functional-coding transcriptomic analyses.
+\item Add gene overrepresentation functionality `test_gene_overrepresentation`.
+\item Add GitHub website.
+\item Speed up data frame validation.
+\item Several bug fixes.
+}}
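
The 2.1.1 notes in this diff state that chunked and non-chunked processing give identical scaled values when the same reference sample is used. The base-R sketch below illustrates why that can hold: if each sample's factor depends only on that sample and a fixed reference, splitting the columns into chunks does not change any factor. It is an illustration under stated assumptions, not tidybulk's implementation: a simplified median-ratio factor stands in for TMM, the counts are simulated, and the helper names `scaling_factor` and `scale_block` are hypothetical.

```r
# Simulated counts: 100 genes x 12 samples (toy data, not from the package)
set.seed(1)
counts <- matrix(
  rpois(100 * 12, lambda = 50),
  nrow = 100,
  dimnames = list(paste0("gene", 1:100), paste0("s", 1:12))
)

# One fixed reference sample, shared by every chunk
reference <- counts[, "s1"]

# Simplified per-sample factor: median ratio of reference to sample.
# This is a stand-in for TMM, chosen only to keep the example dependency-free.
scaling_factor <- function(sample_counts, reference) {
  keep <- sample_counts > 0 & reference > 0
  median(reference[keep] / sample_counts[keep])
}

# Scale every column of a block by its own factor
scale_block <- function(block, reference) {
  factors <- apply(block, 2, scaling_factor, reference = reference)
  sweep(block, 2, factors, `*`)
}

# All samples at once
scaled_full <- scale_block(counts, reference)

# The same computation in chunks of 5 samples
chunk_size <- 5
sample_index <- seq_len(ncol(counts))
chunk_index <- unname(split(sample_index, ceiling(sample_index / chunk_size)))
scaled_chunked <- do.call(cbind, lapply(chunk_index, function(idx) {
  scale_block(counts[, idx, drop = FALSE], reference)
}))

# Compare the two results
all.equal(scaled_full, scaled_chunked)
```

The key design point is that the reference is chosen once, outside the chunk loop. If each chunk picked its own reference, the factors would generally differ between chunked and non-chunked runs.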
