-
Notifications
You must be signed in to change notification settings - Fork 18
Description
The ggalign package has been very beneficial for my actual work, but there's a confusion that I haven't found a good solution for.
In bioinformatics analysis, there is a type of heatmap. Its main body uses relative abundance for representation, while the side annotations display the absolute abundance of each gene to express richer gene abundance information. My question is: After clustering using align_dendro(), the column order of the main heatmap changes accordingly. At this point, how can one conveniently extract the changed column order and apply it to the absolute abundance plot?
First, the ComplexHeatmap package provides simple parameters to achieve this purpose, but I am not familiar with base R graphics grammar:
library(ComplexHeatmap)
set.seed(100)
expr_matrix <- matrix(sample(c(0,1), 400, replace = TRUE),
nrow = 20, ncol = 20)
rownames(expr_matrix) <- paste0("Gene", 1:20)
colnames(expr_matrix) <- paste0("Sample", 1:20)
# Calculate z-score
zscore_matrix <- t(scale(t(expr_matrix)))
# Calculate total expression per row
expr_counts <- rowSums(expr_matrix)
# Create heatmap
zscore_matrix |>
Heatmap(
name = " ",
rect_gp = gpar(col = "white", lwd = 2),
row_names_gp = gpar(fontsize = 8),
column_names_rot = 45,
column_names_gp = gpar(fontsize = 8),
cluster_rows = TRUE,
cluster_columns = TRUE,
show_row_dend = TRUE,
show_column_dend = TRUE,
right_annotation = rowAnnotation(
ExprCount = anno_barplot(
expr_counts,
bar_width = 0.8,
gp = gpar(fill = "steelblue"),
axis_param = list(
gp = gpar(fontsize = 6)
)
)
)
) -> htSecondly, I wrote my own set of code to apply to the ggalign workflow:
library(tidyverse)
library(ggalign)
get_gene_order <- function(
zscore_matrix,
method = "complete",
distance = "euclidean"
) {
dist_mat <- dist(zscore_matrix, method = distance)
hc_row <- hclust(dist_mat, method = method)
hc_row$labels[hc_row$order]
}
create_row_barplot <- function(
expr_matrix,
gene_order,
fill_color = "steelblue"
) {
Gene <- rownames(expr_matrix)
Count <- rowSums(expr_matrix)
tibble::tibble(Gene, Count) |>
mutate(Gene = forcats::fct_relevel(Gene, gene_order)) |>
ggplot(aes(x = Count, y = Gene)) +
geom_bar(stat = "identity", fill = fill_color, color = "black") +
theme_minimal() +
scale_x_continuous(labels = scales::label_number(accuracy = 1)) +
theme(
axis.ticks.y = element_blank(),
axis.text.y = element_blank(),
axis.text.x = element_text(angle = 90, hjust = 1),
axis.title = element_blank(),
panel.grid = element_blank(),
panel.border = element_rect(color = "black")
)
}
ggene_heatmap <- function(
expr_matrix,
bar_fill = "steelblue",
cluster_palette_row = "Set1",
cluster_palette_col = "Dark2",
dendro_size = 0.2,
tile_color = "black",
tile_width = 0.9,
tile_height = 0.9,
axis_text_x_angle = 45,
cluster_method = "complete",
cluster_distance = "euclidean"
) {
# Step 1: Calculate z-score
zscore_matrix <- t(scale(t(expr_matrix)))
# Step 2: Get row clustering order
gene_order <- get_gene_order(
zscore_matrix,
method = cluster_method,
distance = cluster_distance
)
# Step 3: Draw right-side barplot
p_bar <- create_row_barplot(expr_matrix, gene_order, fill_color = bar_fill)
# Step 4: Assemble ggalign heatmap
zscore_matrix |>
ggheatmap(filling = NULL) +
geom_tile(
aes(fill = value),
color = tile_color,
width = tile_width,
height = tile_height
) +
scale_fill_viridis_c(name = "") +
theme_minimal() +
theme(
axis.text.x = element_text(angle = axis_text_x_angle, hjust = 1),
axis.title = element_blank()
) +
# Top dendro
anno_top(size = dendro_size) +
align_dendro() +
scale_color_brewer(palette = cluster_palette_col, guide = "none") +
theme(
axis.title = element_blank(),
axis.ticks = element_blank(),
axis.text = element_blank()
) +
# Left dendro
anno_left(size = dendro_size) +
align_dendro() +
scale_color_brewer(palette = cluster_palette_row, guide = "none") +
theme(
axis.title = element_blank(),
axis.ticks = element_blank(),
axis.text = element_blank()
) +
# Right bar chart
quad_switch(position = "right", size = dendro_size) +
p_bar
}
ggene_heatmap(expr_matrix)Although the above code can run, I believe this is not the best solution for the ggalign workflow. Therefore, I sincerely ask if you have a method to achieve either of the following goals:
- Extract the row/column order of the heatmap after clustering.
- Precisely align the row/column order of two plots when their axis labels are completely identical.
I look forward to your reply!