TAD Consensus Analysis

Overview

This R package provides tools for generating consensus Topologically Associating Domains (TADs) from multiple prediction methods. TADs are fundamental units of chromatin organization that play crucial roles in gene regulation. While multiple computational tools exist to predict TAD boundaries from Hi-C data, their results often vary significantly. This package implements methods to integrate predictions from multiple tools and generate high-confidence consensus TAD sets.

Installation

# Install from GitHub
devtools::install_github("CSOgroup/consensusTADs", build_vignettes = TRUE)

Key Features

Generate consensus TADs from multiple prediction tools
Calculate Measure of Concordance (MoC) between TAD predictions
Select optimal non-overlapping TAD sets using dynamic programming
Apply iterative threshold approach for consensus building

Main Functions

`generate_tad_consensus()`

Creates consensus TADs through an iterative threshold approach that selects optimal non-overlapping TADs representing agreement across different prediction methods.

consensus_tads <- generate_tad_consensus(
  df_tools,      # Data frame with TAD predictions
  threshold = 0, # Minimum MoC threshold
  step = -0.05   # Step size for threshold iteration
)

`generate_tad_consensus_hierarchy()`

Generates hierarchical consensus TADs through multiple rounds of iteration. In each round, it identifies consensus TADs and removes partially overlapping regions from the input data for the next round.

hierarchical_tads <- generate_tad_consensus_hierarchy(
  df_tools,              # Data frame with TAD predictions
  threshold = 0,         # Minimum MoC threshold
  step = -0.05,          # Step size for threshold iteration
  max_round = NULL,      # Maximum number of rounds
  consider_level = TRUE  # Take tool levels (meta.tool_level) into account
)

`moc_score_filter()`

Calculates the Measure of Concordance (MoC) between TAD predictions and filters significant overlaps based on a threshold.

`select_global_optimal_tads()`

Implements a dynamic programming algorithm to select a set of non-overlapping TADs that maximize the total MoC score.
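To illustrate the idea, here is a minimal sketch of the classic weighted interval scheduling recurrence in base R. It is not the package's implementation: the function name `select_nonoverlapping()` and its arguments are hypothetical, and the sketch assumes intervals that merely touch (one ends where the next starts) do not count as overlapping.

```r
# Hypothetical helper (not part of consensusTADs): pick non-overlapping
# intervals whose scores sum to the maximum possible total.
select_nonoverlapping <- function(start, end, score) {
  n <- length(start)
  if (n == 0) return(integer(0))

  # Sort intervals by end coordinate
  ord <- order(end, start)
  s <- start[ord]
  e <- end[ord]
  w <- score[ord]

  # prev[i]: how many sorted intervals end at or before interval i starts
  prev <- findInterval(s, e)

  # best[i + 1]: best total score achievable using the first i sorted intervals
  best <- numeric(n + 1)
  for (i in seq_len(n)) {
    best[i + 1] <- max(best[i], w[i] + best[prev[i] + 1])
  }

  # Backtrack to recover which intervals were selected
  chosen <- integer(0)
  i <- n
  while (i > 0) {
    if (w[i] + best[prev[i] + 1] > best[i]) {
      chosen <- c(i, chosen)
      i <- prev[i]
    } else {
      i <- i - 1
    }
  }

  # Return positions in the original (unsorted) input
  ord[chosen]
}

# Four candidate intervals; the scores stand in for MoC values
selected <- select_nonoverlapping(
  start = c(1, 3, 6, 2),
  end   = c(4, 5, 9, 8),
  score = c(0.6, 0.9, 0.5, 1.2)
)
print(selected)  # intervals 2 and 3 (total score 1.4)
```

Sorting by end coordinate lets each interval look up its last compatible predecessor with a single `findInterval()` call, so the whole selection runs in O(n log n).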

Example Usage

# Prepare input data with predictions from multiple tools
tad_data <- data.frame(
  chr = rep("chr1", 6),
  start = c(10000, 20000, 50000, 12000, 22000, 48000),
  end = c(30000, 45000, 65000, 32000, 43000, 67000),
  meta.tool = c(rep("tool1", 3), rep("tool2", 3))
)

# Generate consensus TADs with default parameters
library(consensusTADs)
consensus_results <- generate_tad_consensus(tad_data)
print(consensus_results)

# Generate consensus TADs with custom threshold values
custom_consensus <- generate_tad_consensus(
  tad_data,
  threshold = 0.3,
  step = -0.1
)


# Enable parallel processing for large datasets
options(future.globals.maxSize = 10 * 1024^3)  # Allow up to 10 GiB of exported globals
future::plan(future::multisession, workers = 4)

# Work with tool levels
tad_data_with_level <- data.frame(
  chr = rep("chr1", 8),
  start = c(10000, 15000, 20000, 50000, 55000, 15000, 50000, 80000),
  end = c(30000, 35000, 45000, 70000, 75000, 35000, 70000, 100000),
  meta.tool = c("tool1", "tool1", "tool2", "tool3", "tool3", "tool2", "tool1", "tool4"),
  meta.tool_level = c("L1", "L2", NA, "L1", "L2", NA, "L2", NA)
)

result_hierarchy <- generate_tad_consensus_hierarchy(
  tad_data_with_level,
  max_round = NULL,
  consider_level = TRUE
)

# Work without tool levels
tad_data_without_level <- data.frame(
  chr = rep("chr1", 8),
  start = c(10000, 15000, 20000, 50000, 55000, 15000, 50000, 80000),
  end = c(30000, 35000, 45000, 70000, 75000, 35000, 70000, 100000),
  meta.tool = c("tool1", "tool1", "tool2", "tool3", "tool3", "tool2", "tool1", "tool4")
)

result_hierarchy <- generate_tad_consensus_hierarchy(
  tad_data_without_level,
  max_round = NULL
)

# Return to sequential processing
future::plan(future::sequential)

How It Works

The consensus generation process follows these steps:

1. Input validation: Check that the input contains data from multiple prediction tools
2. Data preparation: Split the input data by chromosome
3. Threshold sequence generation: Create a sequence of threshold values
4. Iterative TAD selection: For each chromosome and threshold, calculate MoC scores and select optimal TADs
5. Result compilation: Combine results from all chromosomes
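The first three steps can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: in particular, the sketch assumes the threshold sequence starts at 1 and moves toward `threshold` in increments of `step`, which the description above does not confirm.

```r
# Toy input in the same column layout as the examples above
tad_data <- data.frame(
  chr = c("chr1", "chr1", "chr2", "chr2"),
  start = c(10000, 12000, 5000, 6000),
  end = c(30000, 32000, 25000, 24000),
  meta.tool = c("tool1", "tool2", "tool1", "tool2")
)

# Input validation: a consensus needs predictions from at least two tools
stopifnot(length(unique(tad_data$meta.tool)) >= 2)

# Data preparation: one data frame per chromosome
by_chr <- split(tad_data, tad_data$chr)

# Threshold sequence generation
# ASSUMPTION: the sequence starts at 1 and decreases to `threshold`
threshold <- 0
step <- -0.05
thresholds <- seq(1, threshold, by = step)

# Iterative selection and result compilation are left to the package
# functions; the loop below only shows the shape of the iteration.
for (chr_name in names(by_chr)) {
  message(chr_name, ": ", nrow(by_chr[[chr_name]]), " TADs, ",
          length(thresholds), " thresholds")
}
```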

The Measure of Concordance (MoC) Score

The MoC score quantifies the agreement between two TAD predictions:

MoC = (intersection_width)² / (width1 × width2)

Where:

intersection_width is the length of the overlap between two TADs
width1 and width2 are the lengths of the two TADs being compared
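As a worked example, the formula can be written as a small base R function. The name `moc_score()` is hypothetical and not part of the package API, and the sketch assumes widths are computed as `end - start`; the package may use a different coordinate convention (for example, inclusive ranges as in GenomicRanges).

```r
# Hypothetical helper: MoC between two intervals, following the formula above.
# ASSUMPTION: widths are computed as end - start.
moc_score <- function(start1, end1, start2, end2) {
  # Overlap length, or 0 when the intervals do not intersect
  intersection_width <- max(0, min(end1, end2) - max(start1, start2))
  width1 <- end1 - start1
  width2 <- end2 - start2
  intersection_width^2 / (width1 * width2)
}

# First tool1 TAD (10000-30000) vs. first tool2 TAD (12000-32000)
# from the example data: 18000^2 / (20000 * 20000)
moc_score(10000, 30000, 12000, 32000)  # 0.81
```

Identical TADs score 1, disjoint TADs score 0, and partial overlaps fall in between.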

Dependencies

dplyr
GenomicRanges
IRanges
tibble
purrr
tidyr
stringr
magrittr
