How does IdentifiHR work?

The IdentifiHR R package provides several functions and can predict HR status in a single sample or across several samples. The model requires only a matrix or data frame of raw gene expression counts, with genes annotated with Ensembl, HGNC or Entrez identifiers.

The processCounts() function subsets the input matrix to the 2604 genes required for normalisation and then transforms counts to log2 counts-per-million (CPM) to normalise for library size differences. Genes are then scaled using a z-score, with the mean and standard deviation taken from our training cohort. Because these values come from the training cohort and not from the input samples, this scaling must be performed by the processCounts() function and not by an alternate z-score function written in R.
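To illustrate why a generic z-score function is not a substitute, the following is a minimal base R sketch of the general idea: log2 CPM followed by scaling with fixed, externally supplied means and standard deviations. It is not the implementation inside processCounts(). The helper names (logCpm, scaleWithTraining), the 0.5 prior count, the toy counts and the "training" values are all hypothetical and chosen only for illustration.

```r
# Toy raw counts, genes in rows and samples in columns (made-up values).
# matrix() fills column by column, so each line below is one sample.
counts <- matrix(
  c(10, 200, 30,    # sample s1
    25, 380, 90,    # sample s2
     4, 150, 12),   # sample s3
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2", "s3"))
)

# Hypothetical per-gene training parameters (NOT the values shipped with IdentifiHR).
trainMean <- c(geneA = 15.0, geneB = 19.0, geneC = 16.0)
trainSd   <- c(geneA = 1.0,  geneB = 0.5,  geneC = 2.0)

# log2 counts-per-million; the small prior count (an assumption) avoids log2(0).
logCpm <- function(counts, prior = 0.5) {
  libSize <- colSums(counts)
  # sweep() divides every column by that sample's library size
  log2(sweep(counts + prior, 2, libSize + 2 * prior, "/") * 1e6)
}

# z-score each gene with fixed training parameters, not the input cohort's own.
scaleWithTraining <- function(x, mu, sigma) {
  genes <- intersect(rownames(x), names(mu))  # keep only genes with parameters
  # a vector of length nrow(x) is recycled down each column, i.e. applied per gene
  (x[genes, , drop = FALSE] - mu[genes]) / sigma[genes]
}

zAll <- scaleWithTraining(logCpm(counts), trainMean, trainSd)

# A single sample gets the same values as when it is processed with others,
# because nothing is estimated from the input cohort.
zOne <- scaleWithTraining(logCpm(counts[, "s1", drop = FALSE]), trainMean, trainSd)
```

In this sketch, a sample's scaled values do not depend on which other samples are processed with it. A z-score computed from the input samples themselves (for example with base R's scale()) would change with cohort composition, which is why the training-derived scaling matters when predicting on a single sample.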

Processed counts can then be used by the predictHr() function to infer HR status from the expression of only 209 genes.

The output of IdentifiHR is a data frame containing a discrete prediction of HR status, either HR deficient ("HRD") or HR proficient ("HRP"), together with the probability that a sample is HRD.

Package overview:

[Figure: IdentifiHR package overview]

Please find the IdentifiHR manuscript here.

IdentifiHR is a predictive model of HR status in HGSC that uses only gene expression.
