A Python 3 package for operations on pedigree and genotype data, including simulation of genotype-phenotype associations under quantitative genetic models.
Requires: Python 3.4+, numpy, scipy, pandas, cython
In addition to basic pedigree data manipulation, pydigree also includes submodules for more complicated tasks:
- simulation: Provides classes for simulating genetic data
- stats: Classes and functions for statistical genetics
- mixedmodel: Provides classes for using mixed models with family data
- io: Provides functions for importing/exporting data from common data formats, including:
  - plink: Functions for working with plink format PED/MAP data
  - vcf: Functions for working with the VCF genotype format
  - genomesimla: Includes a function for reading genomeSIMLA format chromosome templates
- sgs: Functions for shared genomic segment data
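
To make the PED layout handled by the plink functions concrete, here is a minimal standalone sketch of parsing one PED line. It does not use pydigree's own reader: `PedRecord` and `parse_ped_line` are hypothetical names introduced for illustration, and the sketch assumes the standard whitespace-delimited PED layout (six leading columns followed by two alleles per marker).

```python
from typing import List, NamedTuple, Tuple


class PedRecord(NamedTuple):
    """One row of a PED file (hypothetical container, not a pydigree class)."""
    family_id: str
    individual_id: str
    father_id: str
    mother_id: str
    sex: str
    phenotype: str
    genotypes: List[Tuple[str, str]]


def parse_ped_line(line: str) -> PedRecord:
    """Split a whitespace-delimited PED line into pedigree fields and genotypes."""
    fields = line.split()
    n_allele_fields = len(fields) - 6
    if n_allele_fields < 0 or n_allele_fields % 2 != 0:
        raise ValueError("expected 6 leading columns plus an even number of alleles")
    alleles = fields[6:]
    # Pair consecutive allele columns: (a1, a2) for marker 1, then marker 2, ...
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    return PedRecord(*fields[:6], genotypes)


record = parse_ped_line("FAM1 3 1 2 1 2 A G C C")
print(record.individual_id, record.genotypes)
```

Marker names and positions are not part of the PED line; they come from the accompanying MAP file.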

pydigree provides the following classes:
- Individual: Models an individual with pedigree and phenotype data
- Population: Models groups of Individuals with a common genetic background
- Pedigree: A special case of Population for related individuals. Implements kinship/inbreeding functions
- PedigreeCollection: A container class handling multiple pedigrees
- ChromosomeTemplate: Models a chromosome with information on allele frequency and marker position
- ChromosomeSet: The set of ChromosomeTemplates for a population
- Alleles: Stores a haploid set of alleles
- SparseAlleles: Stores a haploid set of alleles as differences from a reference
- LabelledAlleles: An efficient container for storing references to a founder chromosome
- MixedModel: A class for fitting mixed-effect models with related individuals
- MLEResult: A class containing the maximum likelihood estimates of parameters and values pertaining to the likelihood function at the MLE
- Architecture: A class describing the genetic architecture for a trait to be used in simulation
- GeneDroppingSimulation: A base class from which other gene-drop simulation objects inherit
- NaiveGeneDroppingSimulation: Simulates genetic data for pedigrees by random gene dropping
- ConstrainedMendelianSimulation: Simulates genetic data for pedigrees from a prespecified inheritance structure
- SGSAnalysis: A class containing the result of a shared genomic segment (SGS) analysis
- SGS: A class containing the segments shared between a pair of individuals
- Segment: A class describing the location of a shared segment between a pair of individuals
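
The idea behind random gene dropping can be sketched in a few lines of plain Python. This is an illustration of the general technique at a single locus, not the implementation of NaiveGeneDroppingSimulation: the function `gene_drop` and the list-of-triples pedigree layout are assumptions made for this example. Each founder receives two unique allele labels, and each child draws one allele at random from each parent.

```python
import random


def gene_drop(pedigree, rng):
    """Drop founder alleles through a pedigree at a single locus.

    pedigree: list of (id, father_id, mother_id) triples, ordered so that
    parents appear before their children; founders have None for both parents.
    (This layout is an assumption of the sketch, not pydigree's data model.)
    Returns a dict mapping id -> (paternal_allele, maternal_allele).
    """
    genotypes = {}
    next_label = 0
    for ind, father, mother in pedigree:
        if father is None and mother is None:
            # Founders carry two distinct, uniquely labelled alleles.
            genotypes[ind] = (next_label, next_label + 1)
            next_label += 2
        else:
            # Mendelian transmission: one random allele from each parent.
            genotypes[ind] = (rng.choice(genotypes[father]),
                              rng.choice(genotypes[mother]))
    return genotypes


trio = [("dad", None, None), ("mom", None, None), ("kid", "dad", "mom")]
print(gene_drop(trio, random.Random(0)))
```

Because founder alleles are uniquely labelled, two individuals share an allele identical by descent at this locus exactly when they carry the same label, so repeating the drop many times gives a Monte Carlo estimate of the probability of an IBD configuration.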

pydigree defines the following exceptions:
- IterationError: Raised when an iterative algorithm exceeds the maximum allowed number of iterations
- NotMeaningfulError: Raised when a comparison does not make sense (e.g. is one genotype greater than the other)
- SimulationError: Raised when an error occurs in a simulation
- FileFormatError: Raised when an input file can't be parsed successfully

Pydigree includes a few useful scripts for dealing with pedigree data including:
- simulate_pedigree_data.py: Simulates data from template pedigrees
- bitsize.py: Calculates bit sizes for each pedigree
- kinship.py: Calculates inbreeding coefficients and pairwise kinship coefficients for pedigrees
- genedrop.py: Performs gene dropping simulations to approximate the actual probability of an IBD configuration
- polygenic.py: Calculates variance components for normally distributed continuous traits.

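
For readers unfamiliar with the quantities kinship.py reports, the following standalone sketch computes kinship and inbreeding coefficients with the standard textbook recursion. It is not the code used by kinship.py or by the Pedigree class: `make_kinship` and the list-of-triples pedigree layout are hypothetical, and founders are assumed to be unrelated and non-inbred.

```python
from functools import lru_cache


def make_kinship(pedigree):
    """Build a kinship function for a pedigree.

    pedigree: list of (id, father_id, mother_id) triples, ordered so that
    parents appear before their children; unknown parents are None.
    (Hypothetical layout for this sketch.)
    """
    parents = {ind: (father, mother) for ind, father, mother in pedigree}
    order = {ind: i for i, (ind, _, _) in enumerate(pedigree)}

    @lru_cache(maxsize=None)
    def kinship(a, b):
        if a is None or b is None:
            return 0.0  # unknown parents are treated as unrelated to everyone
        if a == b:
            father, mother = parents[a]
            # Self-kinship is (1 + F) / 2, where F is the kinship of the parents.
            return 0.5 * (1.0 + kinship(father, mother))
        # Recurse on the later-listed individual, who cannot be an ancestor
        # of the other one.
        if order[a] < order[b]:
            a, b = b, a
        father, mother = parents[a]
        return 0.5 * (kinship(father, b) + kinship(mother, b))

    return kinship


# Offspring of a full-sibling mating.
ped = [("gf", None, None), ("gm", None, None),
       ("s1", "gf", "gm"), ("s2", "gf", "gm"),
       ("child", "s1", "s2")]
kinship = make_kinship(ped)
print(kinship("s1", "s2"))  # kinship of the siblings = inbreeding of "child"
```

An individual's inbreeding coefficient is the kinship coefficient of its parents, so for the pedigree above the inbreeding coefficient of `child` is `kinship("s1", "s2")`.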
J.E. Hicks (2017) Pydigree: a python module for manipulation and simulation of genetic datasets. bioRxiv preprint doi:10.1101/213413
- Charles II of Spain: http://en.wikipedia.org/wiki/Charles_II_of_Spain