[Bioinformatics] PyCoM: a python library for large-scale analysis of residue–residue coevolution data. https://doi.org/10.1093/bioinformatics/btae166

cemiu/pycom_generator

PyCoM Generator

Pipeline for creating the PyCoM database; see:

TL;DR: for UniProtKB proteins:

  • Filter to manage size (Swiss-Prot only; sequence length ≤ 500)
  • Extract annotations (mjolnir db command)
  • Pipeline (mjolnir process command):
    • Sequence alignment with HH-suite (HHblits => HHfilter)
    • Predict Protein Residue-Residue Contacts / Coevolution matrices with CCMpred
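The steps above can be sketched roughly as follows. This is a minimal illustration, not the code behind the `mjolnir` commands: the helper names, file names, and the numeric values passed to `hhblits` (`-n 3`) and `hhfilter` (`-id 90`, `-cov 75`) are assumptions chosen for the example, and the commands are only built as argument lists, not executed. The A3M-to-plain-alignment conversion is included because CCMpred expects one aligned sequence per line.

```python
from pathlib import Path

MAX_LEN = 500  # sequence length cutoff stated in the list above


def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def keep_short(records, max_len=MAX_LEN):
    """Keep only records whose sequence length is at most max_len."""
    return [(h, s) for h, s in records if len(s) <= max_len]


def a3m_to_aln(a3m_text):
    """Drop insert columns (lowercase letters and '.') so that every row
    has the same length, one sequence per line."""
    rows = []
    for _, seq in read_fasta(a3m_text):
        rows.append("".join(c for c in seq if not c.islower() and c != "."))
    return "\n".join(rows) + "\n"


def build_commands(query, db, workdir):
    """Return argument lists for the three tools, in pipeline order.

    File names are hypothetical; the numeric option values are
    illustrative assumptions, not the settings used for PyCoM.
    """
    workdir = Path(workdir)
    stem = Path(query).stem
    a3m = workdir / f"{stem}.a3m"
    filtered = workdir / f"{stem}.filtered.a3m"
    aln = workdir / f"{stem}.aln"  # written by a3m_to_aln() before CCMpred runs
    mat = workdir / f"{stem}.mat"
    return [
        ["hhblits", "-i", str(query), "-d", str(db), "-oa3m", str(a3m), "-n", "3"],
        ["hhfilter", "-i", str(a3m), "-o", str(filtered), "-id", "90", "-cov", "75"],
        ["ccmpred", str(aln), str(mat)],
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where HH-suite and CCMpred are installed, with the filtered alignment converted by `a3m_to_aln` between the second and third step.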

This code was written to create the PyCoM database and to run on the Jade2 HPC at Oxford and the Young HPC at UCL.

The resulting project and database, PyCoM, can be found at https://pycom.brunel.ac.uk.

The companion repository, which contains the library for interacting with the database created by this pipeline, is at https://github.com/cemiu/pycom.

My work on this project was funded by the Department of Computer Science, Brunel University London.

DATABASE DOWNLOAD

https://pycom.brunel.ac.uk/

Info

Pipeline tools

Uniclust (for hh-suite)

Sequence alignment

Coevolution matrix
