PyCoM: A Protein Coevolution Database

PyCoM provides researchers and bioinformaticians with a database of 457,622 annotated proteins and their coevolution matrices, together with an intuitive Python API, a comprehensive library of analysis tools, and a REST API. The data are sourced from UniProtKB/Swiss-Prot and processed with HH-suite3 and CCMpred. PyCoM simplifies the complex task of protein coevolution analysis. Additionally, we host the PyCoM Alignment Repository, which contains pre-computed alignments for most proteins in Swiss-Prot.

Why PyCoM?

  • Ease of Use: Rapidly query, extract, and visualize protein annotation and coevolution data!

  • Comprehensive Tutorials: Quickly get started with detailed tutorials to maximise your research impact.

  • Large-Scale Analysis: Work with coevolution data that took 35 CPU-core years and 1 GPU year to compute.

Installation Made Easy:

Begin exploring immediately:

pip3 install git+https://github.com/scdantu/pycom

Example Usage:

Effortlessly query proteins linked to specific conditions, visualize coevolution matrices, and perform sophisticated analyses:

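from pycom import PyCom, CoMAnalysis
import matplotlib.pyplot as plt

# Connect to the remote PyCoM database
pyc = PyCom(remote=True)

# Query cancer-associated proteins of 200-210 residues with substrate
# annotations; matrix=True also loads their coevolution matrices
prots = pyc.find(
    min_length=200, max_length=210,
    disease='cancer', has_substrate=True,
    matrix=True, page=1
)

# Add contact predictions to the results (the contact_matrix column)
CoMAnalysis().add_contact_predictions(prots)

# Plot the contact map of the first protein returned
plt.axis('off')
plt.title(f'Contact Map for uniprot_id={prots.uniprot_id[0]}')
plt.imshow(prots.contact_matrix[0])
plt.show()

# Print the full record of the first protein
print(prots.iloc[0])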

Example Output:

uniprot_id           P62070
neff                 12.754
sequence_length      204
sequence             MAAAGWRDGSGQEK...
organism_id          9606
helix_frac           0.29902
turn_frac            0.019608
strand_frac          0.220588
has_ptm              1
has_pdb              1
has_substrate        1
matrix               [[0.0, 0.268, ...
contact_matrix       [[0.0, 0.0,   ...
Name: 0, dtype: object
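
As a follow-up to the record above, one common next step is to rank residue pairs by coevolution score. The snippet below is a minimal sketch using only NumPy; it is not part of the PyCoM library. The function name `top_coevolving_pairs`, its parameters `top_k` and `min_separation`, and the synthetic demo matrix are all hypothetical and introduced here for illustration. It assumes the coevolution matrix is a square, symmetric array of scores, as the `matrix` field in the output suggests.

```python
import numpy as np


def top_coevolving_pairs(matrix, top_k=5, min_separation=6):
    """Return the top_k highest-scoring (i, j, score) residue pairs.

    Hypothetical helper, not a PyCoM function. Only pairs with
    j - i >= min_separation are considered, so that neighbours in the
    sequence do not dominate the ranking. Indices are zero-based.
    """
    m = np.asarray(matrix, dtype=float)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise ValueError("expected a square matrix")

    # Upper-triangle indices, offset so that j - i >= min_separation
    i_idx, j_idx = np.triu_indices(m.shape[0], k=min_separation)
    scores = m[i_idx, j_idx]

    # Sort descending by score and keep the first top_k entries
    order = np.argsort(scores)[::-1][:top_k]
    return [(int(i_idx[o]), int(j_idx[o]), float(scores[o])) for o in order]


# Synthetic symmetric matrix standing in for a real coevolution matrix
rng = np.random.default_rng(0)
raw = rng.random((20, 20)) * 0.1
demo = (raw + raw.T) / 2
np.fill_diagonal(demo, 0.0)
demo[2, 15] = demo[15, 2] = 0.9   # planted long-range signal
demo[4, 5] = demo[5, 4] = 0.95    # strong, but adjacent in sequence

pairs = top_coevolving_pairs(demo, top_k=3)
for i, j, score in pairs:
    print(f"residues {i} and {j}: score {score:.3f}")
```

With a real query result, an entry such as `prots.matrix[0]` from the example above could presumably be passed in place of `demo`, provided it is a square array-like of scores. The sequence-separation filter is a design choice rather than a requirement: adjacent residues often score highly for trivial reasons, so excluding them tends to surface the more informative long-range pairs.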


Key Features

The HH-suite3 parameters used for alignment generation are detailed in Kamisetty et al. 2013.

How to Cite PyCoM

Please cite the following if PyCoM supports your research:

Harvard-style citation:

Glass, P.E., Alibai, S., Pandini, A. & Dantu, S.C., 2024. PyCoM: a Python library for large-scale analysis of residue–residue coevolution data. Bioinformatics, 40(4), p.btae166. https://doi.org/10.1093/bioinformatics/btae166

BibTeX:

@article{glass2024pycom,
    author = {Glass, Philipp E and Alibai, Sabriyeh and Pandini, Alessandro and Dantu, Sarath Chandra},
    title = "{PyCoM: a python library for large-scale analysis of residue–residue coevolution data}",
    journal = {Bioinformatics},
    volume = {40},
    number = {4},
    pages = {btae166},
    year = {2024},
    url = {https://doi.org/10.1093/bioinformatics/btae166},
}

Our Team

Brunel University London, UK
