PyPI Stats

Protein Sequences Python Packages

Python packages with the GitHub topic protein-sequences. Sorted by relevance, with stars and monthly downloads.
amckenna41
protpy

Calculating a range of protein descriptors using their physicochemical, biological and structural properties 🔬.

18K 17 1
williamgilpin
pypdb

A Python API for the RCSB Protein Data Bank (PDB)

5K 335 76
bioinf-MCB
mdeepfri

Pipeline for searching and aligning contact maps for proteins, then running DeepFRI's GCN.

2K 45 7
songlab-cal
tape-proteins

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.

2K 739 135
AuReMe
esmecata

From taxonomic affiliations to annotated consensus proteomes using the UniProt database.

1K 8 0
sacdallago
bio-embeddings

Get protein embeddings from protein sequences

1K 508 70
lucidrains
protein-bert-pytorch

Implementation of ProteinBERT in PyTorch

1K 164 20
HobnobMancer
cazy-webscraper

Web scraper to retrieve protein data catalogued by the CAZy, UniProt, NCBI, GTDB and PDB websites/databases.

997 18 2
songlab-cal
bio-embeddings-tape-proteins

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.

893 739 135
niklases
pypef

PyPEF – Pythonic Protein Engineering Framework

731 14 2
dohlee
abyssal-pytorch

Abyssal - Pytorch

602 6 2
univieCUBE
deepnog

Protein orthologous group assignment with deep learning

523 30 8
graph-part
graph-part

Graph-based partitioning of biological sequence data

507 35 6
microsoft
evodiff

Python package for generation of protein sequences and evolutionary alignments via discrete diffusion models

495 668 110
Sanpme66
protpeptigram

Visualization of Immunopeptides Mapped to Source Proteins Across Multiple Samples

485 0 0
joelb123
rafm

Reliable AlphaFold Measures

483 2 0
dohlee
tranception-pytorch-dohlee

Implementation of Tranception, a SOTA transformer model for protein fitness prediction, in PyTorch.

389 3 0
michaelscutari
protclust

protclust is a Python library for protein sequence analysis that integrates MMseqs2 for fast clustering and provides tools for creating robust machine learning datasets. It offers cluster-aware data splitting to prevent sequence similarity bias in model evaluation, along with comprehensive protein embedding capabilities for feature generation.

389 4 0
sbl-sdsc
mmtfpyspark

Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.

385 68 27
kklemon
protenc

Extract protein embeddings the easy way.

354 10 1
dohlee
antiberty-pytorch

An unofficial re-implementation of AntiBERTy, an antibody-specific protein language model, in PyTorch.

338 26 5
sacdallago
bio-embeddings-duongttr

Get protein embeddings from protein sequences

257 508 70
kyegomez
progen-torch

Paper - Pytorch

233 11 0
aszymik
jamsfetch

Package for automatic data download from major bioinformatics databases

203 3 1
Data from PyPI, GitHub, ClickHouse, and BigQuery