Please use this identifier to cite or link to this item:
Title: Data for Coarse-grained Intrinsically Disordered Proteins
Contributors: Webb, Michael; Patel, Roshan; Borca, Carlos
Keywords: Machine Learning; Molecular Simulation; Molecular Dynamics
Issue Date: 6-May-2022
Publisher: Princeton University
Abstract: This distribution compiles numerous physical properties for 2,585 intrinsically disordered proteins (IDPs) obtained by coarse-grained molecular dynamics simulation. This combination comprises "Dataset A" as reported in "Featurization strategies for polymer sequence or composition design by machine learning" by Roshan A. Patel, Carlos H. Borca, and Michael A. Webb (DOI: 10.1039/D1ME00160D). The specific IDP sequences are sourced from version 9.0 of the DisProt database. The simulations were performed using the LAMMPS molecular dynamics engine. The interactions used for simulation are obtained from R. M. Regy, J. Thompson, Y. C. Kim and J. Mittal, Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins, Protein Sci., 2021, 1371-1379.
Appears in Collections: Research Data Sets

Files in This Item:
File                      Size       Format
README                    2.67 kB    Text
dataset_a_sequences.txt   654.7 kB   Text
dataset_a_encodings.csv   113.97 kB  CSV
dataset_a_labels.csv      89.81 kB   CSV
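
Once downloaded, the three data files listed above could be read along the following lines. This is only a minimal sketch: the record does not describe the internal layout of the files, so the function name load_dataset_a, the one-entry-per-line reading of the sequences file, and the comma delimiter for the CSV files are all assumptions. Check them against the README before relying on this.

```python
import csv
from pathlib import Path


def load_dataset_a(directory="."):
    """Read the three Dataset A files from `directory`.

    Assumptions (not confirmed by the record; check the README):
    - dataset_a_sequences.txt holds one entry per non-empty line;
    - both CSV files are comma-delimited.
    Rows are returned as lists of strings; any header row is kept as-is.
    """
    root = Path(directory)

    # Sequences: keep every non-empty line, stripped of whitespace.
    with open(root / "dataset_a_sequences.txt", encoding="utf-8") as fh:
        sequences = [line.strip() for line in fh if line.strip()]

    def read_rows(name):
        # newline="" is the recommended mode for the csv module.
        with open(root / name, newline="", encoding="utf-8") as fh:
            return list(csv.reader(fh))

    encodings = read_rows("dataset_a_encodings.csv")
    labels = read_rows("dataset_a_labels.csv")
    return sequences, encodings, labels
```

Calling load_dataset_a("path/to/download") returns the sequences, encodings, and labels as plain Python lists. Since the abstract reports 2,585 IDPs, comparing that figure with the number of entries read from each file is a reasonable sanity check, though header rows or a different file layout would change the counts.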

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.