Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01765374506
Title: Data for Coarse-grained Intrinsically Disordered Proteins
Contributors: Webb, Michael
Patel, Roshan
Borca, Carlos
Keywords: Machine Learning
Protein
Molecular Simulation
Molecular Dynamics
Issue Date: 6-May-2022
Publisher: Princeton University
Abstract: This distribution compiles physical properties, obtained by coarse-grained molecular dynamics simulation, for 2,585 intrinsically disordered proteins (IDPs). This collection comprises "Dataset A" as reported in "Featurization strategies for polymer sequence or composition design by machine learning" by Roshan A. Patel, Carlos H. Borca, and Michael A. Webb (DOI: 10.1039/D1ME00160D). The IDP sequences are sourced from version 9.0 of the DisProt database. The simulations were performed using the LAMMPS molecular dynamics engine. The interaction model used for the simulations is taken from R. M. Regy, J. Thompson, Y. C. Kim, and J. Mittal, "Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins," Protein Sci., 2021, 1371–1379.
URI: http://arks.princeton.edu/ark:/88435/dsp01765374506
https://doi.org/10.34770/chzn-mj42
Referenced By: https://doi.org/10.1039/D1ME00160D
Appears in Collections: Research Data Sets

Files in This Item:
File                      Size       Format
README                    2.67 kB    Text
dataset_a_sequences.txt   654.7 kB   Text
dataset_a_encodings.csv   113.97 kB  CSV
dataset_a_labels.csv      89.81 kB   CSV


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.