Please use this identifier to cite or link to this item:
http://arks.princeton.edu/ark:/88435/dsp01765374506
Title: | Data for Coarse-grained Intrinsically Disordered Proteins |
Contributors: | Webb, Michael; Patel, Roshan; Borca, Carlos |
Keywords: | Machine Learning; Protein; Molecular Simulation; Molecular Dynamics |
Issue Date: | 6-May-2022 |
Publisher: | Princeton University |
Abstract: | This distribution compiles numerous physical properties for 2,585 intrinsically disordered proteins (IDPs) obtained by coarse-grained molecular dynamics simulation. This combination comprises "Dataset A" as reported in "Featurization strategies for polymer sequence or composition design by machine learning" by Roshan A. Patel, Carlos H. Borca, and Michael A. Webb (DOI: 10.1039/D1ME00160D). The specific IDP sequences are sourced from version 9.0 of the DisProt database. The simulations were performed using the LAMMPS molecular dynamics engine. The interactions used for simulation are obtained from R. M. Regy, J. Thompson, Y. C. Kim and J. Mittal, Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins, Protein Sci., 2021, 1371–1379. |
URI: | http://arks.princeton.edu/ark:/88435/dsp01765374506 https://doi.org/10.34770/chzn-mj42 |
Referenced By: | https://doi.org/10.1039/D1ME00160D |
Appears in Collections: | Research Data Sets |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
README | | 2.67 kB | Text |
dataset_a_sequences.txt | | 654.7 kB | Text |
dataset_a_encodings.csv | | 113.97 kB | CSV |
dataset_a_labels.csv | | 89.81 kB | CSV |
Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.