Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01765374506
Full metadata record
DC Field | Value | Language
dc.contributor.author | Webb, Michael | -
dc.contributor.author | Patel, Roshan | -
dc.contributor.author | Borca, Carlos | -
dc.date.accessioned | 2022-05-06T12:52:21Z | -
dc.date.available | 2022-05-06T12:52:21Z | -
dc.date.created | 2021-11-03 | -
dc.date.issued | 2022-05-06 | -
dc.identifier.uri | http://arks.princeton.edu/ark:/88435/dsp01765374506 | -
dc.identifier.uri | https://doi.org/10.34770/chzn-mj42 | -
dc.description.abstract | This distribution compiles numerous physical properties for 2,585 intrinsically disordered proteins (IDPs) obtained by coarse-grained molecular dynamics simulation. This collection comprises "Dataset A" as reported in "Featurization strategies for polymer sequence or composition design by machine learning" by Roshan A. Patel, Carlos H. Borca, and Michael A. Webb (DOI: 10.1039/D1ME00160D). The specific IDP sequences are sourced from version 9.0 of the DisProt database. The simulations were performed using the LAMMPS molecular dynamics engine. The interactions used in the simulations are taken from R. M. Regy, J. Thompson, Y. C. Kim, and J. Mittal, "Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins," Protein Sci., 2021, 1371–1379. | en_US
dc.description.sponsorship | The generation of this data was supported by the National Science Foundation under DMREF Award Number NSF-DMR-2118861. | en_US
dc.description.tableofcontents | README, dataset_a_sequences.txt, dataset_a_encodings.csv, dataset_a_labels.csv | en_US
dc.language.iso | en_US | en_US
dc.publisher | Princeton University | en_US
dc.relation.isreferencedby | https://doi.org/10.1039/D1ME00160D | en_US
dc.rights | CC BY NC ND 4.0 (https://creativecommons.org/licenses/by-nc/4.0/) | en_US
dc.subject | Machine Learning | en_US
dc.subject | Protein | en_US
dc.subject | Molecular Simulation | en_US
dc.subject | Molecular Dynamics | en_US
dc.title | Data for Coarse-grained Intrinsically Disordered Proteins | en_US
dc.type | Dataset | en_US
Appears in Collections: Research Data Sets

Files in This Item:
File | Description | Size | Format
README | | 2.67 kB | Text
dataset_a_sequences.txt | | 654.7 kB | Text
dataset_a_encodings.csv | | 113.97 kB | CSV
dataset_a_labels.csv | | 89.81 kB | CSV


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.