Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01mp48sh02m
Title: Thermodynamic and Dynamics Data for Coarse-grained Intrinsically Disordered Proteins Generated by Active Learning
Contributors: Webb, Michael; Jacobs, William; An, Yaxin; Oliver, Wesley
Keywords: Intrinsically disordered proteins; protein condensates; phase separation
Issue Date: 6-Jun-2023
Publisher: Princeton University
Abstract: This distribution compiles thermodynamic and (where available) dynamic properties of short protein sequences obtained from coarse-grained molecular dynamics simulations. The dataset comprises 2114 protein sequences with lengths ranging from N = 20 to N = 50 amino acids. The simulation and analysis of these sequences are described in "Active learning of the thermodynamics–dynamics tradeoff in protein condensates" by Yaxin An, Michael A. Webb*, and William M. Jacobs* (https://doi.org/10.48550/arXiv.2306.03696). Of the 2114 protein sequences, 80 are homomeric polypeptides (a single amino acid repeated for N = 20, 30, 40, and 50), 1266 are sourced from version 9.0 of the DisProt database, and the remaining 768 are novel sequences generated during the active learning campaign described in that manuscript. The simulations were performed using the LAMMPS molecular dynamics engine. The interaction model is taken from R. M. Regy, J. Thompson, Y. C. Kim, and J. Mittal, "Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins," Protein Sci., 2021, 1371–1379. The distribution includes second virial coefficients, pressure-density data, the expected phase behavior at 300 K, estimated condensed-phase densities at 300 K (where they exist), and condensed-phase self-diffusion coefficients at 300 K (where they exist).
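The sequence counts quoted in the abstract can be checked with a short sketch. It assumes the homomeric set uses the 20 standard amino acids in one-letter code; that alphabet, and the names AMINO_ACIDS, LENGTHS, and homomeric_sequences, are illustrative and are not taken from the dataset.

```python
# Assumption: the 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Chain lengths listed in the abstract for the homomeric polypeptides.
LENGTHS = (20, 30, 40, 50)


def homomeric_sequences(alphabet=AMINO_ACIDS, lengths=LENGTHS):
    """Return one single-residue sequence per (length, amino acid) pair."""
    return [residue * n for n in lengths for residue in alphabet]


homomeric = homomeric_sequences()
n_homomeric = len(homomeric)      # 20 residues x 4 lengths
n_disprot = 1266                  # sequences sourced from DisProt 9.0
n_active_learning = 768           # sequences generated by active learning
total = n_homomeric + n_disprot + n_active_learning

print(n_homomeric, total)         # prints: 80 2114
```

Under that assumption, the enumeration reproduces the 80 homomeric sequences, and the three subsets sum to the stated 2114.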
URI: http://arks.princeton.edu/ark:/88435/dsp01mp48sh02m
https://doi.org/10.34770/6tnm-7b56
Appears in Collections: Research Data Sets

Files in This Item:
File                       Size        Format
README.txt                 8.94 kB     Text
EOS_heteromeric.csv        307.72 kB   CSV
EOS_homomeric.csv          15.64 kB    CSV
features_heteromeric.csv   164.27 kB   CSV
features_homomeric.csv     6.81 kB     CSV
labels_heteromeric.csv     105.86 kB   CSV
labels_homomeric.csv       4.75 kB     CSV
seq_heteromeric.txt        64.3 kB     Text
seq_homomeric.txt          2.81 kB     Text
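The files follow a kind_subset naming pattern (EOS, features, labels, and seq, each for the homomeric and heteromeric subsets), so one subset can be read with a single helper. The following is a minimal sketch using only the Python standard library. The function name load_subset and the local directory argument are hypothetical, and the sketch makes no assumption about column names or header rows: it returns raw CSV rows and raw non-empty text lines, and README.txt remains the reference for the actual layout.

```python
import csv
from pathlib import Path


def load_subset(data_dir, subset):
    """Read the four files for one subset ('homomeric' or 'heteromeric').

    Returns a dict with raw CSV rows under 'EOS', 'features', and 'labels',
    and the non-empty lines of the sequence file under 'seq'.
    The column layout is not interpreted here; consult README.txt.
    """
    data_dir = Path(data_dir)
    tables = {}
    for kind in ("EOS", "features", "labels"):
        with open(data_dir / f"{kind}_{subset}.csv", newline="") as handle:
            tables[kind] = list(csv.reader(handle))
    with open(data_dir / f"seq_{subset}.txt") as handle:
        # Assumption: each non-empty line is one record of the sequence file.
        tables["seq"] = [line.strip() for line in handle if line.strip()]
    return tables
```

For example, after downloading the files into a local folder, load_subset("path/to/folder", "homomeric") would return the homomeric tables; the folder path here is a placeholder.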


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.