Title: Probability, Entropy, and Adaptive Immune System Repertoires
Authors: Sethna, Zachary Michael
Advisors: Callan, Curtis G
Contributors: Physics Department
Keywords: Adaptive Immune System
B cells
Information Theory
T cells
Subjects: Biophysics
Statistical physics
Issue Date: 2018
Publisher: Princeton, NJ : Princeton University
Abstract: The adaptive immune system, composed of white blood cells called lymphocytes (B and T cells) that circulate in the lymph and blood, is a precision tool that tags and removes foreign peptides. Such peptides, also called antigens or epitopes, are identified by a specific binding to elements of a library or repertoire of unique proteins called receptors (e.g. antibodies or T cell receptors). A repertoire must be large and diverse enough so that at least one receptor will be able to recognize any pathogen epitope the organism is likely to encounter. This diversity is achieved by stochastic rearrangement of the germline DNA to create novel complementarity determining region sequences (CDR3) in a process called V(D)J recombination. In this thesis we utilize previously developed generative models of V(D)J recombination events, and infer the model parameters from large datasets of DNA sequences. The generation probability (Pgen) of a nucleotide or amino acid CDR3 is the sum of all model probabilities of V(D)J recombination events that generate the sequence. While previously it was only feasible to compute Pgen of nucleotide sequences, we introduce a novel dynamic programming algorithm that efficiently computes Pgen of amino acid sequences. We use this Pgen for several applications. First we examine how the diversity of a repertoire, characterized by the model entropy, scales with the number of insertions in the V(D)J process. This is used to describe the maturation of the T cell repertoire of mice from embryos to young adults. Next, we introduce a statistical model of hypermutation in B cells and infer the parameters from a human repertoire, providing a principled quantification of the biases in hypermutation rates. Lastly, we examine the statistics of the receptors shared amongst a cohort of more than 600 individual humans and show that the statistics and identities of so-called ‘public’ sequences are determined directly from Pgen.
We highlight possible clinical applications and attempt to place this work in the context of a full theory of the adaptive immune system.
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog:
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Physics

Files in This Item:
File                               Size     Format
Sethna_princeton_0181D_12729.pdf   4.3 MB   Adobe PDF

Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.