Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01n583xx98d
Title: Contagion in Online Social Media: “Lies, Damned Lies, and Statistics” A Networked Approach to Trolls, Tweets, and Phishing in the Digital Age
Authors: Hutchinson, Reed
Advisors: Katz, Joshua T
Department: Independent Concentration
Certificate Program: Linguistics Program
Class Year: 2020
Abstract: Have the fraud, deception, and lies overwhelming social media become a virus that cannot be contained? As a result of the lack of trust that now exists online, one of Silicon Valley’s leading software executives just declared that “Facebook is the cigarette” of a new generation. Over the course of last year alone, Facebook was forced to delete more than 3.2 billion fake accounts, twice the number from the previous year. Not long ago, academics including Jeff Hancock from Stanford’s Media Lab were delivering high-profile TED Talks proclaiming the safety and integrity of social media. Prominent scholars in the fields of linguistics and computer science followed Hancock’s research, arguing that the “searchability and permanence” of information in the “Digital Age” would promote honesty. If the permanence of records is supposed to bolster authenticity, then why has deception continued to increase in social media and so-called ‘SMS language’? Instead, computer-mediated communication (“CMC”) has become infected by troll farms that leverage digital advertising, ubiquitous financial phishing scams, and, increasingly, Presidential tweets that include demonstrably false statements. Has the internet become the perfect host to exploit the cognitive “truth bias” in favor of reliable human communication that philosophers from Socrates to Aristotle identified centuries ago? This paper will examine the research surrounding the ability to detect deception in CMC. In addition, it will address the theories behind early software programs that were meant to identify authentic online messages, including the algorithms underlying Linguistic Inquiry and Word Count (“LIWC”). The paper will revisit the conclusions from the data and computer analysis in my Spring ’19 research, which evaluated LIWC using data from Amazon’s survey tool Mechanical Turk.
An IRB-approved survey, conducted using the Princeton Survey Research Center (PSRC) with guidance from the Pew Research Center, is included to consider the ‘truth bias’ in active users of social media platforms. The paper also examines whether developments in latent semantic analysis and natural language processing provide a more effective computational language processing method by using the underlying (latent) meaning or concept in messages. Lastly, the paper explores how detection of false information on social media apps may be aided by the use of knowledge graphs, real-time machine learning, and the neural networks built to deploy artificial intelligence. How can we find out what is in these ‘cigarettes’ and what consumers are really smoking online? Ultimately, this research is concerned with the accuracy of detection methods and automated linguistic techniques that differentiate between deceptive and truthful electronic communication.
Type of Material: Princeton University Senior Theses
Language: en
Appears in Collections: Independent Concentration, 1972-2020
Aeronautical Engineering, 1945-1975

Files in This Item:
File                         Description  Size     Format
HUTCHINSON-REED-THESIS.pdf                2.79 MB  Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.