Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp010z708w54n
Title: A Computational Tool for Enhancing the Quality of Translated Documents Constructed from Individual Human-Translated Sentences
Authors: Yu, Keunwoo
Advisors: Mimno, David
Department: Computer Science
Class Year: 2013
Abstract: This thesis develops a computational tool to help curate documents produced through crowd-sourced translation. The tool focuses on a problem called “proper noun translation disagreement” or “named entity translation disagreement,” in which translators use different translations for the same proper noun or named entity. Along the way, the thesis presents several ways to preprocess training corpora, as well as algorithms that improve the quality of Korean-English word-level alignment. It also introduces a user interface that lets users identify grammatical problems and correct them with ease. To address issues that arise in implementing the tool, the thesis draws on previous research in statistical natural language processing for Western languages, as well as for Korean and Chinese. Little research in statistical natural language processing has been geared specifically towards crowd-sourced translation. The thesis employs techniques and algorithms developed by researchers working on bilingual alignment, and attempts to shed light on the various issues of crowd-sourced translation.
Extent: 56 pages
URI: http://arks.princeton.edu/ark:/88435/dsp010z708w54n
Access Restrictions: Walk-in Access. This thesis can only be viewed on computer terminals at the Mudd Manuscript Library.
Type of Material: Princeton University Senior Theses
Language: en_US
Appears in Collections: Computer Science, 1987-2023

Files in This Item:
File: Keunwoo Peter Yu.pdf
Size: 943.04 kB
Format: Adobe PDF


Items in Dataspace are protected by copyright, with all rights reserved, unless otherwise indicated.