Please use this identifier to cite or link to this item: http://arks.princeton.edu/ark:/88435/dsp01np193d30v
Title: Designing Computing Systems Based on Unconventional Technologies for Hardware Acceleration
Authors: Jia, Hongyang
Advisors: Verma, Naveen
Contributors: Electrical Engineering Department
Keywords: Approximate Computing
Hardware Acceleration
In-Memory Computing
Programmability
Scalable Architecture
Subjects: Electrical engineering
Computer science
Computer engineering
Issue Date: 2021
Publisher: Princeton, NJ : Princeton University
Abstract: Hardware specialization is being widely adopted to address energy and throughput limitations in a range of applications. However, two critical challenges arise: (1) degraded programmability; and (2) bottlenecks posed by memory accessing and data movement. This thesis investigates unconventional technologies for computation, enabling unconventional accelerator architectures and the associated programmability and physical-design tradeoffs, to overcome these challenges. It employs co-design at the circuit, architecture, and software levels, applied to custom integrated-circuit (IC) prototypes to validate the cross-layer implications.

First, the challenge of programmability is explored through opportunities enabled by approximate computing. Accelerator programmability is enhanced by adopting the code-synthesis framework of genetic programming (GP), which approximates computations from high-level specifications (input-output pairs) using highly structured models of computation, which, in turn, enable accelerator specialization for energy efficiency. A programmable heterogeneous platform for sensor inference is demonstrated, including: a 130nm CMOS IC, integrating a CPU, a fixed-function classification accelerator, and a programmable feature-extraction accelerator; and a compiler flow for code generation and approximation-aware model training.

Next, the challenge of memory accessing and data movement is explored through mixed-signal in-memory computing (IMC), which amortizes the accessing of raw bits into the accessing of a computational result over all bits in a memory column. This fundamentally increases signal dynamic range, instating an energy/throughput-vs.-SNR tradeoff. A recent approach to high-SNR IMC is exploited to form robust abstractions of the computations, as required for architectural integration and software-level interfacing. A programmable heterogeneous processor is demonstrated, including: a 65nm CMOS IC, integrating a CPU, a near-memory-computing digital accelerator, and a bit-scalable IMC accelerator; and an associated programming model and software libraries for neural-network training and mapping.

Finally, an architecture and application-mapping algorithms are explored to enable scalability of IMC platforms, especially addressing the memory-system energy and latency required for virtualization of IMC hardware. An arrayed dataflow architecture is designed with integrated microarchitectural support for efficient and scalable scheduling and execution of computations for diverse neural-network models. A reconfigurable IMC platform is demonstrated, including: a 16nm CMOS IC, integrating a 4×4 array of IMC modules and a scalable network-on-chip; and application-mapping algorithms and a toolchain, optimizing energy efficiency and throughput at the IMC hardware design point.
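The amortization idea behind bit-scalable IMC can be illustrated with a minimal software sketch (purely conceptual, not the thesis hardware or its programming model): each memory column stores one bit-plane of the weights, a single column access returns one multiply-accumulate result over all rows at once, and multi-bit weights are recovered by binary-weighting the per-column results. The function names and the 4-bit two's-complement encoding below are illustrative assumptions.

```python
# Conceptual sketch of bit-scalable in-memory computing (IMC).
# Assumption: weights are stored column-wise as bit-planes; one "access"
# yields an accumulated result over all bits in the column, amortizing
# the cost of reading each raw bit individually.

def imc_column_access(input_vec, column_bits):
    """One in-memory access: accumulate input elements gated by the
    stored column bits (models an analog column summation)."""
    return sum(x * b for x, b in zip(input_vec, column_bits))

def imc_dot_product(input_vec, weights, n_bits=4):
    """Dot product with n_bits two's-complement weights, computed as a
    binary-weighted sum of per-bit-plane column accesses."""
    total = 0
    for k in range(n_bits):
        column = [(w >> k) & 1 for w in weights]   # k-th bit-plane
        partial = imc_column_access(input_vec, column)
        sign = -1 if k == n_bits - 1 else 1        # MSB has negative weight
        total += sign * (1 << k) * partial
    return total

x = [1, 2, 3]
w = [2, -1, 3]  # representable in 4-bit two's complement
assert imc_dot_product(x, w) == sum(a * b for a, b in zip(x, w))  # both give 9
```

The key point of the sketch is the access count: an N-row column contributes N multiply-accumulates per access, so per-bit memory accessing is amortized across the whole column, which is the source of the energy/throughput benefit the abstract describes.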
URI: http://arks.princeton.edu/ark:/88435/dsp01np193d30v
Alternate format: The Mudd Manuscript Library retains one bound copy of each dissertation. Search for these copies in the library's main catalog: catalog.princeton.edu
Type of Material: Academic dissertations (Ph.D.)
Language: en
Appears in Collections: Electrical Engineering

Files in This Item:
File: Jia_princeton_0181D_13747.pdf (16.79 MB, Adobe PDF)


Items in DataSpace are protected by copyright, with all rights reserved, unless otherwise indicated.