Ashish Venkat, Worthy N. Martin (co-chair)
Title: Accelerating Pattern Recognition Processing with In-Memory Architectures
Newly available memory-centric architectures for accelerating finite-automata-based pattern recognition have motivated the use of automata processing in new application domains, such as bioinformatics, machine learning, and natural language processing. However, existing automata processing solutions, both on von Neumann architectures and on these new in-memory accelerators, are neither scalable nor efficient. Moreover, existing in-memory automata processing accelerators cannot handle more complex pattern structures, such as tree- and graph-shaped patterns.
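As a minimal illustration of the automata processing model discussed throughout this dissertation, the following toy sketch (an assumed example, not part of the contributions) simulates a nondeterministic finite automaton in which all active states step in parallel on each input symbol; this symbol-parallel execution is what in-memory accelerators realize in hardware:

```python
# Toy NFA sketch of automata-based pattern matching: every active state
# consumes the current input symbol simultaneously, mirroring the
# parallel execution model of in-memory automata accelerators.

def run_nfa(start, accepting, transitions, stream):
    """Report each stream offset at which some accepting state is active."""
    active = {start}
    reports = []
    for i, sym in enumerate(stream):
        # Advance all active states in parallel on the current symbol.
        nxt = {t for s in active for t in transitions.get((s, sym), ())}
        nxt.add(start)              # start-anywhere matching semantics
        active = nxt
        if active & accepting:
            reports.append(i)
    return reports

# Hypothetical two-state pattern "ab" over the input "xabab":
trans = {("q0", "a"): {"q1"}, ("q1", "b"): {"q2"}}
print(run_nfa("q0", {"q2"}, trans, "xabab"))  # → [2, 4]
```

The state and pattern names here are illustrative only; real accelerators encode the transition table in memory columns rather than a Python dictionary.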
This dissertation outlines five contributions that improve the effectiveness of automata processing. The first two accelerate applications by developing novel algorithms and mapping them to in-memory automata processing accelerators: 1) frequent subtree mining, and 2) rule-based part-of-speech tagging in natural language processing.
This dissertation then presents three novel architecture studies. Based on preliminary results from the above applications and their natural structures, and on an investigation of the interconnect and scalability inefficiencies in existing in-memory automata processing architectures, 3) we propose a reconfigurable, high-speed, high-density, and low-power automata processing architecture with a compact, low-overhead, yet flexible interconnect design. To evaluate the proposed architecture, we develop a cycle-accurate automata processing simulator flexible enough to be adopted by other researchers for automata processing research.
Motivated by our study of pattern matching on tree-shaped structures, 4) we propose a general-purpose, scalable, and reconfigurable memory-centric architecture for processing complex patterns, such as tree-like data. We take inspiration from previous automata processing architectures but support the richer deterministic pushdown automata computational model.
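To see why the pushdown model fits tree-shaped data, consider this hedged sketch: a deterministic pushdown automaton can validate a parenthesized (preorder) tree serialization with a single stack, a task no finite automaton can perform for unbounded nesting depth. This is a hypothetical toy recognizer, not the architecture proposed in the dissertation:

```python
# Toy deterministic pushdown automaton: accepts well-nested
# parenthesized tree serializations such as "(a(b)(c))".
# The stack tracks the current subtree nesting depth.

def dpda_accepts(stream):
    """Accept iff `stream` is a well-nested parenthesized string."""
    stack = []
    for sym in stream:
        if sym == "(":
            stack.append(sym)      # push on subtree open
        elif sym == ")":
            if not stack:
                return False       # unmatched close: reject
            stack.pop()            # pop on subtree close
        # node labels (e.g. 'a', 'b') consume input without stack moves
    return not stack               # accept only on empty stack

print(dpda_accepts("(a(b)(c))"))   # → True
print(dpda_accepts("(a(b)"))       # → False
```

The stack operations here are exactly what finite automata lack, and what a memory-centric pushdown architecture must support in addition to parallel state matching.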
Finally, we observe that fixed 8-bit symbol processing, a legacy of ASCII, restricts the throughput and area efficiency of existing automata accelerators, causes resource underutilization, and limits the adoption of applications with very large symbol sets. To address these issues, 5) we present FlexAmata, a compiler solution that efficiently re-shapes an automaton's structure to provide application compatibility with existing and future memory-centric automata processing architectures, and hardware compatibility for big-data applications. The former enables efficient and feasible execution of applications with very large or very small symbol sets on memory-centric automata accelerators. Inspired by this exploration, we present a comprehensive design-space exploration of automata processing at different bitwidths on FPGAs and in-memory approaches. This study provides several insights for designing future automata processing accelerators more efficiently.
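The flavor of such bitwidth re-shaping can be sketched as follows: each 8-bit transition is split into two 4-bit steps through a fresh intermediate state, so the automaton can execute on a 4-bit datapath. This is a deliberately simplified, hypothetical illustration (it ignores nibble-sharing conflicts that a real compiler such as FlexAmata must resolve):

```python
# Simplified sketch of 8-bit-to-4-bit automaton re-shaping: each byte
# transition (src --byte--> dst) becomes two nibble transitions through
# an inserted intermediate state. Toy example only; a real compiler
# must merge transitions that share a high nibble from the same state.

def reshape_8_to_4(transitions):
    """transitions: {(state, byte): next_state} -> 4-bit-symbol automaton."""
    out = {}
    for (src, byte), dst in transitions.items():
        hi, lo = byte >> 4, byte & 0xF
        mid = (src, dst, hi)           # fresh intermediate state
        out[(src, hi)] = mid           # consume the high nibble first,
        out[(mid, lo)] = dst           # then the low nibble
    return out

# One byte transition on 0xAB becomes two 4-bit steps (0xA, then 0xB):
t4 = reshape_8_to_4({("s0", 0xAB): "s1"})
state = "s0"
for nibble in (0xA, 0xB):
    state = t4[(state, nibble)]
print(state)  # → s1
```

Halving the symbol bitwidth doubles the number of steps per input byte but shrinks the per-state match memory, which is the throughput/area trade-off the design-space exploration quantifies.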