Making Parsers More Accessible

Ph.D. candidate Alan Leung (M.S. ’13) says he likes to create tools that “make it easier to build complex systems reliably.” These include a tool for automating the construction of parsers — programs that extract structure from strings — for context-free languages. Now, Leung is poised for the final defense of his dissertation on making parsers more accessible. He will defend his thesis in front of a panel chaired by his advisor in the Programming Languages group, CSE professor Sorin Lerner. The panel also includes CSE professors Ranjit Jhala and Ryan Kastner, as well as Math professor Samuel Buss and UCLA computer science professor Todd Millstein.
Alan Leung

The title of Leung’s dissertation is “Constructing Parsers by Example via Interactive Program Synthesis,” and his defense is scheduled for Thursday, May 11 at 2PM in room 2217 of the CSE building.  The examination is open to the public.

Parsers are fundamental components of many software systems, including email clients, video games, spreadsheet programs, and relational databases. As a result, constructing parsers has become a ubiquitous programming task for developers in many domains, and not just for programming language experts.

According to Leung, existing tools for generating parsers assume a great deal of background knowledge in parsing and formal language theory, but “it is possible to make parsing more accessible by combining interactive visual feedback with the programming-by-example paradigm, wherein users synthesize programs simply by providing example inputs and outputs demonstrating the result of the intended computation.”

In his dissertation, Leung presents novel algorithms for (a) constructing syntactic specifications by example, (b) constructing lexical analyses by example, and (c) visualizing progress toward parser completion. “We instantiate these algorithms in two graphical development environments we have implemented,” notes Leung in his abstract, referring to Parsify and its successor, Parsimony. “The latter’s central user interaction paradigm is that of programming-by example.” In a user study, he demonstrates that non-expert users show significantly better performance when using the new system.
Prior to beginning the graduate program in CSE in 2010, Leung worked for five years at Intel designing cache systems on several generations of Itanium microprocessors. Before Intel, he did his undergraduate degree at Cornell University.