PhD Done!!!

My PhD on Declarative Programming Via Term Rewriting, which I completed at Johns Hopkins University, is done.

This research project was about developing a declarative, weighted logic programming language for machine learning, artificial intelligence, and natural language processing applications. The programming language we were researching was interesting because it allowed for programs that do interesting probabilistic and symbolic reasoning to be concisely expressed in a few lines of code. It accomplishes this by allowing the programmer to leave out as many details as possible about how the program is executed. This is similar to a database, where the database needs to automatically figure out how to retrieve the data given a high-level declarative query.

To make this work, I created a relational algebra, which is capable of representing entire programs. To make the system flexible enough to find a good execution strategy, I created a term rewriting approach that includes hundreds of rewrite rules to run the program. This is similar to rewriting an expression like “2 + 3” as “5” where both of these expressions are semantically equivalent.

To make the term rewriting relational algebra approach tractable, I additionally had to redesign many of the traditional approaches that programming languages are implemented. For example, I created a new way to think about memoization (dynamic programming) to make it work with our system. Additionally, I created a (JIT) compiler for our term rewrite system because the naive implementation was too slow for real world use.

In the end, this was an interesting research project. However, I think that this work was set a bit too firmly in the realm of symbolic systems for AI (the AI paradigm of yesteryear). Hence, I do not know if this is applicable to modern only big neural AI that is dominating. Eventually, I do think that this work may see some use. The reason is that while pure neural creates really cool demonstrations, it will also fabricate information. This creates an issue when these systems are deployed into applications, and that is a problem for their usability in industry. Hence, having a system that incorporates weighted reasoning (necessary for neural networks), and symbolic reasoning into a single system is a very powerful programming paradigm.

The dissertation document and recording of the defense are available at:

Dissertation Abstract

I present a new approach to implementing weighted logic programming languages. I first present a bag-relational algebra that is expressive enough to capture the desired denotational semantics, directly representing the recursive conjunctions, disjunctions, and aggregations that are specified by a source program. For the operational semantics, I develop a term-rewriting system that executes a program by simplifying its corresponding algebraic expression.

I have used this approach to create the first complete implementation of the Dyna programming language. A Dyna program consists of rules that define a potentially infinite and cyclic computation graph, which is queried to answer data-dependent questions. Dyna is a unified declarative framework for machine learning and artificial intelligence researchers that supports dynamic programming, constraint logic programming, reactive programming, and object-oriented programming. I have further modernized Dyna to support functional programming with lambda closures and embedded domain-specific languages.

The implementation includes a front-end that translates Dyna programs to bag-relational expressions, a Python API, hundreds of term rewriting rules, and a procedural engine for determining which rewrite rules to apply. The rewrite rules generalize techniques used in constraint logic programming. In practice, our system is usually able to provide simple answers to queries.

Mixing disparate programming paradigms is not without challenges. We had to rethink the classical techniques used to implement logic programming languages. This includes the development of a novel approach for memoization (dynamic programming) that supports partial memoization of fully or partially simplified algebraic expressions, which may contain delayed, unevaluated constraints. Furthermore, real-world Dyna programs require fast and efficient execution. For this reason, I present a novel approach to just-in-time (JIT) compile sequences of term rewrites using a custom tracing JIT.