My research done during my PhD at Johns Hopkins can be found on my academic website here:

Research From Ph.D. Studies at Johns Hopkins

Declarative Programming Via Term Rewriting (PhD dissertation) [Dissertation Document]

The code will be posted online soon.

I present a new approach to implementing weighted logic programming languages. I first present a bag-relational algebra that is expressive enough to capture the desired denotational semantics, directly representing the recursive conjunctions, disjunctions, and aggregations that are specified by a source program. For the operational semantics, I develop a term-rewriting system that executes a program by simplifying its corresponding algebraic expression.

I have used this approach to create the first complete implementation of the Dyna programming language. A Dyna program consists of rules that define a potentially infinite and cyclic computation graph, which is queried to answer data-dependent questions. Dyna is a unified declarative framework for machine learning and artificial intelligence researchers that supports dynamic programming, constraint logic programming, reactive programming, and object-oriented programming. I have further modernized Dyna to support functional programming with lambda closures and embedded domain-specific languages.

The implementation includes a front-end that translates Dyna programs to bag-relational expressions, a Python API, hundreds of term rewriting rules, and a procedural engine for determining which rewrite rules to apply. The rewrite rules generalize techniques used in constraint logic programming. In practice, our system is usually able to provide simple answers to queries.

Mixing disparate programming paradigms is not without challenges. We had to rethink the classical techniques used to implement logic programming languages. This includes the development of a novel approach for memoization (dynamic programming) that supports partial memoization of fully or partially simplified algebraic expressions, which may contain delayed, unevaluated constraints. Furthermore, real-world Dyna programs require fast and efficient execution. For this reason, I present a novel approach to just-in-time (JIT) compile sequences of term rewrites using a custom tracing JIT.

Evaluation of Logic Programs with Built-Ins and Aggregation: A Calculus for Bag Relations [paper] [Code]

Authors: Matthew Francis-Landau, Tim Vieira, Jason Eisner

We present a scheme for translating logic programs, which may use aggregation and arithmetic, into algebraic expressions that denote bag relations over ground terms of the Herbrand universe. To evaluate queries against these relations, we develop an operational semantics based on term rewriting of the algebraic expressions. This approach can exploit arithmetic identities and recovers a range of useful strategies, including lazy strategies that defer work until it becomes possible or necessary.

Exact and/or Fast Nearest Neighbors [Paper] [code]

Authors: Matthew Francis-Landau, Benjamin Van Durme

Prior methods for retrieval of nearest neighbors in high dimensions are fast and approximate–providing probabilistic guarantees of returning the correct answer–or slow and exact performing an exhaustive search. We present Certified Cosine, a novel approach to nearest-neighbors which takes advantage of structure present in the cosine similarity distance metric to offer certificates. When a certificate is constructed, it guarantees that the nearest neighbor set is correct, possibly avoiding an exhaustive search. Certified Cosine’s certificates work with high dimensional data and outperform previous exact nearest neighbor methods on these datasets.

MFST: A Python OpenFst Wrapper With Support for Custom Semirings and Jupyter Notebooks [Paper] [Code]

Authors: Matthew Francis-Landau

This paper introduces mFST, a new Python library for working with Finite-State Machines based on OpenFst. mFST is a thin wrapper for OpenFst and exposes all of OpenFst’s methods for manipulating FSTs. Additionally, mFST is the only Python wrapper for OpenFst that exposes OpenFst’s ability to define a custom semirings. This makes mFST ideal for developing models that involve learning the weights on a FST or creating neuralized FSTs. mFST has been designed to be easy to get started with and has been previously used in homework assignments for a NLP class as well in projects for integrating FSTs and neural networks. In this paper, we exhibit mFST API and how to use mFST to build a simple neuralized FST with PyTorch.

Dyna: Toward a Self-Optimizing Declarative Language for Machine Learning Applications [Paper] [Slides]

Authors: Tim Vieira, Matthew Francis-Landau, Nathaniel Wesley Filardo, Farzad Khorasani, Jason Eisner

Declarative programming is a paradigm that allows programmers to specify what they want to compute, leaving how to compute it to a solver. Our declarative programming language, Dyna, is designed to compactly specify computations like those that are frequently encountered in machine learning. As a declarative language, Dyna’s solver has a large space of (correct) strategies available to it. We describe a reinforcement learning framework for adaptively choosing among these strategies to maximize efficiency for a given workload. Adaptivity in execution is especially important for software that will run under a variety of workloads, where no fixed policy works well. We hope that reinforcement learning will identify good policies reasonably quickly—offloading the burden of writing efficient code from human programmers.

Fine-grained parallelism in probabilistic parsing with Habanero Java [Paper] [slides] [source code]

Authors: Matthew Francis-Landau, Bing Xue, Jason Eisner, Vivek Sarak

Structured prediction algorithms—used when applying machine learning to tasks like natural language parsing and image understanding—present some opportunities for fine-grained parallelism, but also have problem-specific serial dependencies. Most implementations exploit only simple opportunities such as parallel BLAS, or embarrassing parallelism over input examples. In this work we explore an orthogonal direction: using the fact that these algorithms can be described as specialized forward-chaining theorem provers and implementing fine-grained parallelization of the forward-chaining mechanism. We study context-free parsing as a simple canonical example, but the approach is more general.

Research From Undergraduate Studies at UC Berkeley

Capturing Semantic Similarity for Entity Linking with Convolutional Neural Networks [Paper] [Poster] [Source code]

Authors: Matthew Francis-Landau, Greg Durrett, and Dan Klein.

A key challenge in entity linking is making effective use of contextual information to disambiguate mentions that might refer to different entities in different contexts. We present a model that uses convolutional neural networks to capture semantic correspondence between a mention’s context and a proposed target entity. These convolutional networks operate at multiple granularities to exploit various kinds of topic information, and their rich parameterization gives them the capacity to learn which n-grams characterize different topics. We combine these networks with a sparse linear model to achieve state-of-the-art performance on multiple entity linking datasets, outperforming the prior systems of Durrett and Klein (2014) and Nguyen et al. (2014).

Accepted to NAACL 2016.

DJ: Distributed JIT [Documentation] [Source code] [Preprint paper]

DJ automatically rewrites Java programs that are written for a single host to be able to operate in a distributed environment.   The system works by rewriting Java bytecode and providing a runtime environment that supports features such as remote memory access, distributed locks, etc.  Finally, the programs distributed state is controlled by a “JIT” which is separate from the core DJ system, as such, it can be switched by a user to change the behavior of the distributed program.

Slides from September 2015 technical presentation

Gang scheduling for predictable performance [Paper] [Poster]

Targeted issues of scheduling High-Performance Computing (HPC) applications in Cloud Computing environments.  Developed a Gang Scheduler for the XEN hypervisor.  This ensures that only one virtual machine is running for any set of competing resources.

Post project:  Developed an extremely simplified Gang scheduler for the Linux Kernel as a point for comparison with XEN.  (source)