Open Source Projects

Bagpipe verifies Border Gateway Protocol (BGP) router configurations, and identified 19 inconsistencies in over 240 KLOC of industrial configurations. We also developed the first mechanized formal semantics of BGP, which we used to verify Bagpipe, and to formalize Gao & Rexford’s proof on the convergence of BGP (revealing a necessary extension). See website, OOPSLA’16, NetPL’16, tech report.

SpaceSearch is a Coq library that enables the verification of solver-aided tools. We used SpaceSearch to build and verify: Bagpipe, and SaltShaker which identified 7 bugs in the RockSalt and 1 bug in the STOKE x86 semantics. See website, paper.

Cosette is a framework for solving SQL query equivalences, using the Coq proof assistant and the Rosette solver-aided language. See website, CIDR’17 (to appear), arXiv.

Error Prone is a static analysis tool for Java that catches programming mistakes at compile-time. During my Google internship, I extended error prone with a dataflow analysis framework, which is run at Google with every Java code commit. See error prone website, and some of my commits here, here, here.

A Type System for Format Strings (which are passed to printf). On 6 large and well-maintained open-source Java projects, format string bugs were common (our type system found 104 bugs), and can be avoided with little effort using our type system (on average, for every bug found, only 1.0 type system annotations need to be written). See ISSTA’14, ISSTA’14 (demo), manual.

Hardware Assisted Guest-Hypervisor Information Sharing for the Linux Kernel Virtual Machine (KVM) on IBM’s System Z. During an IBM internship, I extended the Linux KVM to use guest information about memory usage and spinlocks to improve memory allocation and scheduling decisions.

One important advantage of virtualizing a computer’s physical resources is that it enables resource overcommitment (more resources are promised than can be delivered, in the hope that they will not actually be used). Traditionally, memory overcommitment has been handled by lazily assigning memory to the virtualized guest, dynamically redistributing the guests’ memory, or swapping out guest memory. These techniques have problems, that we overcame by sharing more information about memory usage between the hypervisor (Linux KVM) and guest (any Operating System), in an approach called Collaborative Memory Management (CMM). The code is part of the Linux kernel, see this patch, and this follow up patch.

A spinlock is a commonly used locking primitive that spins in an idle loop until the accessed resource becomes unlocked. In such a scenario, holding a spinlock can waste large amounts of CPU time. We overcame this problem by sharing information about held spinlocks between the hypervisor and guest, which allows the hypervisor to make better CPU scheduling decisions. The code is part of the Linux kernel, see this patch.


Publications

Blog Posts

HoTTSQL: Proving Query Rewrites with Univalent SQL Semantics

SQL is the lingua franca for retrieving structured data. Existing semantics for SQL, however, either do not model crucial features of the language (e.g., relational algebra lacks bag semantics, correlated subqueries, and aggregation), or make it hard to formally reason about SQL query rewrites (e.g., the SQL standard’s English is too informal). This post focuses on the ways that HoTT concepts (e.g., Homotopy Types, the Univalence Axiom, and Truncation) enabled us to develop HoTTSQL — a new SQL semantics that makes it easy to formally reason about SQL query rewrites. Our paper also details the rich set of SQL features supported by HoTTSQL.

Read more ...

(Dis)advantages of using Abstract Data Types in Proof Assistants

This blog post explains the advantages and disadvantages of using Abstract Data Types (ADTs, see TaPL Chapter 24) in a Proof Assistant. On the plus side, ADTs promote data representation independence, code reuse, and clean code extraction; but they also do not support fix/match syntax, have to expose derived operations, and prohibit computational reasoning.

Read more ...

Machine Checkable Pictorial Mathematics for Category Theory

Pictures are essential to conveying proofs and definitions (Pinto and Tall, 2002), and are thus common in traditional pen and paper mathematics. While proof assistants offer several benefits over pen and paper mathematics, proofs and definitions in proof assistants are entirely textual (i.e. lack images), and are thus often harder to convey than their traditional equivalents. This blog post changes this unsatisfactory situation by showing how to write pictorial definitions in a proof assistant. Specifically, this blog post shows how to write commutative diagrams for category theory in Coq.

Read more ...

3 Shocking Ways to Break Mathematics

This post provides formal proofs of three paradoxes that occur in type systems with unrestricted recursion, negative data types, or type in type (Girard’s paradox). The title is a pun on BuzzFeedy headlines.

Read more ...

Abstract Data Types and Object Oriented Programming in Coq (Part I)

This post describes how to use data abstractions, specifically Abstract Data Types (ADTs) and Object Oriented Programming (OOP), to hide a value’s representation in a functional programming language with dependent types.

Read more ...

Connections between Natural Numbers, Types, Sets, and Propositions

This post shows connections between natural numbers, types, sets and propositions. This post draws ideas from the Curry Howard correspondence which shows a connection between functional programs and mathematical proofs, and Chris Taylor’s blog post which shows a connection between natural numbers and Haskell types.

Read more ...