DopCert is a framework developed using the Coq Proof Assistant for formally verifing query optimizations in databases. It consists of the following parts: 1) HoTTSQL, a SQL like language that covers all major SQL features, including selection-projection-join, aggregation, correlated subqueries, and indexes. We provide a machine checkable denotational semantics of HoTTSQL using Homotopy Type Theory for proving rewriting rules. 2) A library that consists of building blocks of proofs and automatic decision procedures (e.g., decision procedure for conjunctive queries). 3) Machine checkable proofs for existing rewrite rules from database literature as well as real-world optimizers, ranging from basic ones such as selection push down to complex ones such as magic sets rewrites.
For more information, visit the DopCert website.
Internet Service Providers (ISPs) rely on the Border Gateway Protocol (BGP) to exchange routing information, which is necessary to deliver traffic over the Internet. The BGP behavior of an ISP’s routers is determined by their configurations. Router misconfigurations have caused major monetary loss, performance degradation, service outages, and security violations. Some of the effects of misconfiguration are highly visible, such as the worldwide extended downtime of YouTube in 2009 and route hijacks by China Telecom in 2010 and 2014. Less visible is the high cost of developing and maintaining correct configurations, which requires checking invariants across hundreds of thousands of lines of configuration for all of an ISP’s routers.
Bagpipe is a tool which enables an ISP to express its BGP policy in a domain-specific specification language and verify that its router configurations implement this policy. We evaluated Bagpipe on Internet2 and Selfnet, two ISPs with a combined total of over 100,000 lines of router configuration. We identified and expressed policies for these ISPs, and found 19 inconsistencies between the policies and the router configurations without issuing any false positives.
For more information, visit the bagpipe website.
Error-prone is a Java compiler extension that is used in Google’s internal build system to eliminate classes of serious bugs from entering the code base.
Error-prone checks used to be based on AST matching, which made it hard to write
checks that require flow-sensitive information.
For example, one might want to check that
a field access does not happen with a
or that a lock is released after it is acquired.
During my Google internship, I ported the Checker Framework’s dataflow framework to error-prone (the port required bug fixes, performance improvements, feature enhancements, and infrastructure changes).
It is now easy to write flow-sensitive error-prone checks. Some of these checks are run at Google with every Java code commit.
Most programming languages support format strings, but their use is error-prone. Using the wrong format string syntax, or passing the wrong number or type of arguments, leads to unintelligible text output, program crashes, or security vulnerabilities.
In this project, we developed a type system that guarantees that calls to format string APIs will never fail. In Java, this means that the API will not throw exceptions. In C, this means that the API will not return negative values, corrupt memory, etc.
We instantiated this type system for Java’s Formatter API, and evaluated it on 6 large and well-maintained open-source projects. Format string bugs are common in practice (our type system found 104 bugs), and the annotation burden on the user of our type system is low (on average, for every bug found, only 1.0 annotations need to be written).
One important advantage of virtualizing a computer’s physical resources is that it enables resource overcommitment (more resources are promised than can be delivered, in the hope that they will not actually be used).
Traditionally, memory overcommitment has been handled by lazily assigning memory to the virtualized guest, dynamically redistributing the guests’ memory, or swapping out guest memory. These techniques have problems, that we overcame by sharing more information about memory usage between the hypervisor (Linux KVM) and guest (any Operating System), in an approach called Collaborative Memory Management (CMM). The code is part of the Linux kernel (see this patch, and this follow up patch).
A spinlock is a commonly used locking primitive that spins in an idle loop until the accessed resource becomes unlocked. If the CPU is overcommited (there are more virtual cores than physical cores) not all virtual cores can run at the same time. In such a scenario, holding a spinlock can waste large amounts of CPU time. We overcame this problem by sharing information about held spinlocks between the hypervisor and guest, which allows the hypervisor to make better CPU scheduling descisions. The code is part of the Linux kernel (see this patch).
“Formal Semantics & Verification for the Border Gateway Protocol” (Tech Report) by Konstantin Weitz, Doug Woos, Arvind Krishnamurthy, Michael D. Ernst, and Zachary Tatlock.
“Scalable Verification of Border Gateway Protocol Configurations with an SMT Solver” by Konstantin Weitz, Doug Woos, Emina Torlak, Michael D. Ernst, Arvind Krishnamurthy, and Zachary Tatlock. In OOPSLA 2016.
“Formal Semantics and Automated Verification for the Border Gateway Protocol” by Konstantin Weitz, Doug Woos, Emina Torlak, Michael D. Ernst, Arvind Krishnamurthy, and Zachary Tatlock. In NetPL 2016.
“HoTTSQL: Proving Query Rewrites with Univalent SQL Semantics (arXiv)” by Shumo Chu, Konstantin Weitz, Alvin Cheung and Dan Suciu.
“Real-Time Collaborative Analysis with (Almost) Pure SQL: A Case Study in Biogeochemical Oceanography” by Daniel Halperin, Francois Ribalet, Konstantin Weitz, Mak A. Saito, Bill Howe, and E. Virginia Armbrust. In SSDBM 2013.
Download: PDF, BibTeX Entry, Talk
This blog post explains the advantages and disadvantages of using Abstract Data Types (ADTs, see TaPL Chapter 24) in a Proof Assistant. On the plus side, ADTs promote data representation independence, code reuse, and clean code extraction; but they also do not support fix/match syntax, have to expose derived operations, and prohibit computational reasoning.Read more ...
Pictures are essential to conveying proofs and definitions (Pinto and Tall, 2002), and are thus common in traditional pen and paper mathematics. While proof assistants offer several benefits over pen and paper mathematics, proofs and definitions in proof assistants are entirely textual (i.e. lack images), and are thus often harder to convey than their traditional equivalents. This blog post changes this unsatisfactory situation by showing how to write pictorial definitions in a proof assistant. Specifically, this blog post shows how to write commutative diagrams for category theory in Coq.Read more ...
This post provides formal proofs of three paradoxes that occur in type systems with unrestricted recursion, negative data types, or type in type (Girard’s paradox). The title is a pun on BuzzFeedy headlines.Read more ...
This post describes how to use data abstractions, specifically Abstract Data Types (ADTs) and Object Oriented Programming (OOP), to hide a value’s representation in a functional programming language with dependent types.Read more ...
This post shows connections between natural numbers, types, sets and propositions. This post draws ideas from the Curry Howard correspondence which shows a connection between functional programs and mathematical proofs, and Chris Taylor’s blog post which shows a connection between natural numbers and Haskell types.Read more ...