40% of the course grade is based on your performance on a project. This 40% is divided into several components, listed below with their absolute percentage of the final course grade:
Your project proposal should conform to this template; more details are in that document. This example proposal gives a sense of what I'm looking for. You may use this example as-is, if you so choose, but note that it's very large; this would qualify for the full 25% bonus.
The project proposal asks you to select some capabilities for automated testing. You are being asked to make a decision with incomplete information; it is wholly expected that this information will need to be revised later. In the meantime, minimal details follow about some of the techniques we will discuss later in the class. You may also select a verification approach instead of an automated testing approach, but be advised that verification approaches will likely be more difficult, and we won't cover them until late in the class. A bonus is available if you select a verification-based approach.
The idea here is that we can run what is fundamentally the same test with different inputs, and these inputs can be generated in a straightforward manner. This allows us to run the same kind of test under slightly different parameters, helping to expose edge cases. JUnit 5 (a popular testing framework for Java programs) makes this possible with Parameterized Tests. These links are provided as examples of this technique; the particular way in which JUnit does this has a number of problems associated with it.
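To make this concrete, here's a minimal sketch of what a JUnit 5 parameterized test might look like (the class, method, and input rows are invented for illustration, and the junit-jupiter-params artifact is assumed to be on the classpath):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

class AdditionTest {
    // One test body, run once per row below; the last row exercises
    // Java's int overflow, where Integer.MAX_VALUE + 1 wraps around.
    @ParameterizedTest
    @CsvSource({
        "1, 2, 3",
        "0, 0, 0",
        "-5, 5, 0",
        "2147483647, 1, -2147483648"
    })
    void addHandlesVariedInputs(int x, int y, int expected) {
        assertEquals(expected, x + y);
    }
}
```

The same test body runs once per row, and adding rows is cheap; this is what makes it easy to probe edge cases like the overflow in the last row.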
The previously-described approach doesn't scale up well when testing operations which work with complex data. For example, when testing an API that works with binary trees, it's not clear how to automatically generate different kinds of trees using only a loop. However, binary trees can be described using Context-Free Grammars. We can write generators of sentences that behave according to a context-free grammar, and thus write binary tree generators. We can add randomness to the generation by using Probabilistic Context-Free Grammars. A surprising number of testing problems can be (at least mostly) phrased using context-free grammars.
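As a rough sketch of the idea (all names and probabilities here are invented for illustration), a PCFG-style generator for the grammar `Tree ::= Leaf | Node(Tree, Tree)` might look like this in Java 17+:

```java
import java.util.Random;

// Sketch of a probabilistic-context-free-grammar-style generator
// for the grammar: Tree ::= Leaf | Node(Tree, Tree).
public class TreeGen {
    sealed interface Tree permits Leaf, Node {}
    record Leaf(int value) implements Tree {}
    record Node(Tree left, Tree right) implements Tree {}

    private static final Random RNG = new Random();

    // Each call expands the Tree nonterminal: with probability leafProb
    // we choose the Leaf production, otherwise the Node production,
    // which recursively expands two more Tree nonterminals.
    static Tree generate(double leafProb) {
        if (RNG.nextDouble() < leafProb) {
            return new Leaf(RNG.nextInt(100));
        }
        return new Node(generate(leafProb), generate(leafProb));
    }

    public static void main(String[] args) {
        System.out.println(generate(0.6)); // a fresh random tree each run
    }
}
```

Keeping the leaf probability above 0.5 makes each `Node` expansion produce fewer than one further `Node` on average, so the recursion terminates with probability 1.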
We can describe systems in terms of properties we expect them to have, and these properties can be expressed with executable code. For example, if we are writing a function `add(x, y)` which is intended to add `x` and `y` together, then one property may be `add(x, y) == add(y, x)` (that is, addition is commutative). From a high level, it is straightforward to go from properties to tests, as long as we have a way of filling in variables with appropriate data. With the example above, this means trying different values of `x` and `y`, and ensuring that the property is never violated.
Testing frameworks exist that can assist with property-based testing, including:
Ideally, the testing framework can automatically handle the generation of inputs, but in many cases you may need to assist the framework in this task.
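For instance, here is a minimal sketch of the commutativity property from above written with jqwik, one property-based testing framework for Java (the `add` method stands in for the hypothetical function under test):

```java
import net.jqwik.api.ForAll;
import net.jqwik.api.Property;

class AdditionProperties {
    // The hypothetical function under test.
    private static int add(int x, int y) {
        return x + y;
    }

    // jqwik generates many (x, y) pairs and checks that the property
    // holds for every one of them; a false return value is a failure.
    @Property
    boolean additionIsCommutative(@ForAll int x, @ForAll int y) {
        return add(x, y) == add(y, x);
    }
}
```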
Concolic Execution is an automated, white-box technique that attempts to generate a variety of test inputs which collectively exhaustively test a given software system. In practice, most software systems would require an infinite number of inputs for full exhaustive testing, so certain compromises must be made. A variety of tools exist which can perform concolic execution; see the link for more details. Getting these tools working correctly, and having them generate useful tests, can require a significant amount of effort; this technique is not as automated as it may first seem.
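To give a feel for what concolic execution buys you, consider this hypothetical Java method: random input generation is very unlikely to ever hit the guarded branch, but a concolic engine tracks the branch condition symbolically and can solve for an input that reaches it.

```java
public class ConcolicTarget {
    // The failing branch is guarded by a narrow condition. Random inputs
    // almost never satisfy it, but a concolic engine collects the branch
    // condition x == 987654321 along an execution path and asks a solver
    // for an input that flips the branch, reaching the throw directly.
    public static int classify(int x) {
        if (x == 987654321) {
            throw new IllegalStateException("rare branch reached");
        }
        return x < 0 ? -1 : 1;
    }
}
```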
Model Checking is generally intended for testing software at the specification level, not the implementation level. Specifications are written in special languages, and variants of Linear Temporal Logic are popular for this purpose. Some tools allow for verification (proving the specification correct), as opposed to just testing.
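For example, a classic temporal property written in LTL syntax is `G (request -> F grant)`, read as: it is always (`G`, "globally") the case that a request is eventually (`F`, "finally") followed by a grant.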
Dafny is a relatively basic programming language with built-in verification functionality. The goal is to write programs which can be proven correct, according to a given specification. Unlike model checking, this verification extends to the implementation, not just the specification. While the Dafny language itself is very simple, there is a definite learning curve to getting it to verify even seemingly simple things. For example, it took me:
While Dafny has a pretty steep learning curve, it's surprisingly easy to pick up relative to other verification approaches. However, Dafny is relatively young, and the incredibly high complexity of its implementation makes it difficult to trust.
Dependent Type Theory allows us to encode a Type System which is incredibly expressive.
We can then rephrase an abstract question like “Does this program work correctly?” as the equivalent, much more concrete question “Does this program have any type errors?”
Dependent types are commonly used in programming languages research to verify properties of programming languages, and Coq is a particularly popular and well-established tool for working with dependent types.
This verification approach is mature, relatively easy to trust, and behaves in a manner which can be predicted.
However, this requires an exceptional amount of background information, and it can be downright frustrating to learn.
Unless you're already familiar with this approach, verifying even trivial things can be a challenge (I once spent 12 hours trying to prove that `a + (b + c)` was equal to `(a + b) + c` in Coq, and I eventually gave up).
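For a sense of scale, here is a sketch of that same fact in Lean, a proof assistant broadly comparable to Coq. With library support it is a one-liner; from first principles it takes a small induction, and knowing which induction to do is exactly the background this approach demands.

```lean
-- With library support, associativity of Nat addition is immediate:
theorem assocViaLibrary (a b c : Nat) : a + (b + c) = (a + b) + c :=
  (Nat.add_assoc a b c).symm

-- From first principles, it takes an induction on c:
theorem assocByHand (a b c : Nat) : a + (b + c) = (a + b) + c := by
  induction c with
  | zero => rfl
  | succ c ih => rw [Nat.add_succ, Nat.add_succ, Nat.add_succ, ih]
```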
Note that the above list is not exhaustive; you may select something that's not on this list.
You must also write a short project report, which discusses what you've done and what you'd do differently if you could do it all over again. A template is available here. The main purpose of the report is to explain exactly what you've done. I will use this report during grading to confirm that you've done what you set out to do in your proposal.
If you'd prefer, I'll also accept, as evidence, a video of you going through your code and explaining your changes. If you take this route, you still need to cover the same content that the template expects.