40% of the course grade is based on your performance on a project. This 40% is divided into several components, listed below with their absolute percentage of the final course grade:
Your project proposal should conform to this template; more details are in that document. This example proposal gives a sense of what I'm looking for. You may use this example as-is, if you so choose, but note that it's very large; this would qualify for the full 25% bonus.
The project proposal asks you to select some capabilities for automated testing. You are being asked to make a decision with incomplete information; it is wholly expected that this information will need to be revised later. In the meantime, minimal details follow about some of the techniques we will discuss later in the class. You may also select a verification approach instead of an automated testing approach, but be advised that verification approaches will likely be more difficult, and we won't cover them until late in the class. A bonus is available if you select a verification-based approach.
The idea here is that we can run what is fundamentally the same test with different inputs, and these inputs can be generated in a straightforward manner. This allows us to run the same kind of test under slightly different parameters, helping to expose edge cases. JUnit 5 (a popular testing framework for Java programs) makes this possible with Parameterized Tests. These links are provided as examples of this technique; the particular way in which JUnit does this has a number of problems associated with it.
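To make this concrete, here's a minimal sketch of what a JUnit 5 parameterized test might look like (the class, method, and input rows are invented for illustration, and the junit-jupiter-params artifact is assumed to be on the classpath):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

class AdditionTest {
    // One test body, run once per row below; the last row exercises
    // Java's int overflow, where Integer.MAX_VALUE + 1 wraps around.
    @ParameterizedTest
    @CsvSource({
        "1, 2, 3",
        "0, 0, 0",
        "-5, 5, 0",
        "2147483647, 1, -2147483648"
    })
    void addHandlesVariedInputs(int x, int y, int expected) {
        assertEquals(expected, x + y);
    }
}
```

The same test body runs once per row, and adding rows is cheap; this is what makes it easy to probe edge cases like the overflow in the last row.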
The previously-described approach doesn't scale up well when testing operations which work with complex data. For example, when testing an API that works with binary trees, it's not clear how to automatically generate different kinds of trees using only a loop. However, binary trees can be described using Context-Free Grammars. We can write generators of sentences that behave according to a context-free grammar, and thus write binary tree generators. We can add randomness to the generation by using Probabilistic Context-Free Grammars. A surprising number of testing problems can be (at least mostly) phrased using context-free grammars.
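As a rough sketch of the idea (all names and probabilities here are invented for illustration), a PCFG-style generator for the grammar `Tree ::= Leaf | Node(Tree, Tree)` might look like this in Java 17+:

```java
import java.util.Random;

// Sketch of a probabilistic-context-free-grammar-style generator
// for the grammar: Tree ::= Leaf | Node(Tree, Tree).
public class TreeGen {
    sealed interface Tree permits Leaf, Node {}
    record Leaf(int value) implements Tree {}
    record Node(Tree left, Tree right) implements Tree {}

    private static final Random RNG = new Random();

    // Each call expands the Tree nonterminal: with probability leafProb
    // we choose the Leaf production, otherwise the Node production,
    // which recursively expands two more Tree nonterminals.
    static Tree generate(double leafProb) {
        if (RNG.nextDouble() < leafProb) {
            return new Leaf(RNG.nextInt(100));
        }
        return new Node(generate(leafProb), generate(leafProb));
    }

    public static void main(String[] args) {
        System.out.println(generate(0.6)); // a fresh random tree each run
    }
}
```

Keeping the leaf probability above 0.5 makes each `Node` expansion produce fewer than one further `Node` on average, so the recursion terminates with probability 1.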
We can describe systems in terms of properties we expect them to have, and these properties can be expressed with executable code. For example, if we are writing a function `add(x, y)` which is intended to add `x` and `y` together, then one property may be `add(x, y) == add(y, x)` (that is, addition is commutative). From a high level, it is straightforward to go from properties to tests, as long as we have a way of filling in variables with appropriate data. With the example above, this means trying different values of `x` and `y`, and ensuring that the property is never violated.
Testing frameworks exist that can assist with property-based testing, including:
Ideally, the testing framework can automatically handle the generation of inputs, but in many cases you may need to assist the framework in this task.
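For instance, here is a minimal sketch of the commutativity property from above written with jqwik, one property-based testing framework for Java (the `add` method stands in for the hypothetical function under test):

```java
import net.jqwik.api.ForAll;
import net.jqwik.api.Property;

class AdditionProperties {
    // The hypothetical function under test.
    private static int add(int x, int y) {
        return x + y;
    }

    // jqwik generates many (x, y) pairs and checks that the property
    // holds for every one of them; a false return value is a failure.
    @Property
    boolean additionIsCommutative(@ForAll int x, @ForAll int y) {
        return add(x, y) == add(y, x);
    }
}
```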
Concolic Execution is an automated, white-box technique that attempts to generate a variety of test inputs which collectively exhaustively test a given software system. In practice, most software systems would require an infinite number of inputs for full exhaustive testing, so certain compromises must be made. A variety of tools exist which can perform concolic execution; see the link for more details. Getting these tools working correctly, and having them generate useful tests, can require a significant amount of effort; this technique is not as automated as it may first seem.
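To give a feel for what concolic execution buys you, consider this hypothetical Java method: random input generation is very unlikely to ever hit the guarded branch, but a concolic engine tracks the branch condition symbolically and can solve for an input that reaches it.

```java
public class ConcolicTarget {
    // The failing branch is guarded by a narrow condition. Random inputs
    // almost never satisfy it, but a concolic engine collects the branch
    // condition x == 987654321 along an execution path and asks a solver
    // for an input that flips the branch, reaching the throw directly.
    public static int classify(int x) {
        if (x == 987654321) {
            throw new IllegalStateException("rare branch reached");
        }
        return x < 0 ? -1 : 1;
    }
}
```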
Model Checking is generally intended for testing software at the specification level, not the implementation level. Specifications are written in special languages, and variants of Linear Temporal Logic are popular for this purpose. Some tools allow for verification (proving the specification correct), as opposed to just testing.
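For example, a classic temporal property written in LTL syntax is `G (request -> F grant)`, read as: it is always (`G`, "globally") the case that a request is eventually (`F`, "finally") followed by a grant.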
Dafny is a relatively basic programming language with built-in verification functionality. The goal is to write programs which can be proven correct, according to a given specification. Unlike model checking, this verification extends to the implementation, not just the specification. While the Dafny language itself is very simple, there is a definite learning curve to getting it to verify even seemingly simple things. For example, it took me:
While Dafny has a pretty steep learning curve, it's surprisingly easy to pick up relative to other verification approaches. However, Dafny is relatively young, and the incredibly high complexity of its implementation makes it difficult to trust.
Dependent Type Theory allows us to encode a Type System which is incredibly expressive.
We can then rephrase an abstract question like “Does this program work correctly?” as the equivalent, much more concrete question “Does this program have any type errors?”
Dependent types are commonly used in programming languages research to verify properties of programming languages, and Coq is a particularly popular and well-established tool for working with dependent types.
This verification approach is mature, relatively easy to trust, and behaves in a manner which can be predicted.
However, this requires an exceptional amount of background information, and it can be downright frustrating to learn.
Unless you're already familiar with this approach, verifying even trivial things can be a challenge (I once spent 12 hours trying to prove that `a + (b + c)` was equal to `(a + b) + c` in Coq, and I eventually gave up).
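For a sense of scale, here is a sketch of that same fact in Lean, a proof assistant broadly comparable to Coq. With library support it is a one-liner; from first principles it takes a small induction, and knowing which induction to do is exactly the background this approach demands.

```lean
-- With library support, associativity of Nat addition is immediate:
theorem assocViaLibrary (a b c : Nat) : a + (b + c) = (a + b) + c :=
  (Nat.add_assoc a b c).symm

-- From first principles, it takes an induction on c:
theorem assocByHand (a b c : Nat) : a + (b + c) = (a + b) + c := by
  induction c with
  | zero => rfl
  | succ c ih => rw [Nat.add_succ, Nat.add_succ, Nat.add_succ, ih]
```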
Note that the above list is not exhaustive; you may select something that's not on this list.
You must also write a short project report, which discusses what you've done and what you'd do differently if you could do it all over again. A template is available here. The main purpose of the report is to explain exactly what you've done. I will use this report during grading to confirm that you've done what you set out to do in your proposal.
If you'd prefer, I'll also accept, as evidence, a video of you going through your code and explaining your changes. If you take this route, you still need to cover the same content that the template expects.