THIS PAGE ARCHIVES A PAST CS162 CLASS; THIS COURSE IS NOT ACTIVE

THIS PAGE ARCHIVES A PAST CS162 CLASS; THIS COURSE IS NOT ACTIVE

THIS PAGE ARCHIVES A PAST CS162 CLASS; THIS COURSE IS NOT ACTIVE

THIS PAGE ARCHIVES A PAST CS162 CLASS; THIS COURSE IS NOT ACTIVE

THIS PAGE ARCHIVES A PAST CS162 CLASS; THIS COURSE IS NOT ACTIVE

THIS PAGE ARCHIVES A PAST CS162 CLASS; THIS COURSE IS NOT ACTIVE

THIS PAGE ARCHIVES A PAST CS162 CLASS; THIS COURSE IS NOT ACTIVE

THIS PAGE ARCHIVES A PAST CS162 CLASS; THIS COURSE IS NOT ACTIVE

THIS PAGE ARCHIVES A PAST CS162 CLASS; THIS COURSE IS NOT ACTIVE






Due Date

Due 11:59:59pm Monday, February 23

Description

The goal of this assignment is to become familiar with Prolog, and to gain some experience with logic programming. While you may develop your code using any Prolog implementation you want for this assignment, we recommend using SWI-Prolog, which is available on CSIL via the swipl command. We will be testing your solutions using SWI-Prolog, so make sure your solutions work correctly on SWI-Prolog before submitting via turnin.

For both part A and part B, we use the term “procedure” to refer to the code that you must provide. A procedure is a non-empty set of clauses which have the same name and arity. For example, the between0And5 procedure is shown below, which simply unifies its only argument with a number in the range [0, 5]:

between0And5(0).
between0And5(1).
between0And5(2).
between0And5(3).
between0And5(4).
between0And5(5).

For the sake of this assignment, a single clause also constitutes a procedure. That is, if you are able to write one of the intended procedures using only a single clause, this is still perfectly acceptable. For example, with part B, if there is only one solve clause, this is fine.

Part A: Tutorial

The first part of this assignment is to go through a tutorial covering Prolog, available here. You need not read chapters 7, 13, 14, and 15. We have provided a set of companion exercises for the tutorial here, indexed by tutorial chapter, which must be completed as part of the assignment.

In addition to the tutorial itself, you should also take a look at this code, which implements a simple graph and graph traversal in Prolog. This may prove useful in implementing the ontology portion later in the assignment.

Part B: List Operations

This part of the assignment involves you implementing a series of operations over lists. Download the skeleton code here. You must fill in all the portions labeled ---FILL ME IN---.

Part C: Ontologies

An ontology is a formal description of the knowledge within a domain, complete with ideas and the relationships between said ideas. Ontologies allow for users to rapidly explore a knowledge base, and well-defined representations allow for even automated exploration.

Within the domain of Biology, an ontology of high importance is that of the Gene Ontology (GO). The GO is intended to be a highly descriptive web of knowledge, specific to biological processes (e.g., cell division), cellular components (e.g., organelles), and the functions of related molecules (e.g., protein binding). This structure allows for scientists to rapidly see which different biological components are related to each other and how, along with relevant paper citations. This allows for questions to be answered easily without doing a full literature search, which is important for a fast-moving field with a vast body of knowledge.

The GO can be, and often is, represented as a series of relations between concepts. This is a rather natural representation, given that a significant amount of information is conveyed through connections between concepts, and that this representation is amenable to the incorporation of new information.

This relational representation lends itself to logic programming. Not only does logic programming make the problem of representing the ontology simple, it provides a natural mechanism to query this representation. Overall, the GO can be treated like a larged directed acyclic graph, and queries over the GO often behave as various kinds of graph traversals (you may wish to review this code again from the tutorial).

For this assignment, you will be working with a small subset of the GO, already written as part of a partial logical program. You can download the program here. Over this subset of the GO, you must write the following procedures:

  1. A procedure named toisaroot, which makes a list showing the path from some given concept C to the root, which is always 'cellular component'. For example, the query:

    toisaroot('DNA helicase complex', List).
    
    ...should yield only:
    List = ['DNA helicase complex', 'catalytic complex', 'protein complex', 
            'macromolecular complex', 'cellular component']
    
    Not every concept will have a root, reflecting the fact that not every concept can reach 'cellular component' strictly via is_a relationships. For example, the query:
    toisaroot(cytoplasm, List).
    
    ...should fail, since there is no way to reach 'cellular component' via is_a relationships from cytoplasm.

  2. A procedure named subconcept to determine if a given concept C1 is a subconcept of some concept C2. This is true if at least one of the following holds:

    1. C1 is a part_of C2

    2. C1 is_a C2

    3. There exists some concept C3 that either is_a or is part_of C2 such that C1 is a subconcept of C3. In other words, C1 either is_a or is part_of C2 transitively.

    For example, the query:
    subconcept('intracellular part', X).
    
    ...should yield that X could be one of: 'cell part', intracellular, cell, or 'cellular component'. Duplicates are ok, and even expected, which reflect the fact that there were different ways to derive the same information.

Part D: Solving Sudoku Puzzles

For this portion of the assignment, you will write a program in Prolog that can solve Sudoku puzzles. Sudoku is defined as a 9x9 grid, composed of nine 3x3 subgrids. Initially, the grid is partially filled in with numbers. The goal of Sudoku is to fill in the remaining empty squares of the grid with respect to the following three constraints:

  1. Each 9-square row must contain each digit in the range [1, 9]. No repeats are possible.

  2. Each 9-square column must contain each digit in the range [1, 9]. No repeats are possible.

  3. Each 9-square 3x3 subgrid must contain each digit in the range [1, 9]. No repeats are possible.

We have provided some template code to help solve this, provided here. To solve this problem, fill in the procedure named solve that takes a list of nine lists, where each inner list represents a row in the Sudoku puzzle. The inner lists are all of length nine, where each inner list element is either an integer in the range [1, 9] or an unbound variable. An example query is like so (substituting ... for additional list components):
Board = [[1, _, ...], ...], solve(Board).
The solve procedure binds all the unbound variables in a way that satisfies the above three constraints. Once all variables have been bound, as long as the above three constraints have not been violated, a solution has been reached. As a hint, the built-in predicates var(X) and nonvar(X) may come in handy.

As for implementing a naive solve, the general idea is the following:

  1. Nondeterministically bind each blank space to a number between 1 and 9.
  2. Ensure that the Sudoku constraints are not violated.
    • If they are not violated, then we have a solution.
    • If they are violated, then failure occurs, triggering backtracking to fill in a different integer than previously chosen.
  3. Repeat the previous two steps repeatedly until either:
    1. We find a solution.
    2. We have tried all possible solutions, and all of them violate the Sudoku constraints.

Sudoku is an example of a problem wherein we can write very simple solutions in Prolog thanks to built-in nondeterminism and backtracking, though most of these tend to be unrealistically slow. In this context, “unrealistically slow” literally means that a solver takes days or even weeks to complete, versus only a few hundred milliseconds on an optimized implementation. While we usually don't care about performance in this class, for this portion of the assignment, it is vital for the purpose of being able to test your code.

For full credit, your solution:

In the interest of making your solver fast, we provide some possible ideas for optimization. Many, many other ideas are possible, though we know for a fact that these optimizations can get things speedy, since our own reference implemenation uses them. We list these ideas below:

For the sake of efficiency, your implementation should not simply try all possible combinations. Instead, you should bind one variable at a time, and check that the above three constraints have not been violated. In case of violation, choose a different number. If there are no more numbers to try, backtrack to a previous choice and go from there. Remember to leverage the Prolog engine for this purpose: it can (and should) automatically handle all the backtracking logic. Only solutions which perform the above optimization (or better) will receive full credit. Of interest here is the is_set predicate, which can be used to determine if all the elements of a list are unique, even if that list contains uninstantiated variables.

Rubric

Deliverables

You can turnin all your code with the provided command:
turnin assign5@cs162 companion/*.pl list_routines.pl ontology.pl sudoku.pl