CS24 Week 2 Lab: Multiple Files, Allocation, Structures, and GDB

Due Date: Tuesday, July 8 at 11:59:59 PM

This lab covers memory allocation and structures, along with the use of the GDB debugger. After completing this lab, you should be able to:

Pair Programming

You may work with a partner for this lab. If you choose a partner, both members must contribute to the same work. That is, you cannot simply divide the lab in half, as everyone is responsible for all the content. Instead, you should use a pair programming style. In pair programming, one person acts as a driver, who takes control of the keyboard. The other person acts as a navigator, who reviews what the driver writes and offers advice. Oftentimes, the driver is occupied with details of the problem (e.g., variables, control flow, etc.), whereas the navigator is concerned with bigger-picture problems (e.g., functions, design, interactions). This works well for problems which are simply too large to fit into one person's head. If you're unfamiliar with the concept of pair programming, you may wish to watch this video on the topic.

With pair programming, oftentimes it works best if both participants are of approximately the same skill level. Generally, if the disparity in skill is large, the end result is frustrating for both parties - it really only works when both parties operate at approximately the same speed. With this in mind, you should select someone of approximately the same skill level.

Reading

Because these are review concepts, there are more resources this week than in a normal week. If you already understand the concept, you may skip the reading. All of these concepts are in your CS16 textbook, which you should still possess for reference in this class.

Part A: Splitting Code into Multiple Files

The purpose of this part is to learn how to take some actual code which is defined in one file and split it into multiple files. This is a fairly common task to perform, and oftentimes the goal is to write code across multiple files in the first place.

Step A-1: Create Directories and Copy Files

-bash-4.2$ cd
-bash-4.2$ pwd

/cs/student/yourusername

-bash-4.2$ cd cs24
-bash-4.2$ pwd

/cs/student/yourusername/cs24
-bash-4.2$ mkdir lab2
-bash-4.2$ cd lab2
-bash-4.2$ pwd

/cs/student/yourusername/cs24/lab2

Now copy over the files for this week. They are kept in the directory: /cs/student/kyledewey/cs24/labs/2

-bash-4.2$ cp /cs/student/kyledewey/cs24/labs/2/* .

Note that there is a '.' at the end of that line. cp means copy. /cs/student/kyledewey/cs24/labs/2/* means all of the files contained in the directory /cs/student/kyledewey/cs24/labs/2. The '.' is the destination: it refers to the current directory, so the files are copied from that location into the directory you are in now.

Now check to make sure the copy worked correctly:

-bash-4.2$ ls

You should see a set of files listed that will be used for this warm-up.

Step A-2: Split the Files

Your first task is to split up the code that will be used for problem.c into three files - problemmain.c, problemhelpers.c, and problemhelpers.h. Instructions for this are presented in a per-file fashion.

File problemmain.c

You will not be able to compile this until after you have completed the header file (problemhelpers.h).

File problemhelpers.h

Now you are ready to compile just the main file. To compile just one file of a program that spans multiple files, you use the -c flag, like so:

-bash-4.2$ clang -c problemmain.c

Notice with the command above, problemhelpers.h was not included on the command line. This is intentional. Recall from lecture that header files are copied-and-pasted as-is where #include lines are, so their contents will be automatically included where needed in problemmain.c. Intuitively, header files contain declarations and not implementations, and so they should not be included in the compilation line.
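To make the split between declarations and implementations concrete, here is a minimal sketch. The function add_one and the file names mathhelpers.h and mathhelpers.c are hypothetical and are not part of the lab files. For brevity both pieces are shown in one listing, which is roughly what the compiler sees once the header has been pasted in place of the #include line.

```c
/* ---- would live in a hypothetical header, e.g. mathhelpers.h ---- */
#ifndef MATHHELPERS_H
#define MATHHELPERS_H

/* Declaration only: the name, parameter types, and return type. */
int add_one(int x);

#endif

/* ---- would live in the matching .c file, e.g. mathhelpers.c ---- */

/* Definition: the actual implementation behind the declaration above. */
int add_one(int x)
{
    return x + 1;
}
```

A file that only calls add_one needs just the declaration in order to compile with -c. The definition is needed later, when the object files are combined into an executable.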

The result should compile without errors or warnings. If you do run into errors or warnings, be sure to:

If this is insufficient to figure it out, ask the TA or the instructor. Recall that you may not show your code to other students, though you are free to ask others about particular error messages (just not the code that goes along with them).

Once you have the code compiling, you will still be unable to run it, since not everything has been compiled along with it. That is, we have compiled the declarations needed for problemmain.c to work, but we haven't yet implemented the code behind those declarations. These declarations will be implemented later, but at the moment the compiler has generated an object file named problemmain.o which contains the compiled version of just this portion.

File problemhelpers.c

Now compile it and see if it compiles:

-bash-4.2$ clang -c problemhelpers.c
If there are errors, follow the same steps as above.

Once you have everything compiling individually, it's time to put it all together into a single executable, like so:

-bash-4.2$ clang problemmain.o problemhelpers.o
You can compare the result to the executable created when you compile the original code, like so:
-bash-4.2$ clang problem.c

Part B: GDB Tutorial

Because we use pointers and dynamic memory allocation in this course, you will find that segmentation faults and other memory-related errors come up often. Oftentimes the fastest way to diagnose and solve these problems is by using GDB. You should follow this step very carefully, since you are expected to be able to replicate these steps in order to debug the programs you will write later in this class.

You have been provided with a program that does the following, in order:

  1. Prints out the command-line arguments it was provided
  2. Adds ".txt" to the end of each argument
  3. Prints out the arguments again

There is a memory-related bug in this program, and you will be stepping through GDB to find this bug and correct it. The important bit with this part is the process used to find and fix the bug.

Step B-1: Readings

You should first peruse the readings below:

Step B-2: Inspect the Code in gdbtutorial.c

First print gdbtutorial.c to the screen so that you can look at it:

-bash-4.2$ cat gdbtutorial.c

Note that the program calls the strcat function, which is not defined in the file. This function is declared in one of the #included files. To remind yourself of what strcat does, type the following:

-bash-4.2$ man strcat

The output of man (short for manual) tells you several things, including:

Step B-3: Find the Bug

First, compile the file. We are using the GCC compiler for this given its tight association with GDB, though this could be done with clang as well. The compilation line used is shown below:

-bash-4.2$ gcc -o nondebugmain gdbtutorial.c

Note that with the line above, the -o option tells GCC to make an executable named nondebugmain as opposed to the default (and far less informative) a.out.

Okay, we're ready to start testing it! You need to start with the executable name and then give it some command-line arguments. For example, you could type something like this.

-bash-4.2$ ./nondebugmain something gdb program1

How about this:

-bash-4.2$ ./nondebugmain programming is rewarding

Something very different from what is intended happens. Think very carefully about the difference between what you think should happen and what is actually happening. Because this involved arrays, and the memory is displaying something very different from what was intended, we need to inspect more closely. We need to use GDB! I'm going to step you through this process as if I was debugging this program with no idea what is wrong. I will also explain why I am choosing to use the different commands I am using. For more information on the commands, you might want to reconsult the readings in the first step of this part.

In order to compile this code for use in GDB you will need to add the -g flag. This flag tells the compiler to include extra debugging symbols in the executable, so that GDB has more information while it's running. For example, these symbols include the line numbers for different portions of the code, which are normally irrelevant to execution. The command to do this follows:

-bash-4.2$ gcc -o debugmain gdbtutorial.c -g

Now we can use GDB on the resulting executable, like so:

-bash-4.2$ gdb debugmain
GNU gdb (GDB) Fedora (7.1-34.fc13)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /cs/faculty/franklin/labfiles/24/week1/debugmain...done.
(gdb) 

If there had been a segmentation fault, I would have just run the program, waited for it to stop, and looked at where we were in the code. Instead, I need to decide where I should start looking at the code. The printfs indicate that the first loop works just fine. The error is clearly present by the time the first iteration of the third loop is encountered. This means the problem must be in the second loop. GDB is only useful if you spend the time to reason about where the problem is likely to be. Now that we have identified the likely culprit to be the second loop, we set a breakpoint there to stop execution so that we can inspect the variable values during this loop.

We need to know which line the second loop is on. You can find that out in an editor if you already know how. If not, we can use the list command in GDB.

(gdb) list
3       #include 
4
5       void main(int argc, char **argv)
6       {
7
8               int i;
9               int size;
10
11              // print out each command-line argument on a separate line.
12              for(i=0;i<argc;i++)
(gdb) 

Hmmm. That didn't quite list it, but almost. So let's list a few lines later. Type the following:

(gdb) list 15

Check your understanding: What lines of code are printed out with the command list 15?

Looking at the line numbers on my screen, the strcat command occurs on line 17. So that is where I'll put my breakpoint:

(gdb) break 17

Now we are ready to run GDB:

(gdb) run hello why is this broken

This will start the program, give it “hello why is this broken” as the extra command-line arguments, and execute it until the breakpoint is reached:

Starting program: /cs/faculty/franklin/public_html/24/week1/debugmain hello why is this broken
Command-line argument 0: /cs/faculty/franklin/public_html/24/week1/debugmain
Command-line argument 1: hello
Command-line argument 2: why
Command-line argument 3: is
Command-line argument 4: this
Command-line argument 5: broken

Breakpoint 1, main (argc=6, argv=0xbffff4f4) at gdbtutorial.c:17
17                      strcat(argv[i],".txt");
(gdb) 

Let's print out the argv values to see if they're still okay:

(gdb) print argv[0]
$2 = 0xbffff654 "/cs/faculty/franklin/public_html/24/week1/debugmain"
(gdb) print argv[1]
$3 = 0xbffff68d "hello"
(gdb) print argv[2]
$4 = 0xbffff693 "why"
(gdb) print argv[3]
$5 = 0xbffff697 "is"
(gdb) print argv[4]
$6 = 0xbffff69a "this"
(gdb) print argv[5]
$7 = 0xbffff69f "broken"
(gdb) 

Notice that print prints out two things - a hexadecimal number and a string. The hexadecimal number is the value of the pointer - or the address of the beginning of the string. Notice the relationship between the location of argv[1], argv[2], argv[3], etc. What does the memory look like?

So far, so good. Let's continue running until we hit that line again and check again.

(gdb) continue

When it stops, print everything out again.

(gdb) print argv[0]
$8 = 0xbffff654 "/cs/faculty/franklin/public_html/24/labs/lab01/debugmain.txt"
(gdb) print argv[1]
$9 = 0xbffff68d "txt"
(gdb) print argv[2]
$10 = 0xbffff693 "why"
(gdb) print argv[3]
$11 = 0xbffff697 "is"
(gdb) print argv[4]
$12 = 0xbffff69a "this"
(gdb) print argv[5]
$13 = 0xbffff69f "broken"
(gdb) 

Uh, oh. When that .txt was added to the end of argv[0], it also affected argv[1]. Let's look at the addresses. Did argv[1] change location? Nope, it is still the same value as it was when we printed it out before. That means that, somehow, the characters inside argv[1] got changed. If you look closely at the locations, you'll notice that all of the argv strings are located right next to each other, with no extra space in between. Yet we attempted to add ".txt" after the end of the first one. There was no space allocated for it, so it ended up overwriting the second string. The '.' replaced the '\0' that terminated argv[0], so argv[1] now starts at the first 't' of the appended ".txt". That means the first four characters of argv[1] must be 't', 'x', 't', '\0'. But the fifth character should still be the 'o' from hello. Let's see if that is true:

(gdb) print argv[1][4]
$14 = 111 'o'
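The same effect can be reproduced safely in a buffer that we control. The sketch below is only an illustration: the function name demo_overwrite and the strings "main" and "hello" are made up, and the buffer is assumed to be at least 16 bytes so that every write stays inside it. It packs two strings back to back, the way the argv strings were laid out in the GDB session, and then appends ".txt" to the first one.

```c
#include <string.h>

/*
 * Packs "main" and "hello" back to back in buf, then appends ".txt"
 * to the first string. buf must hold at least 16 bytes.
 * Returns a pointer to where the second string began.
 */
char *demo_overwrite(char *buf)
{
    char *second = buf + 5;           /* "main" is 4 chars plus its '\0' */

    memcpy(buf, "main\0hello", 11);   /* both strings and both terminators */
    strcat(buf, ".txt");              /* writes '.', 't', 'x', 't', '\0' at buf[4..8] */

    return second;                    /* now "txt"; the old 'o' is still at second[4] */
}
```

After the call, buf holds "main.txt", the returned pointer reads as "txt", and the character at index 4 of the returned pointer is still 'o', which mirrors what GDB showed for argv[1].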

What did we learn? Two things:

  1. strcat does not allocate space or check to make sure there is enough space for the resulting string.
  2. We need to verify there is enough space or allocate a new string for the result of a strcat operation.

This means our fundamental problem was that we added to a string without allocating space for it. This means that in the second loop, we can't just call strcat. It will take several steps, then, since we can't just make the string bigger. We'll need to allocate additional room.
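One possible way to build a longer string without overwriting its neighbors is sketched below. The helper name concat_new is hypothetical, and returning NULL on bad input is an assumption made for this sketch. The key point is that the new buffer is sized for both pieces plus the terminating '\0' before anything is copied into it.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated string holding base followed by suffix,
 * or NULL if either input is NULL or the allocation fails.
 * The caller owns the result and should free() it.
 */
char *concat_new(const char *base, const char *suffix)
{
    size_t len;
    char *result;

    if (base == NULL || suffix == NULL)
        return NULL;

    len = strlen(base) + strlen(suffix) + 1;   /* +1 for the terminating '\0' */
    result = malloc(len);
    if (result == NULL)
        return NULL;

    strcpy(result, base);      /* safe: result has room for both pieces */
    strcat(result, suffix);
    return result;
}
```

For example, concat_new("hello", ".txt") would yield a fresh string "hello.txt", leaving the original "hello" untouched.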

Step B-4: Fix the Bug

Now you need to change the second for loop so that it executes several lines of code each iteration. There are several ways to do this, depending on which call you use to get the new memory. Here are the steps you would take, assuming you only know how to use the malloc function. You may use the string library to help solve this portion. Detailed steps specifically for the malloc function are below:

If you aren't sure about what one of the aforementioned string commands does, use the man command to read about it.

Note that it seems like there is a step missing. It seems like you should deallocate the old string. It turns out that these strings were not allocated using malloc, so they may not be deallocated using free. In general, you need to be careful about deallocating memory that you didn't allocate. So since you didn't do the initial allocation, you are not responsible for (and could break the program by) deallocating the memory.

Once the program works as intended, you have completed this part.

Part C: Implement Helper Functions

Setup

For this portion of the assignment, you must implement a series of functions in order to make a program that manipulates a virtual deck of cards. This is intensive with dynamic memory allocation, structs, and pointers. Do not hesitate to ask questions - some of this content may be new to you, and this serves as the basis for later materials in this course.

The grading is structured the same way as if you had a real job, and I was your manager. If you came to me saying you had done a lot of work on the program, and it worked partially yesterday, but for today's meeting nothing works, I would be very unimpressed. What you would need to do for the meeting is to make sure that you have something that demonstrates the parts you completed, not the parts you attempted. The only way to receive any credit for a part is for it to demonstrate its functionality during testing. You will receive no credit for a program with lots of code that does not display parts that work. This means that it is especially critical to make sure your program compiles and does not exit early, because then the additional functionality you implemented does not get executed. When you complete each part, make sure you test it thoroughly so that it shows that it works. Then make sure you back it up so that if you make further changes that break your code, you still have something that works. In addition, to receive full credit, you must perform error-checking. So, to receive the most credit, you should follow these principles:

In order to get you used to this good habit, I have structured this assignment to strongly encourage you to do so. If you follow this assignment from beginning to end, testing as you go along, you will maximize your credit for the exam. If, instead, you try to program the whole thing and turn in something that is broken, you will be graded very harshly for both not following directions and not completing the assignment. Something is only complete if it works.

Specifics

In this assignment, you are going to practice the use and allocation of structs. We have two main structs: a card and a deck of cards.

The card has two member variables: a suit and a value. There are four suits: clubs, diamonds, hearts, and spades. As with most things in computer science, it is most efficient to store the suit as a number, not a string. It is especially important for cards because the suits have relative values - some suits are worth more than other suits (when the value is the same). So, in order to perform comparisons between cards, we will assign the following values to the suits: clubs == 0, diamonds == 1, hearts == 2, and spades == 3.

Cards can have numbers or pictures. Like the suit, though, we need to assign a number to every card that corresponds to its relative value. Here is a table of the name of each picture card, the value, and the abbreviation we will use when it prints out. We are printing out a single character for each card. The cards 2 through 9 have 2 through 9 as their name, value, and what gets printed out, so we are omitting them from the table.

Name    Value   Printed as
Ace     1       A
10      10      T
Jack    11      J
Queen   12      Q
King    13      K
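The mapping in the table can be sketched as a small function. The name value_to_char is hypothetical, and returning '?' for a value outside 1 through 13 is an assumption made for this sketch. The provided print_card may well do this differently.

```c
/*
 * Maps a card value (1..13) to the single character used when printing.
 * Returns '?' for any other value (an assumption for this sketch).
 */
char value_to_char(int value)
{
    switch (value) {
    case 1:  return 'A';
    case 10: return 'T';
    case 11: return 'J';
    case 12: return 'Q';
    case 13: return 'K';
    default:
        if (value >= 2 && value <= 9)
            return (char)('0' + value);   /* 2..9 print as their own digit */
        return '?';
    }
}
```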

Step C-1: Implement make_card

This function has three jobs:

  1. Allocate space for one card
  2. Set the values in the card to the values in the input arguments
  3. Return a pointer to the new card so that it may be used outside this function
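The same allocate, fill, and return pattern is sketched below on an unrelated, hypothetical struct named point, so that the shape of the code is visible without reference to the lab's own card type. The names point and make_point are not part of the lab files.

```c
#include <stdlib.h>

/* Hypothetical struct used only for illustration. */
typedef struct {
    int x;
    int y;
} point;

/* Returns a newly allocated point, or NULL if the allocation fails. */
point *make_point(int x, int y)
{
    point *p = malloc(sizeof(point));   /* 1. allocate space for one point */
    if (p == NULL)
        return NULL;

    p->x = x;                           /* 2. copy the arguments into it */
    p->y = y;

    return p;                           /* 3. hand the pointer to the caller */
}
```

Because the memory comes from malloc, it outlives the function call, and the caller is responsible for passing it to free when done.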

Step C-2: Implement compare

This function compares the values of two cards, like < and > do for numbers. For structs, the compare function is more complicated, so we need to write the code for it. In the case of the cards, we first compare the value of the cards. The value determines which is larger. If the values are the same, then the suit is used. The higher value of suit is the higher value of the overall card.

If one input is NULL but the other is not NULL, then consider the one that is not NULL to be larger. If they are both NULL, then return 0.
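A two-level comparison with NULL handling might look like the sketch below, shown on a hypothetical struct named stamp with a primary field (year) and a tie-breaking field (day). The convention of returning a negative number, zero, or a positive number is an assumption here; check the provided files for the return values the lab actually expects.

```c
#include <stddef.h>

/* Hypothetical struct: year is compared first, day breaks ties. */
typedef struct {
    int year;
    int day;
} stamp;

/* Returns -1 if a < b, 0 if they are equal, and 1 if a > b. */
int compare_stamp(const stamp *a, const stamp *b)
{
    if (a == NULL && b == NULL)
        return 0;
    if (a == NULL)
        return -1;               /* the non-NULL side counts as larger */
    if (b == NULL)
        return 1;

    if (a->year != b->year)      /* primary field decides first */
        return (a->year > b->year) ? 1 : -1;
    if (a->day != b->day)        /* tie-breaker */
        return (a->day > b->day) ? 1 : -1;
    return 0;
}
```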

Step C-3: Implement find_index

The find_index function takes a deck that has already been created and looks for a particular card that is in it. Think carefully about the struct in a struct. You receive a deck, and that has a card* in it (treated like an array of cards). You need to look closely at the syntax for members of the deck as well as members of the cards (stored in the card array).
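The syntax for reaching through a pointer to a struct, into an array it holds, and then into one element's member is sketched below. The structs book and shelf and the function find_book are hypothetical stand-ins, not the lab's types.

```c
#include <stddef.h>

/* Hypothetical element type. */
typedef struct {
    int id;
    int pages;
} book;

/* Hypothetical container: a pointer treated as an array, plus its length. */
typedef struct {
    book *books;
    int count;
} shelf;

/* Returns the index of the first book with the given id, or -1 if none. */
int find_book(const shelf *s, int id)
{
    int i;

    if (s == NULL || s->books == NULL)
        return -1;

    for (i = 0; i < s->count; i++) {
        /* '->' goes through the pointer s; '.' selects a member of element i */
        if (s->books[i].id == id)
            return i;
    }
    return -1;
}
```

Note the mix of operators: s is a pointer, so its members are reached with ->, while s->books[i] is a struct value, so its members are reached with a dot.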

Step C-4: Implement make_standard_deck

This function creates an entire deck of cards. A deck of cards has Aces through Kings for all four suits. There are no wildcards or jokers in this deck. There will be 52 cards in the deck.

You need to allocate space two times - once to allocate space for the deck struct, and once to allocate space for the array of cards.
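The two-allocation pattern is sketched below on a hypothetical struct named int_list, which holds a pointer to an array and a count. The names are made up for illustration. The detail worth noticing is that if the second allocation fails, the first one is released before returning, so nothing is leaked.

```c
#include <stdlib.h>

/* Hypothetical container: a struct that owns a separately allocated array. */
typedef struct {
    int *items;
    int count;
} int_list;

/* Returns a list holding 0, 1, ..., count-1, or NULL on bad input or failure. */
int_list *make_int_list(int count)
{
    int_list *list;
    int i;

    if (count <= 0)
        return NULL;

    list = malloc(sizeof(int_list));             /* first allocation: the struct */
    if (list == NULL)
        return NULL;

    list->items = malloc(count * sizeof(int));   /* second allocation: the array */
    if (list->items == NULL) {
        free(list);                              /* do not leak the struct */
        return NULL;
    }

    for (i = 0; i < count; i++)
        list->items[i] = i;
    list->count = count;
    return list;
}
```

Cleanup mirrors the allocation in reverse: free the array first, then the struct that points to it.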

There is a print_card there, but that is already implemented for you. Feel free to look at the implementation to help you understand how to access structs.

Step C-5: Testing

Now you need to compile and test your code. You can compile separately or together. You only need to compile a file you changed. So if you made no changes to problemmain.c, there is no need to recompile it. Below are the commands for separate compilation:

-bash-4.2$ clang -c problemmain.c
-bash-4.2$ clang -c problemhelpers.c
-bash-4.2$ clang problemmain.o problemhelpers.o

To compile everything all at once:

-bash-4.2$ clang problemmain.c problemhelpers.c

Test Cases

The main file does not contain an exhaustive set of test cases. Once you get that very small set of test cases running, then you need to sit back and think about test cases. Remember the ways to generate test cases from the previous lab. Look at each individual function and think about different variations of normal inputs as well as unexpected inputs (NULL pointers, values out of range, suits out of range, etc.). If you receive an unexpected input, DO NOT PRINT ANYTHING OUT. Instead, return NULL from any function that returns a pointer, and -1 from any function that returns an index. What to do for compare is described above, so you should not add any additional validity-checking code to compare. Make sure to add your own test cases - we will be testing for all sorts of error conditions, and your code is expected to work properly in these circumstances. To be clear, if any part of an input is invalid, you should return the appropriate error-signaling value. For example, if a deck contains even a single card with an unexpected suit, then the whole input is considered invalid.

Submitting your Work

First, fill out the required fields in the provided README.txt file. Once you have filled this in, you may submit everything with the following command:

-bash-4.2$ turnin lab2@cs24 README.txt gdbtutorial.c problemhelpers.h problemhelpers.c problemmain.c

Acknowledgements

All content was graciously provided by Professor Diana Franklin, with slight formatting-related adaptation.