| Tutor | Hao Ren |
| hao.ren@sydney.edu.au |
- COMP2017 2026 S1 Week 3 Tutorial A
ASCII is a standard table that gives a numeric code to each character. For example, the letter 'A', the digit '0', a space, and many other characters each have a number behind them.
For example:
- '0' = 48
- '1' = 49
- '9' = 57
- 'A' = 65
- 'Z' = 90
- 'a' = 97
- 'z' = 122
- ' ' = 32
- '\0' = 0
So when you write this in C:
char c = 'A';
the computer actually stores a numeric code for that character.
In C, the type char is really a small integer type used to store character codes. That is why characters can be compared and even used in arithmetic expressions.
For example:
printf("%d\n", 'A'); // prints 65 on ASCII-based systems
printf("%d\n", '0'); // prints 48
printf("%d\n", '9'); // prints 57
Tip
'0' and 0 are not the same thing.
- '0' is the character zero
- 0 is the integer zero
- '\0' is the null character, used to mark the end of a string
This matters a lot in strings.
For example, the string "123" is stored in memory like this:
'1' '2' '3' '\0'
That last '\0' tells C where the string ends.
C does not strictly require ASCII on every machine, but on modern systems it is almost always ASCII or an ASCII-compatible encoding such as UTF-8. More importantly, C guarantees that the digit characters '0' through '9' are consecutive, so the trick s[i] - '0' is valid in standard C.
A C string is not a special built-in "string object." It is just a sequence of char values that ends with a null byte, written as '\0'. A string literal like "cat" has a zero byte appended to it by the compiler.
char s[] = "cat";
That is stored like this in memory:
'c' 'a' 't' '\0'
That last '\0' is what tells C where the string ends.
This is why string functions keep scanning characters until they reach '\0'.
Warning
The string literal itself should be treated as read-only. The C standard says that attempting to modify the array for a string literal is undefined behavior.
For your information: https://stackoverflow.com/questions/1455970/cannot-modify-c-string
Good beginner rule:
char s[] = "cat"; // okay to modify s
const char *p = "cat"; // point at a literal, treat as read-only
First, a string ends at '\0', not at the size of the array.
Second, the array holding the string must have enough room for the characters and the final '\0'.
So this is correct:
char name[6] = "Alice";
because "Alice" needs 5 letters plus 1 null terminator.
This is wrong:
char name[5] = "Alice";
because there is no room for '\0'. A C compiler accepts this declaration, but it stores only the five letters, so name is not a valid C string.
You do not need to state the array size when you initialise the array from a string literal like this:
char name[] = "Alice";
Note
A C string is just a char array that ends with '\0', and <string.h> gives you functions to measure it, compare it, copy it, join it, and search it. We will cover more of the C standard library in Tutorial B.
Command line arguments are the words you type after the program name when you run a program in the terminal.
For example:
./cla Echo 419 to Cortana, come in
Here:
- ./cla is the program
- everything after it is a command line argument
So the program receives these arguments:
Echo
419
to
Cortana,
come
in
In C, you usually access them with this form of main:
int main(int argc, char *argv[])
You may also see:
int main(int argc, char **argv)
argc means "argument count". It tells you how many entries argv has, counting the program name.
argv means "argument vector". It is a list of strings.
The important beginner idea is this:
- one C string is a char *
- command line input is many strings
- so argv is a list of char *
- that is why its type is char **
A nice way to picture it is this:
argv
|
+--> argv[0] --> "./cla"
+--> argv[1] --> "Echo"
+--> argv[2] --> "419"
+--> argv[3] --> "to"
+--> argv[4] --> "Cortana,"
+--> argv[5] --> "come"
+--> argv[6] --> "in"
So if the command is:
./cla Echo 419 to Cortana, come in
then:
argc == 7
argv[0] == "./cla"
argv[1] == "Echo"
argv[2] == "419"
argv[3] == "to"
argv[4] == "Cortana,"
argv[5] == "come"
argv[6] == "in"

argv[1] == "Echo"
argv[1][0] == 'E'
argv[1][1] == 'c'
A very important detail: argv[0] is usually the program name. That means the user's actual arguments start at argv[1].
That is why programs often loop like this:
for (i = 1; i < argc; i++) {
    ...
}
Important
Refer to cla.c for the code used in this section.
The program should print all arguments given to it, excluding the program name, with spaces between them.
We start at i = 1 because argv[0] is the program name and we do not want to print it.
For each argument, we print:
printf("%s", argv[i]);
We use %s because each argv[i] is a string.
Then we print a space after it, except for the last one:
if (i < argc - 1) {
    printf(" ");
}
That avoids an extra trailing space at the end.
Finally, we print a newline.
Example run:
$ ./cla Echo 419 to Cortana, come in
Echo 419 to Cortana, come in
Here are some common mistakes to point out.
- The first is printing argv[0] by accident. That prints the program name too, which the exercise does not want.
- The second is using %c instead of %s. argv[i] is a string, not a single character, so %s is correct.
- The third is forgetting that the shell splits arguments on spaces. So:
./cla hello world
gives two arguments:
"hello"
"world"
But:
./cla "hello world"
gives one argument:
"hello world"
Important
Refer to my_atoi.c for the code used in this section.
Let's first look at a simple case and convert the string "2017" into an integer together.
Starting from the beginning of the string, we read the characters one by one. The first character is 2, which represents two thousands, followed by 0 for zero hundreds, 1 for one ten, and 7 for seven ones. Each position is worth 10 times the position to its right, so one idea is to multiply the running total by 10 each time we move to the next digit, and then add that digit.
We know that:
2017 = 2 * 1000 + 0 * 100 + 1 * 10 + 7 * 1
So we can use an integer variable to store the result as we build the number step by step.
For "2017":
- start: n = 0
- read '2' → digit = 2 → n = 0 * 10 + 2 = 2
- read '0' → digit = 0 → n = 2 * 10 + 0 = 20
- read '1' → digit = 1 → n = 20 * 10 + 1 = 201
- read '7' → digit = 7 → n = 201 * 10 + 7 = 2017
However, how can we convert each character into a digit? By considering how characters are encoded in the system, we can use s[i] - '0', where s is the string and i is the current index. This gives us the numeric offset from the character '0', which corresponds exactly to the digit value.
Therefore, we can write:
int atoi(const char s[]) {
    int i = 0;
    int n = 0;
    /* Read digits and build the number */
    while (s[i] != '\0') {
        n = n * 10 + (s[i] - '0');
        i++;
    }
    return n;
}
What happens if we try the following in the main function?
printf("%d\n", atoi("a"));
printf("%d\n", atoi("abc"));
Try it yourself.
printf("%d\n", atoi("abc")); // 5451 on ASCII systems
5451 = 49 * 100 + 50 * 10 + 51
This happens because 'a', 'b', and 'c' have the codes 97, 98, and 99, so subtracting '0' (48) gives 49, 50, and 51 instead of digit values. This is not the behavior we expect, so we need to check that each character is a digit before using it.
Therefore, we can update the while condition to ensure we only process digits:
int atoi(const char s[]) {
    int i = 0;
    int n = 0;
    /* Read digits and build the number */
    while (s[i] >= '0' && s[i] <= '9') {
        n = n * 10 + (s[i] - '0');
        i++;
    }
    return n;
}
What happens if the string starts with a sign (+ or -)? In this case, the while condition is not satisfied, so the entire loop would be skipped. Therefore, we need to add functionality to handle the sign.
We can add a preprocessing step to check for a sign and move the index forward if one is found.
int atoi(const char s[]) {
    int i = 0;
    int sign = 1;
    int n = 0;
    /* Check sign */
    if (s[i] == '-') {
        sign = -1;
        i++;
    } else if (s[i] == '+') {
        i++;
    }
    /* Read digits and build the number */
    while (s[i] >= '0' && s[i] <= '9') {
        n = n * 10 + (s[i] - '0');
        i++;
    }
    return sign * n;
}
Our code looks good so far, but it still cannot handle some "special" characters such as leading whitespace. Therefore, we need one more improvement.
int atoi(const char s[]) {
    int i = 0;
    int sign = 1;
    int n = 0;
    /* Skip leading whitespace */
    while (s[i] == ' ' || s[i] == '\t' || s[i] == '\n') {
        i++;
    }
    /* Check sign */
    if (s[i] == '-') {
        sign = -1;
        i++;
    } else if (s[i] == '+') {
        i++;
    }
    /* Read digits and build the number */
    while (s[i] >= '0' && s[i] <= '9') {
        n = n * 10 + (s[i] - '0');
        i++;
    }
    return sign * n;
}
Important
Refer to reverse.c for the code used in this section.
The task is to read input one whole line at a time, reverse the characters in that line, and print the reversed line. The important detail is that this is a character-by-character reversal, not a word-by-word reversal.
For example,
abc 123 // Input
321 cba // Output
scanf("%s", line) stops reading at the first whitespace character. In C, whitespace includes:
- space ' '
- tab '\t'
- newline '\n'
That is what isspace is about. The function isspace from <ctype.h> checks whether a character is whitespace.
For example:
isspace(' ') // nonzero (true)
isspace('\t') // nonzero (true)
isspace('\n') // nonzero (true)
So if the input is:
abc 123
then scanf("%s", line) reads only:
abc
and stops before the space.
But fgets reads the whole line, including spaces, and usually also keeps the newline character '\n'. That is exactly what we want.
Suppose the input line is:
abc 123
After fgets, the array looks like this:
'a' 'b' 'c' ' ' '1' '2' '3' '\n' '\0'
Two special characters matter here:
- '\n' is the newline at the end of the line
- '\0' is the null terminator that marks the end of the C string
For this exercise, we usually want to reverse the visible part:
abc 123
but keep the newline '\n' at the end.
Otherwise, if we reverse '\n' too, it moves to the front and the output formatting becomes strange.
The easiest way to reverse a string in place is to use two positions:
- one starting at the front
- one starting at the back
Then swap those characters, move inward, and repeat.
For "abc 123":
- swap 'a' and '3'
- swap 'b' and '2'
- swap 'c' and '1'
The middle space is never swapped. Once the two positions meet at it, the loop stops, so it stays where it is.
- Read one line using fgets.
- Find the length of the string.
- If the last real character is '\n', do not reverse that character.
- Set:
  - i to the start of the line
  - j to the last character to reverse
- Swap s[i] and s[j].
- Move i forward and j backward.
- Repeat until i >= j.
- Print the line.
- Keep going until fgets reaches end-of-file.
Important
Refer to my_itoa.c for the code used in this section.
itoa is the mirror image of atoi: it converts an integer to a string.
Note
itoa is commonly taught and sometimes provided by compilers, but it is not a standard C library function. That is why in teaching code it is better to write your own version, such as my_itoa.
For atoi, you convert a digit character into a number like this:
digit = s[i] - '0';
For itoa, you convert a number into a digit character like this:
ch = digit + '0';
itoa is a little trickier than atoi because the easiest way to get the digits of a number is from right to left.
Take 1234:
- 1234 % 10 gives 4
- 1234 / 10 gives 123
- 123 % 10 gives 3
- 12 % 10 gives 2
- 1 % 10 gives 1
So we have a step-by-step plan:
- Remember whether the number is negative.
- Work with its positive magnitude.
- Take the last digit using % 10.
- Convert that digit into a character using + '0'.
- Store that character in the string.
- Remove the last digit using / 10.
- Repeat until there are no digits left.
- Add '-' if the original number was negative.
- Add the string terminator '\0'.
- Reverse the string.
A very important beginner point is that itoa cannot just "return a string" the same easy way atoi returns an int. In C, strings are arrays of characters, so the function needs a character array to write into.
This exercise combines three earlier ideas very neatly: command-line arguments, fgets, and string searching.
When you run
./wordsearch Sushi < restaurants.txt
the shell does two things for your program:
- argv[1] becomes "Sushi", so that is the word you are looking for.
- stdin is no longer the keyboard. Because of < restaurants.txt, standard input now comes from the file. That means you do not need fopen for this version. You can just keep reading from stdin with fgets.
So a basic plan looks like this:
- Read the search word from the command line.
- Read one line at a time with fgets.
- Remove the trailing newline '\n' from the line.
- Try every possible starting position in the line.
- At each position, copy out a chunk the same length as the search word.
- Use string_compare to compare that chunk with the search word.
- If they match, print the line with Found:.
int string_compare(const char s1[], const char s2[]) {
    int i = 0;
    while (s1[i] == s2[i]) {
        if (s1[i] == '\0') {
            return 0; /* equal */
        }
        i++;
    }
    return s1[i] - s2[i];
}

int string_length(const char s[]) {
    int i = 0;
    while (s[i] != '\0') {
        i++;
    }
    return i;
}
Here is the idea behind contains_word.
Suppose the line is:
157 Redfern St, Sushi Topia
and the word is:
Sushi
The program tries the word starting at every position in the line.
First it checks something like:
157 R
Then:
57 Re
Then:
7 Red
and so on.
Eventually it reaches:
Sushi
At that point string_compare(part, word) says they are equal, so the line should be printed.
That is the main search pattern: "try every starting position."
void copy_part(const char src[], int start, int len, char dest[]) {
    int i;
    for (i = 0; i < len && src[start + i] != '\0'; i++) {
        dest[i] = src[start + i];
    }
    dest[i] = '\0';
}

int contains_word(const char line[], const char word[]) {
    int i;
    int word_len = string_length(word);
    char part[MAX_LINE];
    if (word_len == 0) {
        return 1;
    }
    if (word_len >= MAX_LINE) {
        return 0;
    }
    for (i = 0; line[i] != '\0'; i++) {
        copy_part(line, i, word_len, part);
        if (string_compare(part, word) == 0) {
            return 1;
        }
    }
    return 0;
}
Important
Refer to wordsearch.c for the code used in this section.
int main(int argc, char *argv[]) {
    char line[MAX_LINE];
    char *word;
    if (argc != 2) {
        printf("Usage: %s word\n", argv[0]);
        return 1;
    }
    word = argv[1];
    while (fgets(line, sizeof line, stdin) != NULL) {
        trim_newline(line);
        if (contains_word(line, word)) {
            printf("Found: %s\n", line);
        }
    }
    return 0;
}
trim_newline(line) removes the '\n' that fgets usually keeps. That makes printing easier. Without trimming, this:
printf("Found: %s\n", line);
would usually print an extra blank line, because line already ends with '\n'.
void trim_newline(char s[]) {
    int i = 0;
    while (s[i] != '\0') {
        if (s[i] == '\n') {
            s[i] = '\0';
            return;
        }
        i++;
    }
}
A few parts are especially important.
fgets(line, sizeof line, stdin) reads a whole line, including spaces. That is why it works for restaurant addresses and names like "Sushi Topia". If you used scanf("%s", line), it would stop at the first space and break the exercise.
copy_part builds a small temporary string from the line so that string_compare can compare whole strings. That is the key trick that lets you reuse last week’s comparison function.
Warning
A common mistake is using == directly on strings, like this:
if (part == word)
That does not compare the contents of the strings. It only compares addresses. In C, strings must be compared character by character, which is why you need string_compare.
Another classic mistake is forgetting the null terminator in the temporary substring:
dest[i] = '\0';
Without that line, part is not a proper C string, and string_compare may keep reading garbage beyond the intended end.