Harvard CS50 Week4 Review
Welcome to CS50 again. This is Week 4 and I will go through my review of this week’s content.
Lecture
malloc()
is just memory allocate, which will use the memory from the heap. And heap will grow downwards. There is also another section called stack, and function calls and the variables used will be stored in the stack. After the calling, the memory will be freed. Due to this reason, when we pass parameters by value, we just pass the copy and it won’t affect the original value since after the function call, the memory will be freed. To affect the original value, we need to pass the address of the parameters.- Use the idea of point 1, we can understand the why in
scanf()
, we need to pass the address of the variable we want to store the input. That’s because if we don’t pass the address, we can’t change the variable, which means we can’t store our input into the variable. For example,// n won't store the number input int n; scanf("%i", n); // only when you pass in the address can you change the n scanf("%i", &n);
To understand it, we can use pass by reference when we want to change the parameters we passed in the function.
// The swap() that doesn't work void swap(int a, int b) { int tmp = a; a = b; b = tmp; }
// The swap() that works void swap(int *a, int *b) { int tmp = *a; *a = *b; *b = tmp; }
- When we pass the variable in the
scanf()
, the variable must be initialized. For data type likeint
, it’s suffice to just declare the variable. But for data type like pointers, declaration only is not suffice, initialization is also needed. For example// int type int n; scanf("%i", &n); // pointers type - not working char *p; scanf("%s", p); // working char p[4]; scanf("%s", p);
For data type like
int
, it’s okay for you not to initialize since it always has valid space, so we can always rewrite the value in that valid space safely. But for pointers, not initializing is dangerous because what p stores is a garbage value (If you don’t initialize after declaring variables, what they have are all garbage values). This garbage value may represent an invalid address in the memory, so you can not write into that address. - When using
scanf()
to read string, scanf won’t stop, even if you have specified the length of the string. This is dangerous. But whatget_string()
incs50.h
does is to usemalloc()
repeatedly to allocate more and more memory space to get the string. - In the lecture, I really apprecaite Prof Davaid’s explanation about
*
(dereferencing). It can be viewed as “go to”, for exampleint *p, n; n = 1; p = &n; printf("%i", *p);
In line 4,
*p
can be viewed as going to the value store inp
, which is an address and get the value stored in that address.
Section
- A pointer usually stores address. For example,
int n = 1; int *p = &n;
In this code snippet,
p
is the name for the pointer andint *
means the type is a pointer that points toint
. - Some key syntax about pointer:
- type * is a pointer that stores the address of a type
- *x takes a pointer x and gets the value stored at that address
- &x takes x and gets its address
fread()
explanation
It takes four arguments, To where (an address), what size (the size of a chunk), how many (how many chunkcs to read at a time), from where (an address). For example,string filename = argv[1]; FILE *f = fopen(filename, "r"); uint8_t buffer[4]; fread(buffer, 1, 4, f);
In this code snippet, the size of the chunk is one byte, and we want to read 4 chunks at a time from the file and read them to the
uint8_t
array called buffer.
Shorts
Pointers
- In C,
NULL
pointer is the simplest pointer available to us. And theNULL
pointer points to nothing. - When you create a pointer and you don’t set its value immediately, you should always set the value of the pointer to NULL.
- When you dereference a pointer that is NULL, you will get a segmentation fault.
- To define three pointers using one line, you should write
// Correct way int *a, *b, *c; // Wrong way int *a, b, c;
- The size of a pointer is either $4$ or $8$ byte, depending on the system you are using. For example, in a $32$-bit system, the size of a pointer is $4$ byte, and in a $64$-bit system, the size of a pointer is $8$ byte.
Defining Custom Data Types
- In C, we use
typedef
to create a shorthand or rewritten name for data types. It’s structure is below, which is very very important.typedef <old name> <new name>
Dynamic Memory Allocation
- Statically and Dynamically
// statically obtain an integer int x; // dynamically obtain an integer int *px = malloc(sizeof(int));
In this code,
x
is stored on the stack. The pointer variablepx
is also stored on the stack but the memeory it points to is a block of memory on the heap. To further understand this, see this example,// array of floats on the stack float stack_array[x]; // array of floats on the heap float *heap_array = malloc(x * sizeof(float));
- Dynamically-allocated memory is not automatically returned to the system after finishing using it, so we need to use
free()
to free the dynamically-allocated memory. Otherwise, there will be a memory-leak. However, statically-allocated memory doesn’t have this issue since it’s stored on the stack and the memory will be freed after the function call.
Call Stack
- Stack frame
When you call a function, the system sets aside space in memory for that function to do its necessary work.- We frequently call such chunk of memory stack frames or function frames (Not stack only).
- More than one function’s stack frame may exist in memory at a given time.
- For example, if
main()
callsmove()
, which then callsdirection()
, all three functions have open frames. - These frames are arranged in a stack. The frame for the most-recently called function is always on the top of the stack.
- When a new function is called, a new frame is pushed onto the top of the stack and becomes the active frame.
- When a function finishes, its frame is popped off the stack, and the frame below it becomes the active frame again.
- For example, if
File Pointers
fopen()
// Usage FILE *ptr = fopen(<filename>, <operation>) // Operation: read FILE *ptr = fopen("file1.txt", "r"); // Operation: write FILE *ptr = fopen("file2.txt", "w"); // Operation: append FILE *ptr = fopen("file2.txt", "a");
Notice that write means writing from the beginning, which will erase all the existing data in that file. While append means appending at the end of the file, which won’t destroy anything in that file.
fclose()
//Usage fclose(<file pointer>);
fgetc()
// Usage char ch = fgetc(<file pointer>)
Notice that the file pointer must be read mode, otherwise, it will return an error.
- Using
fgetc()
to implementcat
command in Linux, which will print the content of the file.char ch; while ((ch = fgetc(ptr)) != EOF) printf("%c", ch);
- Using
fputc()
// Usage fputc(<character>, <file pointer>)
Notice that the file pointer must be write or append mode, otherwise, it will return an error.
- Using
fputc()
to implementcp
command in Linux, which will copy the content of one file to another file.char ch; while ((ch = fgetc(ptr1)) != EOF) fputc(ch, ptr2);
- Using
fread()
// Usage fread(<buffer>, <size>, <quantity>, <file pointer>) // Example double *arr2 = malloc(sizeof(double) * 80); fread(arr2, sizeof(double), 80, ptr);
- Reads quantity number of size bytes from the file and stores them in the buffer.
- Notice that the file pointer must be read mode, otherwise, it will return an error.
fwrite()
// Usage fwrite(<buffer>, <size>, <quantity>, <file pointer>) // Example double *arr2 = malloc(sizeof(double) * 80); fwrite(arr2, sizeof(double), 80, ptr);
- Writes quantity number of size bytes from the buffer to the file.
- Notice that the file pointer must be write or append mode, otherwise, it will return an error.
Problem Set 4
01 Volume
Things to notice in the problem statement
- The wav file has two parts, one is a 44 byte header, the other is the sample part.
Divide and Conquer
\begin{algorithm} \caption{Volume} \begin{algorithmic} \STATE Copy header from input file to output file \STATE Create a buffer for a single sample \COMMENT{Write updated data to output file} \WHILE{Read a sample from the input file} \STATE Update the sample \STATE Write the updated sample to the output file \ENDWHILE \end{algorithmic} \end{algorithm}
Take-aways
- In
fopen()
andfwrite()
, the opened file “remembers” the number of bytes that were successfully read, such that subsequent calls to this function for stream will return bytes after those already read. fopen()
andfwrite()
will return the number of items read/written.
02 Filter - Easy
Things to notice in the problem statement
- The bmp file is made up of pixels and each pixel is composed of 3 Bytes, one for Red, one for Blue and one for Green.
Divide and Conquer
grayscale()
\begin{algorithm} \caption{Grayscale} \begin{algorithmic} \STATE Loop over all pixels \STATE Take average of red, green and blue \STATE Update pixel values \end{algorithmic} \end{algorithm}
sepia()
\begin{algorithm} \caption{Sepia} \begin{algorithmic} \STATE Loop over all pixels \STATE Compute sepia values \STATE Update pixel with sepia values \end{algorithmic} \end{algorithm}
reflect()
\begin{algorithm} \caption{Reflect} \begin{algorithmic} \STATE Loop over all pixels in the left side \STATE Swap pixels \end{algorithmic} \end{algorithm}
blur()
\begin{algorithm} \caption{Blur} \begin{algorithmic} \STATE Create a copy of image \STATE Loop over all pixels \STATE Compute average of surrounding pixels using copied image \STATE Update pixel with average \end{algorithmic} \end{algorithm}
Useful Snippets
reflect()
// Reflect image horizontally void reflect(int height, int width, RGBTRIPLE image[height][width]) { // Loop over the left side of pixels for (int i = 0; i < height; i++) { for (int j = 0; j < width / 2; j++) { // Swap pixels RGBTRIPLE temp; temp = image[i][j]; image[i][j] = image[i][width - j - 1]; image[i][width - j - 1] = temp; } } return; }
Notice that we only need to iterate over the left side of the pixels. And we cannot let j to be <=
width / 2
since one pixel will be swapped twice as the index starts from 0. Also, the index of the corresponding right pixel is needs another minus 1 fromwidth - j
.blur()
// Blur image void blur(int height, int width, RGBTRIPLE image[height][width]) { // Create a copy of image RGBTRIPLE copy[height][width]; for (int i = 0; i < height; i++) { for (int j = 0; j < width; j++) { copy[i][j] = image[i][j]; } } // Blur for (int i = 0; i < height; i++) { for (int j = 0; j < width; j++) { // Calculate blur double times = 0; double sumRed = 0; double sumBlue = 0; double sumGreen = 0; for (int p = i - 1; p <= i + 1; p++) { for (int q = j - 1; q <= j + 1; q++) { if (p >= 0 && p < height && q >= 0 && q < width) { times++; sumRed += copy[p][q].rgbtRed; sumBlue += copy[p][q].rgbtBlue; sumGreen += copy[p][q].rgbtGreen; } } } double averageRed = round(sumRed / times); double averageBlue = round(sumBlue / times); double averageGreen = round(sumGreen / times); // Update pixels image[i][j].rgbtRed = (int) averageRed; image[i][j].rgbtBlue = (int) averageBlue; image[i][j].rgbtGreen = (int) averageGreen; } } return; }
Notice that the most consice way to calculate the blur is using a nested loop as shown in the code snippet above. Also, since we will update our image, we need to create a copy of the original image to calculate the blur. Otherwise, we may run into the resource conflict.
02 Filter - Hard
Things to notice in the problem statement
- The first three functions (
grayscale()
,reflect()
andblur()
) are the same as the previous question. So, in this part, I will only talk about theedge()
.
Divide and Conquer
edge()
\begin{algorithm} \caption{Edge} \begin{algorithmic} \STATE Create a copy of image \STATE Create a 3x3 array for Gx and Gy \STATE Loop over all pixels \STATE Compute Gx and Gy \STATE Compute G from Gx and Gy \STATE Update pixel with G \end{algorithmic} \end{algorithm}
Useful Snippet
edge()
// Detect edges void edges(int height, int width, RGBTRIPLE image[height][width]) { // Create a copy of image RGBTRIPLE copy[height][width]; for (int i = 0; i < height; i++) { for (int j = 0; j < width; j++) { copy[i][j] = image[i][j]; } } // Create Gx and Gy int Gx[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}}; int Gy[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}}; // Procedure for (int i = 0; i < height; i++) { for (int j = 0; j < width; j++) { // Calculate Gx and Gy double sumGxRed = 0; double sumGyRed = 0; double sumGxBlue = 0; double sumGyBlue = 0; double sumGxGreen = 0; double sumGyGreen = 0; for (int p = i - 1; p <= i + 1; p++) { for (int q = j - 1; q <= j + 1; q++) { // Index for accessing Gx and Gy int x = p - (i - 1); int y = q - (j - 1); // Calculate the sumGx and sumGy rgb if (p >= 0 && p < height && q >= 0 && q < width) { sumGxRed += copy[p][q].rgbtRed * Gx[x][y]; sumGyRed += copy[p][q].rgbtRed * Gy[x][y]; sumGxBlue += copy[p][q].rgbtBlue * Gx[x][y]; sumGyBlue += copy[p][q].rgbtBlue * Gy[x][y]; sumGxGreen += copy[p][q].rgbtGreen * Gx[x][y]; sumGyGreen += copy[p][q].rgbtGreen * Gy[x][y]; } } } double finalRed = round(sqrt(pow(sumGxRed, 2) + pow(sumGyRed, 2))); double finalBlue = round(sqrt(pow(sumGxBlue, 2) + pow(sumGyBlue, 2))); double finalGreen = round(sqrt(pow(sumGxGreen, 2) + pow(sumGyGreen, 2))); // Update pixels image[i][j].rgbtRed = (finalRed > 255.0) ? 255 : (int) finalRed; image[i][j].rgbtBlue = (finalBlue > 255.0) ? 255 : (int) finalBlue; image[i][j].rgbtGreen = (finalGreen > 255.0) ? 255 : (int) finalGreen; } } return; }
Notice that to use initializer to initialize a multi-dimension array, we need to use
{}
to enclose each row.
03 Recover
Things to notice in the problem statement
- Each JPEG starts with a distinct header
- First three bytes:
0xff
,0xd8
,0xff
- Last Byte:
0xe0
to0xef
- First three bytes:
- JPEGs stored back-to-back in memory card
Divide and Conquer
\begin{algorithm} \caption{Recover} \begin{algorithmic} \STATE Open Memory Card \REPEAT \IF{Start of a new JPEG} \IF{First JPEG} \STATE ... \ELSE \STATE Close the previous file \STATE ... \ENDIF \ELSE \IF{Already found a JPEG} \STATE Write the block to the file \ELSE \STATE Continue \ENDIF \ENDIF \UNTIL{End of card} \STATE Close any remaining files \end{algorithmic} \end{algorithm}
Take-aways
- To make sure a byte start with
e
in this question, we can use the idea of bitwise operation to achieve this. In this problem,(buffer[3] & 0xf0) == 0xe0
can serve this purpose very well. - The File pointer (
FILE *
) is also allocated on the heap, so at the end of the program, remember to close the files you have opened. In this problem, don’t forget to close thecard.raw
.