CSE333(0-3): Systems Programming
Apart from CS106L, CSE333 is another amazing course(@ University of Washington Computer Science & Engineering) for beginners like me.
These days it’s rainy, and everyone works from home. When people work from home, they may want a cat to work with them. However, buying a Cat
is too expensive for poor fabbit T.T
Alternatively, it’s good to learn more things starting with ‘C’, like ‘C’, ‘C++’, or ‘CSE’, which all looks similar to Cat
. š±
CSE333: Systems Programming
Catalog Description
Includes substantial programming experience in languages that expose machine characteristics and low-level data representation (e.g., C and C++); explicit memory management; interacting with operating-system services; and cache-aware programming. Prerequisite: CSE 351.
Compile C Program
Hello World in C
#include <stdio.h> // for printf()
#include <stdlib.h> // for EXIT_SUCCESS
int main(int argc, char** argv) {
printf("Hello, World!\n");
return EXIT_SUCCESS;
}
Compile:
gcc -Wall -g -std=c11 -o hello helloword.c
./hello
Lecture 1: Intro, C Refresher
- Workflow, Variables, Functions
C to Machine Code
C .c file -> Assembly .s file -> Machine Code .o file
Errors and Exceptions (ugly and inelegant)
- C does not have exception handling (no try/catch)
- Errors are returned as integer error codes from functions
segmentation fault
is the āgoodā option
How C is different from Java
- Casting
- Java enforces type safety, C does not
- Arrays
- Not objects, donāt know their own length, no bounds checking
- Memory Management
- Manual (malloc/free), no garbage collection
Primitive Types in C
C99 Extended Integer Types
#include <stdint.h>
void foo(void) {
int8_t a; // exactly 8 bits, signed
int16_t b; // exactly 16 bits, signed
int32_t c; // exactly 32 bits, signed
int64_t d; // exactly 64 bits, signed
uint8_t w; // exactly 8 bits, unsigned
...
}
Basic Data Structures
C does not support objects
- Arrays are contiguous chunks of memory
- Arrays have no methods and do not know their own length
- Can easily run off ends of arrays in C ā
security bugs
!!!
- Strings are null-terminated char arrays
- Strings have no methods, but string.h has helpful utilities
char* x = "hello\n";
- - - - - -- ----
x | h | e | l | l | o | \n | \0 |
- - - - - -- ----
- Structs are the most object-like feature, but are just collections of fields ā no āmethodsā or functions
Function
Function Definitions
Generic format:
returnType fname(type param1, ā¦, type paramN) {
// statements
}
Function Ordering
You shouldnāt call a function that hasnāt been declared yet.
- Reverse Ordering
- Function Declaration
#include <stdio.h>
int sumTo(int); // func prototype
int main(int argc, char** argv) {
printf("sumTo(5) is: %d\n", sumTo(5));
return 0;
}
// sum of integers from 1 to max
int sumTo(int max) {
int i, sum = 0;
for (i = 1; i <= max; i++) {
sum += i;
}
return sum;
}
Function Declaration vs. Definition
- Definition:
- Must be exactly one definition of each thing (no duplicates)
- Declaration:
- function prototype, external variable declaration
- Needs to appear in all files that use that thing
void sumstore(int x, int y, int* dest);
Compiling Multi-file Programs
The linker combines multiple object files plus staticallylinked libraries to produce an executable.
- A library is just a pre-assembled collection of .o files
Lecture 2: Memory, Arrays
OS and Processes
- There are many processes running on OS.
- The OS lets you run multiple applications at once.
- An application runs within an OS “process”.
- The OS timeslices each CPU between runnable processes
Processes and Virtual Memory
The OS gives each process the illusion of its own private memory
- Called the processā address space
- Contains the processā virtual memory, visible only to it (via translation)
- 2^64 bytes on a 64-bit machine
- Virtual Memory: From 0x00..00 to 0xFF..FF, contains code, data, libraries, stack, etc.
Loading
When OS loads a program, it:
- create an address space
- inspects the executable file to see what’s inside
- Lazily copies regions of the file into the right place in the address space
- Does any final linking, relocation, or other needed preparation.
0xFF..FF | Stack <- -> Shared Libraries <- Heap | Read/Write Segment .data, .bss | Read-Only Segment .text, .rodata | 0x00..00
Memory Management
- Local variables on the Stack
- Allocated and freed via calling conventions (push, pop, mov)
- Global and static variables in Data
- Allocated/freed when the process starts/exits
- Dynamically-allocated data on the Heap
- malloc() to request; free() to free, otherwise
memory leak
- malloc() to request; free() to free, otherwise
Stack
- Used to store data associated with function calls
- Compiler-inserted code manages stack frames for you
Stack frame (x86-64) includes:
- Address to return to
- Saved registers
- Based on calling conventions
- Local variables
- Argument build
- Only if > 6 used
%rbp %rsp
ā ā
|-------| Arg 7+ | Rt Addr | Old %rbp | Saved Regs + Local vars | Args 7+ |
|<-Caller frame->|<----------------------Callee Frame-------------------->|
Pointers (type *p
)
Pointers are Variables that store addresses.
- It points to somewhere in the processā virtual address space
&foo
produces the virtual address of foo- Dereference a pointer using the unary * operator
- Access the memory referred to by a pointer
Address Space Layout Randomization
Linux uses address space layout randomization (ASLR) for added security.
- Randomizes:
- Base of stack
- Shared library (mmap) location
- Makes Stack-based buffer overflow attacks tougher
- Makes debugging tougher
- Can be disabled (gdb does this by default)
Arrays
- Definition: Allocates size * (size of type) bytes of contiguous memory
type name[size]
- Size of an array is unknown, not stored anywhere.
sizeof(array)
only works in variable scope of array definition- Initialization:
type name[size] = {val0, ..., valN};
- Array name (by itself) evaluates to the address of the start of the array
Multi-dimensional Arrays
Generic 2D format:
type name[rows][cols] = {{values},ā¦,{values}};
- Still allocates a single, contiguous chunk of memory
- C is row-major
- 2-D arrays normally only useful if size known in advance. Otherwise use dynamically-allocated data and pointers (later)
Arrays as Parameters
- Solution 1: Declare Array Size
int sumAll(int a[5]); // prototype
- Solution 2: Pass Size as Parameter
int sumAll(int a[], int size);
Returning an Array
-
Canāt safely return local arrays from functions
-
Solution: Output Parameter Pass it as an output parameter to copyarray()
void copyArray(int src[], int dst[], int size) { //dst[] is the output parameter
int i;
for (i = 0; i < size; i++) {
dst[i] = src[i];
}
}
Output parameters are common in library functions:
long int strtol(char* str, char** endptr, int base);
int sscanf(char* str, char* format, ...);
Array: Call by value
Technical answer: a T[] array parameter is āpromotedā to a pointer of type T*
, and the pointer is passed by value
-
So it acts like a call-by-reference array (if callee changes the array parameter elements it changes the callerās array)
-
But itās really a call-by-value pointer (the callee can change the pointer parameter to point to something else)
void copyArray(int src[], int dst[], int size) {
int i;
dst = src; // evil!
for (i = 0; i < size; i++) {
dst[i] = src[i]; // copies source array to itself!
}
}
Lecture 3: Pointers
Pointers & Pointer Arithmatic
-
Pointers are typed
- Tells the compiler the size of the data you are pointing to
- Exception: void* is a generic pointer (i.e. a placeholder)
-
Pointer arithmetic is scaled by sizeof
(*p)
- Works nicely for arrays
- Does not work on
void*
, since void doesnāt have a size!
-
Valid pointer arithmetic:
- Add/subtract an integer to a pointer
- Subtract two pointers (within stack frame or malloc block)
- Compare pointers (<, <=, ==, !=, >, >=), including NULL
Endianness
4-byte data 0xa1b2c3d4 at address 0x100
- Big-Endian: a1b2c3d4
- Little-Endian: d4c3b2a1
Pointers and Parameters
C is Call-By-Value:
- Callee receives a local copy of the argument, Register or Stack
- If the callee modifies a parameter, the callerās copy isnāt modified
How to swap: Faking Call-By-Reference in C
Can use pointers to approximate call-by-reference.
Callee still receives a copy of the pointer (i.e. call-by-value), but it can modify something in the callerās scope by dereferencing the pointer parameter.
ā¢ ptr[i] is *(ptr+i)
with pointer arithmetic
Pointers and Arrays
- A pointer can point to an array element
- You can use array indexing notation on pointers
ptr[i]
is*(ptr+i)
with pointer arithmetic ā reference the data i elements forward from ptr
- An array nameās value is the beginning address of the array
- Like a pointer to the first element of array, but canāt change
Array Parameters
- Array parameters are actually passed (by value) as pointers to the first array element
These two codes are the same:
void f(int a[]);
int main( ... ) {
int a[5];
...
f(a);
return 0;
}
with
void f(int* a);
int main( ... ) {
int a[5];
...
f(&a[0]);
return 0;
}
Function Pointers
function name
Generic format:
returnType (* name)(type1, ā¦, typeN)
Function Pointer Example
map() performs operation on each element of an array.
- funcptr dereference
- funcptr definition
- funcptr assignment
C allows you to omit &
on a function parameter and omit *
when calling pointed-to function; both assumed implicitly.
#define LEN 4
int negate(int num) {return -num;}
int square(int num) {return num*num;}
// perform operation pointed to on each array element
void map(int a[], int len, int (* op)(int n)) {
for (int i = 0; i < len; i++) {
a[i] = op(a[i]); // dereference function pointer, implicit funcptr dereference (no * needed)
}
}
int main(int argc, char** argv) {
int arr[LEN] = {-1, 0, 1, 2};
map(arr, LEN, square); // no & needed for func ptr argument
Haha, congrats, the first 3 lectures are finished! Hope that I can have a British Shorthair kitten next year!! š±