CSE333(4-6): Leak

Today, I learnt CSE333 lecture 4 - 6. A concept called Memory Leak is introduced.

Is Memory leak similar to Emotion Leak?

In life, an Emotion Leak may be a type of resource leak that occurs when a Cranial nerve in the brain/mood incorrectly manages emotion allocations in a way that emotion which must be erased is not released.

So unexpected or horrible things will happen.

2020_12_02_1

Lecture 4: The Heap and Structs

Memory Allocation

3 types of memory allocations:

  • statically-allocated (global var)
    • Allocated when program is loaded
    • Deallocated when program exits
  • automatically-allocated (local var)
    • Allocated when function is called
    • Deallocated when function returns
  • Dynamic Allocation, when first 2 are not enough
    • We need memory that persists across multiple function calls but not for the whole lifetime of the program
    • We need more memory than can fit on the stack
    • We need memory whose size is not known in advance

Dynamic case example in pseudo-code:

// this is pseudo-C code
char* ReadFile(char* filename) {
	int size = GetFileSize(filename);
	char* buffer = AllocateMem(size);
	
	ReadFileIntoBuffer(filename, buffer);
	return buffer;
}
  • dynamically-allocated memory in run-time
  • the memory persists until
    • the code explicitly deallocate it (manual memory management)
    • A garbage collector collects it (automatic memory management)
  • C requires to manually manage memory

Aside: NULL

NULL is a memory location that is guaranteed to be invalid.

  • In C on Linux, NULL is 0x0 and an attempt to dereference NULL causes a segmentation fault.
  • Useful as an indicator of an uninitialized (or currently unused) pointer or allocation error
int main(int argc, char** argv) {
	int* p = NULL;
	*p = 1; // causes a segmentation fault
	return 0;
}

Heap-allocated Memory

malloc() and free()

malloc()

Usage:

var = (type*) malloc(size in bytes)

malloc allocates a block of memory of the requested size in the heap:

  • found in stdlib.h
  • Returns a pointer to the first byte of that memory
    • And returns NULL if the memory allocation failed
  • You should assume that the memory initially contains garbage
  • You’ll typically use sizeof to calculate the size you need
// allocate a 10-float array
float* arr = (float*) malloc(10*sizeof(float));
if (arr == NULL) {
return errcode;
}
... // do stuff with arr
calloc()
  • Like malloc, but also zeros out the block of memory in the heap
  • found in stdlib.h
  • Helpful when zero-initialization wanted (but don’t use it to mask bugs – fix those)
  • void *calloc(int n, int size);
recalloc()
  • Reallocate the void pointer p to be n Bytes, and at the same time it will copy the original content to the new allocated space in the heap. If the original space is less than N bytes, it will keep the same.
  • found in stdlib.h
  • void * realloc(void * p,int n);
free()
  • Deallocates the memory pointed-to by the pointer
    • Pointer must point to the first byte of heap-allocated memory (i.e. something previously returned by malloc or calloc)
    • Freed memory becomes eligible for future allocation
    • The bits in the pointer are not changed by calling free
    • Defensive programming: can set pointer to NULL after freeing it
  • Free the space that pointer p points to.
  • found in stdlib.h
  • void free (void * p);
float* arr = (float*) malloc(10*sizeof(float));
if (arr == NULL) return errcode;
... // do stuff with arr
free(arr);
arr = NULL; // OPTIONAL

The heap

The Heap is a large pool of available memory used to hold dynamically-allocated data.

malloc maintains bookkeeping data in the Heap to track allocated blocks

2020_12_02_2

The heap and stack example given in the lecture notes is very good: page 14 - 24 here

Memory Corruption

Example memcorrupt.c:

#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv) {
	int a[2];
	int* b = malloc(2*sizeof(int));
	int* c;

	a[2] = 5; // assign past the end of an array
	b[0] += 2; // assume malloc zeros out memory
	c = b+3; // mess up your pointer arithmetic
	free(&(a[0])); // free something not malloc'ed
	free(b);
	free(b); // double-free the same block
	b[0] = 5; // use a freed (dangling) pointer

	// any many more!
	return 0;
}

Memory leaks

A memory leak occurs when code fails to deallocate dynamically-allocated memory that is no longer used

  • e.g. forget to free malloc-ed block, lose/change pointer to malloc-ed block

What happens: program’s VM footprint will keep growing

  • This might be OK for short-lived program, since all memory is deallocated when program ends

  • Usually has bad repercussions for long-lived programs

    • Might slow down over time (e.g. lead to VM thrashing)
    • Might exhaust all available memory and crash
    • Other programs might get starved of memory
  • It cloud be invisible and accumulative, and hard to detect

structs and typedef

structs

  • simplestruct.c
struct Point {
	float x, y;
};

int main(int argc, char** argv) {
	struct Point p1 = {0.0, 0.0}; // p1 is stack allocated
	struct Point* p1_ptr = &p1;
	p1.x = 1.0;
	p1_ptr->y = 2.0; // equivalent to (*p1_ptr).y = 2.0;
	return 0;
}
  • structassign.c
#include <stdio.h>
	struct Point {
	float x, y;
};
int main(int argc, char** argv) {
	struct Point p1 = {0.0, 2.0};
	struct Point p2 = {4.0, 6.0};
	printf("p1: {%f,%f} p2: {%f,%f}\n", p1.x, p1.y, p2.x, p2.y);
	p2 = p1;
	printf("p1: {%f,%f} p2: {%f,%f}\n", p1.x, p1.y, p2.x, p2.y);
	return 0;
}

2020_12_02_3

typedef

// make "superlong" a synonym for "unsigned long long"
typedef unsigned long long superlong;
// make "str" a synonym for "char*"
typedef char *str;
// make "Point" a synonym for "struct point_st { ... }“
// make "PointPtr" a synonym for "struct point_st*"
typedef struct point_st {
	superlong x;
	superlong y;
} Point, *PointPtr; // similar syntax to "int n, *p;"
Point origin = {0, 0};

Dynamically-allocated Structs

  • malloc and free structs
// a complex number is a + bi
typedef struct complex_st {
	double real; // real component
	double imag; // imaginary component
} Complex, *ComplexPtr;
// note that ComplexPtr is equivalent to Complex*
ComplexPtr AllocComplex(double real, double imag) {
	Complex* retval = (Complex*) malloc(sizeof(Complex));
	if (retval != NULL) {
		retval->real = real;
		retval->imag = imag;
	}
	return retval;
}

Structs as Arguments

  • Structs are passed by value, like everything else in C
    • Entire struct is copied
    • To manipulate a struct argument, pass a pointer instead
void DoubleXBroken(Point p) { p.x *= 2; }       // It's broken.
void DoubleXWorks(PointPtr p) { p->x *= 2; }	// This works!

Returning Structs

  • Often in %rax and %rdx for small structs
  • Often returned in memory for larger structs
  • Value passed: passing a pointer is cheaper and takes less space unless struct is small
  • Field access: indirect accesses through pointers are a bit more expensive and can be harder for compiler to optimize

Lecture 5: Data Sturctures and Modules

Implement Data Structures in C

  • Simple Linked List

manual_list.c

#include <stdio.h>
typedef struct node_st {
	int element;
	struct node_st* next;
} Node;

int main(int argc, char** argv) {
	Node n1, n2;
	n1.element = 1;
	n1.next = &n2;
	n2.element = 2;
	n2.next = NULL;
	return 0;
}

Push Onto list - Memory Leak

#include<stdio.h>
#include<stdlib.h>
#include<assert.h>
typedef struct node_st {
    int element;
    struct node_st* next;
} Node;
Node* Push(Node* head, int e) {
    Node* n = (Node*) malloc(sizeof(Node));
    assert(n != NULL); // crashes if false
    n->element = e;
    n->next = head;
    return n;
}

int main(int argc, char** argv) {
    Node* list = NULL;
    list = Push(list, 1);
    list = Push(list, 2);
    return 0;
}

A (benign) memory leak! Try running with Valgrind:

bash$ gcc –Wall -g –o push_list push_list.c
bash$ valgrind --leakcheck=full ./push_list

Install Valgrind

Valgrind:

brew install valgrind

Try to run in the docker, or if you are using Mojave MacOS, use the commands below instead, follow this solution: upgrade to 10.15.4, and download ‘Xcode’.

Will try to install tomorrow.

$ brew install automake
$ git clone https://github.com/Echelon9/valgrind.git
$ cd valgrind
$ git checkout feature/v3.14/macos-mojave-support-v2
$ ./autogen.sh
$ ./configure --prefix=/where/you/want/it/installed --enable-only64bit
$ make

Multi-file C Programs

C Preprocessor Intro

  • C Header Files
  • C Module Conventions
  • #include and the C Preprocessor

Lecture 6: Final C Details

Header Guards and Preprocessor Tricks

  1. Macros
#define ODD(x) ((x) % 2 != 0)
#define WEIRD(x) x % 2 != 0    // Beware of operator precedence issues 
  1. Conditional Compilation
#ifdef TRACE
#define ENTER(f) printf("Entering %s\n", f);
#define EXIT(f) printf("Exiting %s\n", f);
#else
#define ENTER(f)
#define EXIT(f)
#endif

// print n
void pr(int n) {
	ENTER("pr");
	printf("\n = %d\n", n);
	EXIT("pr");
}

Visibility of Symbols

  1. preprocessor value given in gcc command:
gcc -Wall -g -DTRACE -o ifdef ifdef.c
  1. NDEBUG param will cause assert to expand to empty
gcc -Wall -g -DNDEBUG -o faster useassert.c

Namespace Problem

Q: Is a global variable named ‘counter’ in one C file visible in a different C files in the same program?

extern

  • yes if external linkage
    • ‘counter’ refers to the same var in both files
    • The variable is defined in one file and declared in the other(s)
    • When the program is linked, the symbol resolves to one location
    • Global variable has external linkage by default.
#include <stdio.h>

extern int counter;  // Here, we declare it, and specify external linkage by using the extern specifier
void bar() {
	counter++;
	printf("(b): counter = %d\n", counter);
}

static

  • no if use internal linkage
    • ‘counter’ refers to the diff vars in both files
    • the variable must be defined in each file
    • When the program is linked, the symbol resolves to two locations
    • static(in the global context) restricts a definition to visibility within that file
#include <stdio.h>
static int counter = 1; // We force internal linkage by using the static specifier.
int main(int argc, char** argv) {
	printf("%d\n", counter);
	bar();
	printf("%d\n", counter);
	return 0;
}

Function Visibility

// By using the static specifier, we are indicating
// that foo() should have internal linkage. Other
// .c files cannot see or invoke foo().
static int foo(int x) {
	return x * 3 + 1;
}

Good Practise:

  • Use static to “defend” your globals, and hide your private stuff.
  • Place external declarations in a module’s header file. Header is the public specification.

Haha, congrats, 4th-6th lectures are finished…

2020_12_02_4

References