For this lab, you will write a simple implementation of the
This simple memory allocation library will give you a much deeper understanding of pointers and memory management.
Building an allocator can be a bit painful: odds are your favorite C functions call a memory allocator, which means you can’t use them inside your allocator or you’ll end up with never-ending recursion.
Please read the following instructions carefully. There are quite a few important details in the implementation of this lab, and getting them wrong can lead to some very difficult-to-diagnose errors.
xxmalloc_usable_size. Given a pointer to some allocated object, you can round that pointer down to the next lowest multiple of 4096 to find the start of the block. Then, you can treat that address as a block header to find the object size. That only works if all blocks have the same object header structure.
MAP_ANONYMOUS. Make sure you include sys/mman.h.
mmapfunction returns a pointer to the start of the page. If your block header struct is called
block_header_t, you can cast this pointer to a
block_header_t*and write to it.
xxfreehave to work if the pointer is at the very end of the object?
The starter code at this lab is available on GitHub at https://github.com/csc213/malloc.
The starter code includes a simple, somewhat-functional heap implementation in
allocator.c, along with a wrapper that allows you to replace the default allocator with your custom allocator when you run a program.
The “somewhat-functional” heap uses the
mmap system call to request memory from the operating system every time
malloc is called.
This is a terrible allocator!
mmap system call returns pages of memory to the program, meaning we’re requesting at least 4096 bytes of memory for every allocation, even if we only allocate an
In addition to wasting an enormous amount of memory,
free is not implemented so we never actually reclaim this space.
While the starter code implements a very bad allocator, it does work for some programs.
Most notably, it does not work for programs that call
malloc often enough that the operating system runs out of space.
You can run some simple utities like
ls or editors like
vim if you don’t open a large file.
To run a program with your allocator instead of the default allocator, you will need to use the
LD_PRELOAD environment variable.
This is a special environment variable that
ld, the dynamic linker (linker dynamic?), looks at before starting a program.
Any libraries listed in this environment variable will be loaded before system libraries.
That means any functions you implement in a preloaded library will replace the default implementation.
There are some tricks to doing this correctly with a memory allocator (for one, the dynamic linker allocates memory, which creates some odd circular dependencies) so I have included a wrapper from a system called heap layers that takes care of most of this for you.
You just need to implement functions called
Leave the files
To run a program with your custom allocator, use a command line like the following (omit the dollar sign. This is how I show the shell prompt):
$ LD_PRELOAD=./myallocator.so ls
This assumes you’re running the command from the root of your project directory.
Everything should run as it usually does.
To make sure your allocator is actually running, you should break something.
Go to the
xxmalloc function inside
allocator.c and add a line
return (void*)-1; to the beginning of the function.
make, test your allocator again.
This time, the
ls command should fail, most likely with a segmentation fault.
Remove this intentional error now so it doesn’t cause problems later!
You can also run more complex programs like
$ LD_PRELOAD=./myallocator.so vim
The program will probably start up okay, but if you open a large source file you’ll probably run out of memory and
vim will crash.
We’ll fix that later.
Speaking of crashes, you may find it useful to run programs with your allocator inside a debugger.
To do this, you have to start things a little differently.
This session starts the program
ls with your custom allocator inside of gdb.
$ gdb ls ... Lots of gdb startup messages go here ... > set environment LD_PRELOAD=./myallocator.so > run
If you want to start the program with command line options, they go immediately after the
run command (on the same line).
Starting with our Very Bad Allocator™, you will implement what is often called a BiBoP allocator. BiBoP, short for “Big Bag of Pages”, is an old and well-known technique for allocating memory that uses a size-segregated heap. The job of a memory allocator is to request large blocks of memory from the OS, then parcel out smaller pieces of that memory to programs that request it. You could imagine an allocator that cleverly splits large blocks into smaller pieces, called allocated objects, when needed. A clever allocator could then coalesce adjacent freed objects into larger free objects so they can be used to satisfy larger requests in the future. While this approach has its place, it’s often not worth the effort. A size-segregated heap designates each large block of memory to hold objects of a single size. If you need to allocate objects of a different size you have to get them from a different block. This simplifies the allocator, and actually has some nice performance advantages as well. For one, we can store the size of allocated objects once per block instead of next to every object, so there is less wasted space for metadata.
It’s easiest to describe the structure of a BiBoP allocator by explaining how it handles requests.
Our allocator enters the picture when the main program calls
The code in
gnuwrapper.cpp handles some boilerplate, and it then calls
We’re supposed to find a 27 byte space in memory and return it to the program.
Because a BiBoP allocator dedicates an entire block to one size of object, we don’t want to keep blocks around for every odd object size;
instead, our allocator rounds up to the next power of two.
Some systems also require a minimum alignment for allocated memory of 16 bytes, so you should round any smaller requests up to 16.
Now we’re looking for 32 bytes of free memory.
A BiBoP allocator keeps a separate list of blocks (those are the large chunks of memory from the OS) for each object size.
We’ll have eight lists of blocks, one to hold blocks dedicated to 16 byte objects, one for 32 byte objects, one for 64 byte objects, and so on up to 2048 bytes.
We’ll handle larger objects by calling
The beginning of each list of blocks is reachable through a pointer to the beginning of the block.
Keeping an array of these pointers makes it easy to select the appropriate list;
take the log2 of the object size, then subtract log2 of the minimum object size.
This gives you the index of the appropriate block list.
You can compute the log base two of a power of two integer using the
__builtin_ctz function described on the GCC Builtins page.
In fact, you can write all of the rounding and size manipulation rountines for this lab without loops or large numbers of
To receive full credit, you should use reasonable, concise implementations when dealing with array indices or rounding.
Make sure you test your rounding code extensively. This has historically been the most common source of errors later in the lab.
In our case, we take log2(32), which is 5, and subtract the log of our smallest object, log2(16), which is 4.
Looking in the array of block lists, we check index 1.
If this pointer isn’t
NULL it gives us the address of the first block we can use to hold 32-byte objects.
At the beginnng of the first block in the block list, we have a block header.
This structure should hold the size of the objects on this block, the
next pointer that takes us to the next block, and the freelist pointer.
The freelist pointer points to the first free object in this block, which is guaranteed to be 32 bytes because of how the heap is structured.
We’re going to allocate this free block of memory, so we should save it to a temporary, look inside the object to find a pointer to the next free object, then use this next free object as the head of the free list.
What is happening here is that we’ve used the free space to store a linked list to record all the free space our allocator currently has available.
Yes, this is a little mind bending;
I will draw some pictures of this on the board.
If the freelist was
NULL, this block does not have any free objects so we should move on to the next one.
If none of the available blocks have free objects, proceed on to the next step.
If there are no size 32 blocks, we need to make one by asking the OS for memory.
To do this, call
mmap as in the provided starter code to allocate
Once you have a page of memory, you should put a block header at the beginning of this available memory.
Break the remaining memory on the page into 32 byte chunks (or whichever size is appropriate for the block list you are currently looking in).
For simplicity and correctness, you should divide the remaining space into 32 byte aligned chunks;
in general, you should align these chunks to the block’s object size.
If you don’t do this, your allocator may not work and freeing memory will be very difficult.
Add each chunk (a free object) to the block’s freelist.
You should use the available space inside each free object to store a linked list node.
At some point later, the program calls
The code in
gnuwrapper.cpp does some boilerplate again and calls along to
All we have at this point is a pointer to somewhere inside the object.
We don’t know the size of the object, and we don’t know if the pointer is at the beginning of the object or somewhere in the middle.
The first step is to find the size of the object.
You will need to know an object’s size to free it, but you’ll also have to implement
xxmalloc_usable_size for the allocator to work properly.
You can call
xxmalloc_usable_size from inside
xxfree, so you might as well start out implementing this helper function.
Block headers go at the beginnings of pages, and every page starts at a multiple of
Because of this, you can round a pointer down to the next multiple of
PAGE_SIZE to find its header.
Once you have the header you can simply read out the object size from the block header.
Now that we know the size of the object being freed, we need to figure out where it starts (the program may be freeing from an interior pointer). Because you carefully pareceled out free objects aligned to the object size (you did do that, right?) this just requires rounding down to the next multiple of the object size.
Finally, you have pointers the block header and the beginning of the object being freed, so you can just add it to the block’s free list. It’s usually best to reuse freed memory as soon as possible, so add the freed object to the front of the free list.
Before writing any code in your memory allocator, you should implement the computations to round object sizes to the next power of two and select the appropriate block list.
This code is difficult to test inside of your allocator because you cannot call functions like
printf when you are inside of
The only POSIX functions you are allowed to call are the async signal safe functions listed on this page.
You can see that I have added a slight hack to print error messages (sometimes) when things go wrong.
If you set
true you can make exactly one call to a function like
printf that may call
malloc to display an error, but you should plan to quit the program after doing this because the heap will almost certainly be corrupted when
printf frees this memory.
It’s not a great way to debug your program, but sometimes our options are limited when we’re messing with the innards of the runtime or operating system.
I recommend implementing your rounding code in a separate test program to make sure it works, then move it into your allocator once you trust it.
xxmalloc for objects up to 2048 bytes.
You will need the code from part A to choose the appropriate block list, along with new code to allocate new blocks, initialize the new block’s freelist, and to take a free object off the freelist and return it to the program.
We don’t have any code to compute the size of an allocated object yet, but you can assume that any allocated memory is at least 16 bytes.
xxmalloc_usable_size so it always returns 16.
gnuwrapper.cpp file defines
realloc, which calls
xxmalloc_usable_size to check if an object can be resized.
Once you’ve updated
xxmalloc_usable_size to return the conservative lower bound of 16 you should still be able to run some simple programs, even though freeing does not work yet.
If you test a program and it crashes, check to see if
xxmalloc is being asked to allocate more than 2048 bytes at a time (try a gdb breakpoint or use the emergency error message hack included in
If so, you may have to skip ahead to part D before you can test your allocator;
many large object allocations happen as part of program startup and can be difficult to avoid.
Implement the size calculation and freeing process described above.
You should have a complete implementation of
xxmalloc_usable_size after this step.
You may want to implement and test your rounding down procedure in a separate program where you can print test results, instead of testing it live in your allocator.
We don’t want to deal with large objects using our BiBoP structure, so we’ll just fall back to
For any object larger than 2048 bytes, just round the
mmap request up to a multiple of
PAGE_SIZE and return the result to the program.
One difficult issue with large objects is in the
Pages that hold large objects do not have a block header, but we may go looking for one.
A standard technique to determine whether a page has a block header is to add a magic number to the beginning of each BiBoP block.
Add a field to your block header struct and fill it in with a recognizable constant (I particularly like
In your size computation, check for this constant in the block header before returning a value.
If it isn’t there, you know you are looking at a large object instead of a BiBoP block.
If a large object is passed to
xxfree you can simply disregard it.
If you would like to earn some extra credit, you can implement a special large object header that stores the size of the allocated object.
You may need to go looking backwards for this header because large objects could span multiple pages.
To mark the beginning of a large object, use a different magic number.
Once you’ve found this point, you can free the large object back to the OS by calling
There are other, better strategies for supporting large objects, like reserving a dedicated large object area. If you want to attempt one of these alternative strategies I can help you plan a reasonable implementation.
Your final submitted allocator should work on simple command line utilities, as well as with
top, and the
lynx command line web browser.
In addition to testing your allocator wiht some simple programs, you can use the same grading script I will use to assign a portion of your grade on this lab.
You can find this code at https://github.com/csc213/malloc-grader.
To use the grader, run
make in the grader directory, then run the
grader program with
LD_PRELOAD set to use your allocator.
The grader enforces some properties that are specific to this allocator design, so even the standard libc malloc will not receive full credit on this lab.
The requirements for your allocator are:
malloc(24)should return a pointer to 32 bytes of memory that is aligned at an address that is a multiple of 32.
If you find that you are failing some of these tests, make sure your implementation of
xxmalloc_usable_size is working correctly.
This function is an important part of the testing process, so an incorrect implementation can cause many other tests to fail.
As usual, once your lab is complete you should submit your work with a pull request.