malloctopus

HW1: malloctopus šŸ™

Submission

We will be using GitHub for distributing and collecting your assignments. At this point you should already have a repository created on your behalf in the cs4157-hw GitHub org. Follow the instructions on the class listserv to get access to that repository. Note that if you submitted the group form late, you will not have a repo yet - please reach out to the TA listserv in that case. You will have to wait until a little after the assignment has been released, but please feel free to work on it independently.

To obtain the skeleton files that we have provided for you, you need to clone your private repository. Your repository page should have a button titled ā€œ< > Codeā€. Click on it, and under the ā€œCloneā€ section, select ā€œSSHā€ and copy the link from there. For example:

$ git clone git@github.com:cs4157-hw/hw1-<id>-<your-team-name>.git

The TAs will use scripts to download, extract, build, and display your code. It is essential that you DO NOT change the names of the skeleton files provided to you. If you deviate from the requirements, the grading scripts will fail and you will not receive credit for your work.

You need to have at least 5 git commits total, but we encourage you to have many more. Your final submission should be pushed to the main branch.

As always, your submission should not contain any binary files.

At a minimum, README.txt should contain the following info:

The description should indicate whether your solution for the part is working or not. You may also want to include anything else you would like to communicate to the grader, such as extra functionality you implemented or how you tried to fix your non-working code.

Answers to written questions, if applicable, must be added to the skeleton file we have provided.

Overview

Each part of this assignment will iterate on the implementation of a memory allocator library with the following API:

Part 1: Bump Allocator

Reading

Concept

To warm up, letā€™s consider an incredibly simple implementation: a bump allocator. The underlying algorithm maintains a pointer to the next free byte on the heap. Each time malloc() gets called, the allocator returns the pointerā€™s current value and then advances the pointer by the allocated blockā€™s size. If thereā€™s not enough space in the heap to serve an allocation request, the heap is extended by calling the sbrk() system call. free() is implemented as a no-op, which means that a program that uses a bump allocator will simply keep increasing heap memory, despite ā€œfreeingā€ blocks it doesnā€™t need anymore.

libmem

To facilitate testing, your code for all parts of this assignment should not call sbrk() directly. Weā€™ve adapted CSAPPā€™s mem_sbrk() library function which allocates memory from a preallocated pool. This allows us to place a limit on the maximum heap size to test failures. In your allocator implementation, use mem_sbrk() instead of sbrk().

This library is initialized and deinitialized using mem_init() and mem_deinit(), which are called by mm_init() and mm_deinit(), respectively.

Tasks

Complete the mm_malloc() function to finish the implementation of the bump allocator.

Here are some requirements:

Part 2: Implicit Free List Allocator

Reading

Tasks

While simple and fast, the bump allocator is not a realistic implementation for a general purpose memory allocator. In this part, we will study CSAPPā€™s implementation of a 32-bit implicit free list allocator in detail, and port it to a 64-bit implementation. Here are the tasks for this part:

  1. The csapp/ directory contains the textbookā€™s 32-bit implementation, as described in CSAPP 9.9.12. We made minor modifications to make the code compile with our libmem and included a simple test driver for you to use. Study the code and play around with it in this directory.
  2. Your 64-bit port of the textbookā€™s code will be implemented in part2/mm.c. Your job is to complete the implementations of all the functions in the file.
    • CSAPPā€™s 32-bit implementation aligns block payloads to 8-byte boundaries. Your 64-bit port should ensure block payloads are aligned to 16-byte boundaries.
    • Instead of using macros, we have provided the following structure definitions and function prototypes:

      typedef struct {
         // Note that this diverges from CSAPP's header semantics. Here, the size
         // field stores the full block size in bytes using 60 bits. This is more
         // than sufficient -- Linux uses 48 or 57 bits for virtual addresses. We do
         // not use any part of those 60 bits to store the allocated bit. The three
         // bits after size are reserved for future use and the last bit is the
         // allocated bit.
          uint64_t      size : 60;
          uint64_t    unused :  3;
          uint64_t allocated :  1;
      
          char payload[];
      } header_t;
      
      typedef struct {
          uint64_t      size : 60;
          uint64_t    unused :  3;
          uint64_t allocated :  1;
      } footer_t;
      
      static inline header_t *header(void *payload) {
          // Implement this function
      }
      
      static inline footer_t *footer(void *payload) {
          // Implement this function
      }
      
      static inline void *next_payload(void *payload) {
          // Implement this function
      }
      
      static inline void *prev_payload(void *payload) {
          // Implement this function
      }
      

      They are meant to replace the following macros used by the CSAPP implementation:

      #define PACK(size, alloc)  ((size) | (alloc))
      #define GET(p)       (*(unsigned int *)(p))
      #define PUT(p, val)  (*(unsigned int *)(p) = (val))
      #define GET_SIZE(p)  (GET(p) & ~0x7)
      #define GET_ALLOC(p) (GET(p) & 0x1)
      #define HDRP(bp)       ((char *)(bp) - WSIZE)
      #define FTRP(bp)       ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
      #define NEXT_BLKP(bp)  ((char *)(bp) + GET_SIZE(((char *)(bp) - WSIZE)))
      #define PREV_BLKP(bp)  ((char *)(bp) - GET_SIZE(((char *)(bp) - DSIZE)))
      

      Do not use these macros. You should understand what they do and replace their usage with our structures/functions accordingly. Note that header_t uses a C99 flexible array member to refer to the payload that comes right after the header.

  3. Verify that your allocator works correctly. Write programs using sequences of mm_malloc() and mm_free() to test various parts of your code, including:
    • extend_heap()
    • Placement follows first-fit policy
    • Larger blocks are split after placement
    • Freed blocks are coalesced with neighboring blocks
    • mm_malloc() handles out-of-memory errors gracefully
    • 64-bit alignment

Your test drivers should be tracked in your submission repo; however, we will not be grading your test drivers.

Part 3: Explicit Free List

Reading

Task

The implicit free list uses the block size as a ā€œpointerā€ to the next/previous block in the list. This means that searching for a free block in the list requires going through all blocks in the list, which also includes the allocated blocks. In this part, we construct an explicit free list that only links together free blocks.

Recall from the previous part that the minimum user payload size had to be 16 bytes, resulting in a minimum block size of 32 bytes including the header and footer in order to satisfy 64-bit system data alignment requirements. The 16-byte user payload goes unused if the block is free. We can repurpose these unused bytes to hold prev/next pointers to other free blocks to make searching faster.

We now define header_t and add a couple more helper functions as follows:

typedef struct header {
    uint64_t      size : 60;
    uint64_t    unused :  3;
    uint64_t allocated :  1;

    union {
        // These links overlap with the first 16 bytes of a block's payload.
        struct {
            struct header *fprev, *fnext;
        } links;

        char payload[0];
    };
} header_t;

// ...

static inline void *next_free_payload(void *payload) {
    // Implement this function
}

static inline void *prev_free_payload(void *payload) {
    // Implement this function
}

The fprev and fnext pointers link free blocks into the explicit free list, which may be scattered across the heap. Free list neighbors are not necessarily adjacent in memory.

The union construct in C allows us to refer to the same memory location with multiple names. Here, the links overlap with the first 16 bytes of the blockā€™s payload. When the block is allocated, weā€™ll use the payload field to refer to the start of the payload. When the the block is free, weā€™ll use links.fprev and links.fnext to refer to the free list pointers. We use a zero-sized array instead of C99 flexible array member that we used in part 2 since the latter cannot appear in a union.

With this new definition of header_t, update the allocator implementation to make use of the fprev/fnext pointers to traverse the free list. Youā€™ll also have to place newly freed blocks back on to the free list. For simplicity, you should use a LIFO strategy: add newly freed blocks to the beginning of the free list.

Notes and requirements:

Part 4: Segregated Free Lists

Reading

Task

In this part, we will expand your part 3 implementation to maintain an array of segregated explicit free lists. Each segregated free list will only link free blocks of similiar sizes.

Start by copying your part3/ directory to part4/. Modify your code from part 3 to implement the segregated explicit free lists. Here are some top-level definitions you should use in this part:

#define NUM_SEG_LISTS 8
// Segregated free list bucket sizes:
// 0: 32-63 bytes
// 1: 64-127 bytes
// 2: 128-255 bytes
// 3: 256-511 bytes
// 4: 512-1023 bytes
// 5: 1024-2047 bytes
// 6: 2048-4095 bytes
// 7: 4096+ bytes
static char *seg_listp[NUM_SEG_LISTS] = {0};

// This function returns the index into the array segregated explicit free lists
// corresponding to the given size.
// Example: For 32 <= size < 64, this function returns 0.
static inline int seglist_index(size_t size) {
    // Implement this function
}

In part 3, we used a single prologue block as the sentinel node for the explicit free list. In this part, we expand the prologue of the heap to be 8 blocks, one sentinel node per segregated explicit free list. Each element of the seg_listp array points to the corresponding sentinel nodeā€™s payload, just as heap_listp pointed to the single prologue blockā€™s payload in part 3. Be sure to update mm_init() to account for this change. The initial 4KB free block allocated by mm_init() will go into the free list referred to by seg_listp[7].

When free blocks are coalesced and split, be sure to place the resulting block on to the correct segregated free list.

Your find_fit() search algorithm should be updated to start at the snuggest size class and continue to iterate through larger size classes as needed.

You should also update checkheap() and printblock() to account for the new segregated explicit free lists.


Last Updated: 2024-02-01