greptile

HW2: greptile 🦎

Submission

We will be using GitHub for distributing and collecting your assignments. At this point you should already have a repository created on your behalf in the cs4157-hw GitHub org. Follow the instructions on the class listserv to get access to that repository.

To obtain the skeleton files that we have provided for you, you need to clone your private repository. Your repository page should have a button titled “< > Code”. Click on it, and under the “Clone” section, select “SSH” and copy the link from there. For example:

$ git clone git@github.com:cs4157-hw/hw2-<id>-<your-team-name>.git

The TAs will use scripts to download, extract, build, and display your code. It is essential that you DO NOT change the names of the skeleton files provided to you. If you deviate from the requirements, the grading scripts will fail and you will not receive credit for your work.

You need to have at least 5 git commits total, but we encourage you to have many more. Your final submission should be pushed to the main branch.

As always, your submission should not contain any binary files.

At a minimum, README.txt should contain the following info:

The description should indicate whether your solution for the part is working or not. You may also want to include anything else you would like to communicate to the grader, such as extra functionality you implemented or how you tried to fix your non-working code.

Answers to written questions, if applicable, must be added to the skeleton file we have provided.

Overview

In this assignment, you will implement a multi-threaded grep-like utility called greptile. The utility will recursively search a directory for files that contain a given string. In Part 1, you will implement the core functionality of greptile in a single-threaded manner. In Part 2, you will parallelize the search using POSIX threads, creating a utility similar to ripgrep (a.k.a. rg).

Your program must compile with no warnings, and should not produce any memory leaks or errors when run under valgrind. This requirement applies to all parts of the assignment.

Part 1: Single-threaded greptile 🐍

Reading

Requirements

Testing

You should test your implementation of greptile against a directory containing a large number of subdirectories and files. We have provided a copy of the Linux kernel source code in the /opt/asp directory on SPOC. You should use this directory to test your program, and are encouraged to test your program with other directories as well. You can use the provided sample executable to verify that your program is working correctly.

Deliverables

Part 2: Multi-threaded greptile 🦖

Requirements

Deliverables

Part 3: Performance Evaluation (Optional) 🐢

Reading

Tasks

This part is optional and will not be graded.

In this part, you will evaluate the performance of your greptile program, using two different implementations. In Part 1, you used malloc() to allocate a buffer for reading the file. In this part, you will use the mmap() function to create a file-backed mapping instead.

This part is purposefully left open-ended to encourage you to think about performance and to experiment with different tools and techniques.


Acknowledgments

Tal Zussman implemented the solution for this assignment in Spring 2024.


Last Updated: 2024-09-25