What is a README?
A README is a file that gives a broad overview of what is in a directory or repository. If you’re using GitHub, the README shows up on the repository page and is a Markdown file (has the suffix .md). A README in a data repository (which you’ve seen on your midterm) is usually a plain text file (.txt).
Why have a README?
You might know how your code and data are organized, but no one else does. By writing an informative README, you let people read about the repository and then explore it to find what they’re looking for. You can also write a README for yourself to keep things organized (i.e., to record where you’ve put things so that you can find them again).
It’s an extra little step that helps future you and anyone else working with your code/data understand how your files are organized and where they come from.
What does a README look like on GitHub?
You can look at this repository to see where the README shows up in a repository. It is one of the first things anyone will see.
You can format your README using headers and subheaders to make it easier to navigate. The button with three lines in the top right of the README will display a table of contents.
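As a rough sketch of what headers and subheaders look like in a Markdown README: each `#` at the start of a line adds one header level. The titles below are placeholders, not names you have to use.

```markdown
# Repository title (placeholder)

## A section header

### A subheader within that section
```

Each header you write this way becomes an entry in the table of contents.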
What should go in a README?
For this class, any README should have at least:
General information
This is where a general description of the repo would go. This could include (but is not limited to):
- names of people working in the repo
- where the data came from
- broad research questions and analyses to address those questions
Data and file overview
This is where a description of the data and files could go. For example, you could describe:
- the data file format, when you accessed the data, etc.
- the different code files and what they contain
Rendered output (specifically for this class, but nice to have in other README files too)
For 193DS assignments, you should put a link to the rendered .html file here so that it is easy to access.
You should have at least a “General information” section, a “Data and file overview” section, and a “Rendered output” section in your README for full credit.
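Putting the three required sections together, a minimal README might look something like the sketch below. This is only one possible layout: the project title, the file names (`data/example-data.csv`, `code/example-analysis.qmd`), and the URL are hypothetical placeholders, so swap in your own.

```markdown
# Project title (placeholder)

## General information

- Contributors: names of the people working in this repo
- Data source: where the data came from
- Research questions: the broad questions and the analyses used to address them

## Data and file overview

- `data/example-data.csv` (hypothetical file): file format and the date you accessed the data
- `code/example-analysis.qmd` (hypothetical file): what this code file contains

## Rendered output

[Rendered .html file](https://example.com/path-to-your-rendered-file.html) (placeholder URL)
```

The section names match the ones listed above, so it is easy to check that nothing required is missing.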
Other sections are nice to have, but usually not necessary for this class:
Methodological information
This is where any information about the methods used to collect, clean, or wrangle the data could go. If you do any cleaning/wrangling outside of the code (for example, directly in the .csv file), then you should describe what you did in this section.
Data-specific information
This is where the metadata would go if you don’t have a metadata file or sheet in your repository.
More information about README files
- UCSB Library Data Service guide to writing a README
- Cornell Data Service guide to writing “readme” style metadata
- Matias Singers’s list of awesome READMEs and README 101
Citation
@online{bui2024,
author = {Bui, An},
title = {Writing an Informative {README}},
date = {2024-05-15},
url = {https://an-bui.github.io/ES-193DS-W23/resources/github-pages.html},
langid = {en}
}