Writing an informative README

making sure people know what they’re looking at
Published: May 15, 2024

What is a README?

A README is a file that gives a broad overview of what is in a directory or repository. If you’re using GitHub, the README shows up on the repository page and is a Markdown file (with the suffix .md). A README in a data repository (which you’ve seen on your midterm) is usually a plain text file (.txt).

Why have a README?

You might know how your code and data are organized, but no one else does. When you write an informative README, people can read about the repository and then explore it to find what they’re looking for. You can also write a README for yourself to keep things organized (i.e., to record where you’ve put things so that you can find them again).

It’s an extra little step that future you and anyone else working with your code/data will use to understand how your files are organized and where they come from.

What does a README look like on GitHub?

You can look at this repository to see where the README shows up in a repository. It is one of the first things anyone will see.

You can format your README using headers and subheaders to make it easier to navigate. The button with three lines in the top right of the README will display a table of contents.
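
As a rough illustration, headers in a Markdown README are written with pound signs: one `#` for the top-level header, two for a subheader, three for the next level down. The header names below are placeholders chosen for this sketch, not required names.

```markdown
# Repository title

## General information

### Contributors

## Data and file overview
```

GitHub builds the table of contents from these headers, so short, descriptive header text makes the README easier to navigate.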

What should go in a README?

For this class, any README should have at least:

General information

This is where a general description of the repo would go. This could include (but is not limited to):

  • names of people working in the repo
  • where the data came from
  • broad research questions and analyses to address those questions

Data and file overview

This is where a description of the data and files could go. For example, you could describe:

  • the data file format, when you accessed the data, etc.
  • the different code files and what they contain

Rendered output (specifically for this class, but nice to have in other README files too)

For 193DS assignments, you should put a link to the rendered .html file here so that it is easy to access.

Note

You should have at least a “General information” section, a “Data and file overview” section, and a “Rendered output” section in your README for full credit.
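
A minimal sketch of a README with the three required sections might look like the following. Everything in it is a placeholder assumed for illustration: the title, the file names (`data/example-data.csv`, `code/homework.qmd`), and the URL are hypothetical and should be replaced with your own.

```markdown
# Homework title

## General information

<!-- Placeholder text: swap in your own names, data source, and questions. -->
- Contributors: Your Name
- Data source: where the data came from (citation or link)
- Research question: the broad question and the analyses used to address it

## Data and file overview

<!-- Hypothetical file names: list the files that are actually in your repo. -->
- `data/example-data.csv`: .csv file, accessed on YYYY-MM-DD
- `code/homework.qmd`: code for cleaning, analysis, and figures

## Rendered output

<!-- Hypothetical URL: replace with the link to your own rendered file. -->
[Rendered .html file](https://your-username.github.io/your-repo/code/homework.html)
```

The lines wrapped in `<!-- -->` are comments, which GitHub does not display in the rendered README.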

The following sections are also nice to have, but are usually not necessary for this class:

Sharing and accessing information

This is where any information regarding data/code reuse and access would go. This is mostly relevant if you’re working with your own data or data that your collaborators have collected.

Methodological information

This is where any information about the methods used to collect, clean, or wrangle the data could go. If you do any cleaning/wrangling outside of the code (for example, directly in the .csv file), then you should describe what you did in this section.

Data-specific information

This is where the metadata would go if you don’t have a metadata file or sheet in your repository.

More information about README files

Citation

BibTeX citation:
@online{bui2024,
  author = {Bui, An},
  title = {Writing an Informative {README}},
  date = {2024-05-15},
  url = {https://an-bui.github.io/ES-193DS-W23/resources/github-pages.html},
  langid = {en}
}
For attribution, please cite this work as:
Bui, An. 2024. “Writing an Informative README.” May 15, 2024. https://an-bui.github.io/ES-193DS-W23/resources/github-pages.html.