Due on Wednesday April 17 (Week 3) at 11:59 PM
Read these instructions before starting your homework and follow them carefully. See the end of this assignment for a checklist of components that your assignment must have at minimum (i.e. to earn at least partial credit). Only submit the items in that list, in the order requested.
Frequently Asked Questions
I’m having trouble rendering to PDF. What can I do?
You could either install the additional software that R asks you to install, or you can render to a Word document instead (change pdf in the top part of the document to docx) and save that document as a PDF.
I don’t know how to insert an image into a Quarto document. How do I do that?
Here is a resource for Quarto. If you rendered your Quarto file to a Word document, you can insert an image into it the same way you would with any other Word document.
Where is the feedback for problem 4?
It is in the Google Sheet.
Part 1. Tasks
You will not need to submit any materials for tasks, but you are expected to complete the material.
Task 1. Beginning steps to configure Git/GitHub
Check that Git is installed on your computer. It likely already is, but check anyway!
- Instructions for Macs
- Instructions for Windows
- If you do not have git installed, follow the instructions to install it.
If you don’t have one already, create a GitHub account.
- Read Jenny Bryan’s Happy Git with R Section 4.1: “Username advice” (for example, An’s GitHub username is an-bui and Caitlin’s is cnordheim-maestas)
- Visit GitHub
- Create an account using your personal email (not your UCSB email, which you will lose access to once you graduate)
Task 2. Read Julia Lowndes’ piece in Scientific American: “Open Software Means Kinder Science”
Part 2. Problems, code, and figures
Carefully read the checklist for the components that you will need to submit for each problem. Show all your work.
You’ll want to set up a directory specifically for this homework assignment within your class directory. For example, if your class directory is called ENVS-193DS, your directory for this homework assignment could be called something like ENVS-193DS_homework-01. Then, you can create a project within the directory for this assignment. Afterward, you can save your code and data into the same directory.
Problem 1. Measures of central tendency and data spread (8 points)
After the major rains this winter, you’re interested in what’s happening with the Pacific treefrog (Pseudacris hypochondriaca) population at North Campus Open Space. You’ve collected the following masses (in grams) for tree frogs in one night of sampling (frogs can be hard to catch!):
\[ 23, 32, 39, 25, 35, 28 \]
- In one sentence, categorize this data set: what type of data did you collect, and why is it that type? (2 points)
- Calculate the sample mean. Express your answer with the correct units. (2 points)
- Calculate the sample variance. Express your answer with the correct units. (2 points)
- Calculate the sample standard deviation. Express your answer with the correct units. (2 points)
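If you choose to show your work in R, the calculations in parts b through d could be sketched as below. This is a minimal example using a made-up vector, not the treefrog masses above; the object name example_masses and its values are assumptions for illustration only. Note that var() and sd() in base R use the sample (n - 1) denominator.

```r
# Hypothetical masses in grams; these are NOT the homework data
example_masses <- c(10, 12, 14)

# Sample mean (units: g)
example_mean <- mean(example_masses)

# Sample variance (units: g^2); var() divides by n - 1
example_var <- var(example_masses)

# Sample standard deviation (units: g); square root of the sample variance
example_sd <- sd(example_masses)

# The same variance computed by hand from its definition
manual_var <- sum((example_masses - example_mean)^2) / (length(example_masses) - 1)
```

Computing the variance by hand alongside var() is one way to confirm that you are using the sample formula rather than the population formula.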
Problem 2. Visualizing data (9 points)
In this problem, you’ll work with data collected by the National Snow and Ice Data Center on glacial mass and sea level rise.
Getting set up:
Read the overview information about the data set from the database.
Download the two data files from Canvas (linked in the Homework 1 assignment page) into your directory (aka folder) for this assignment.
- Data file 1: glacial_volume_loss_copy.csv
- Data file 2: glacial_volume_loss.csv
Open up the two data files and look at them side-by-side. In one sentence, explain how the data files are different. (1 point)
Read the metadata (data/information about the data) in data file 2 (glacial_volume_loss.csv) to understand what each column means.
Coding:
- Make sure your script (R script, Quarto, or RMarkdown document) is in the same folder as your data files.
- Load the tidyverse package.
- Read in data file 1 using read_csv("glacial_volume_loss_copy.csv") and store that as an object named glaciers.
- Create a histogram of annual sea level rise. (4 points)
- Create a scatterplot of cumulative sea level rise through time (year on the x-axis, cumulative sea level rise on the y-axis). (4 points)
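The two plot types above could be sketched roughly as follows. The column names year, annual_rise, and cumulative_rise and the values in example_glaciers are hypothetical stand-ins, not the real columns in the data file; check the metadata in data file 2 for the actual names and units. In your own script you would use the glaciers object created with read_csv() instead of the made-up data frame.

```r
library(tidyverse)

# Hypothetical stand-in for the glaciers object; in the homework,
# glaciers comes from read_csv("glacial_volume_loss_copy.csv")
example_glaciers <- tibble(
  year = 2001:2006,
  annual_rise = c(0.5, 0.7, 0.4, 0.9, 0.6, 0.8),
  cumulative_rise = cumsum(annual_rise)
)

# Histogram: a single numeric variable on the x-axis, counts on the y-axis
hist_plot <- ggplot(data = example_glaciers, aes(x = annual_rise)) +
  geom_histogram(bins = 5) +
  labs(x = "Annual sea level rise (hypothetical units)", y = "Count")

# Scatterplot: year on the x-axis, cumulative rise on the y-axis
scatter_plot <- ggplot(data = example_glaciers, aes(x = year, y = cumulative_rise)) +
  geom_point() +
  labs(x = "Year", y = "Cumulative sea level rise (hypothetical units)")

# Print the plots so they appear in the rendered output
hist_plot
scatter_plot
```

Storing each plot as an object and then printing it makes it easier to annotate your code and to confirm that each figure shows up in your rendered document.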
Problem 3. Personal data (20 points total)
This quarter, you’ll collect data from your own life to see how data science concepts are part of your daily existence. For this homework assignment, you’ll come up with two ideas for data collection. The data you collect:
- has to be something you can get at least 30 observations on by week 10 (e.g. minutes to get from ENVS 193DS to your next class, not number of shark views per week)
- can’t be something you can get from data collection objects (e.g. number of steps in a day)
- has to be something that you could actually remember to write down (e.g. liters of water consumed in a day, not time spent on TikTok)
- has to be shaped by a question (e.g. How much water do I drink in a day?)
- has to include variables that would be appropriate to share with the class
For each idea you have (remember you have to come up with two ideas), you should:
- articulate a question (2 points each)
- describe when you would write that data down (2 points each)
- describe what other variables you think you should measure (2 points each)
- describe what type of data your variables are (2 points each)
- design a data sheet with some example data: what are the columns and what are the rows? (2 points each)
Example:
- Question: How many different types of vegetables do I eat?
- I would take data after every meal.
- Date, time of day, type of vegetable, meal, whether or not it’s cooked, if it’s eaten with other things
- Date: continuous
- time of day: categorical (morning, afternoon, evening)
- meal: categorical (breakfast, lunch, dinner, snack)
- cooked: binary (yes/no)
- eaten with other things: binary (yes/no)
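For the last item (designing a data sheet), one possible layout for the vegetable example is sketched below, with one row per vegetable eaten and one column per variable. The column names and the two rows of values are hypothetical and for illustration only; a data sheet on paper or in a spreadsheet with the same columns would work just as well.

```r
library(tibble)

# Hypothetical rows for illustration only; not real observations
example_sheet <- tribble(
  ~date,        ~time_of_day, ~vegetable, ~meal,    ~cooked, ~with_other_things,
  "2024-04-15", "evening",    "broccoli", "dinner", "yes",   "yes",
  "2024-04-16", "afternoon",  "carrot",   "snack",  "no",    "no"
)

# Each row is one observation; each column is one variable
example_sheet
```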
An will give you feedback and recommendations for what to pursue for this project on Canvas on Thursday the 18th of April. That means that you should be able to start collecting data by the end of week 3, if not sooner.
Problem 4. Setting up statistical critique (6 points)
Throughout the quarter, you’ll engage in a critique of statistical methods for a published paper. Some methods are appropriate for the data and research questions, and some are not. You’ll be the judge.
For this homework assignment, you will find 3 candidate papers for your critique. Find 3 papers that speak to your interests - the paper could be on human health, plant restoration, agroecology, or more. Anything you might be interested in within the realm of environmental studies is fair game. Not all 3 papers have to be on the same topic.
For each paper, read the Abstract to get a general sense of what the paper is about. Then, read the Methods section, looking for information on statistical analysis. A paper is a good choice if it includes one of these terms (or something similar) in the analysis description:
- t-test
- Analysis of variance (ANOVA)
- Mann-Whitney U
- Kruskal-Wallis
- Wilcoxon rank sum
- Linear model or linear regression
- Spearman correlation
- Pearson correlation
- logistic regression
- Generalized linear mixed effect model
Once you’ve verified that your paper includes at least one of the above listed terms, find the digital object identifier (DOI), which is a unique identifier in the form of a URL for a paper. You will know it is a DOI if it has doi.org somewhere in the URL.
Once you find the DOI for your paper, add it to the Google form. Repeat this for all three papers. (3 points)
If you want to see what other people have chosen, see the class responses here.
- In your homework document, list the papers in alphabetical order by author last name. (3 points)
Your citations should take the form:
Last name, first name, et al. Year. “Paper title.” Journal title volume:issue.
Example:
Sanford, E., et al. 2019. “Widespread shifts in the coastal biota of northern California during the 2014–2016 marine heatwaves.” Scientific Reports 9:4216.
Checklist
Your homework should
- Include your name, the title (“Homework 1”), and the date you turned in the assignment (3 points)
- Include responses for Problem 1a-d and full work (hand written or R code) for parts b-d
- Include a written response to Problem 2c and all your code, annotations, and output for Problem 2h-i
- Include 3 citations for Problem 4
- be uploaded to Canvas as a single PDF (1 point)
- be organized and readable (2 points)
Additionally, you should
- Paste the 3 DOIs for the papers you’re interested in into the Google form
49 points total