Due on Tuesday June 11 at 11:59 PM
Description
This is part 2 of your final. You must complete all components individually (i.e. not in a group).
If you worked in a group for part 1: create a new Quarto document titled lastname-firstname_final-part-02 in the same GitHub repo that you used for part 1. Do problems 5-7.
If you worked alone for part 1: continue in your Quarto document from part 1. Do problems 5-7.
This part of the final is open communication, open note, open internet, etc.
Problem 5. Affective and exploratory visualizations
Skills you will demonstrate
In this problem, you will demonstrate your ability to communicate about your visualization and give feedback to others. You will also demonstrate your ability to design and execute an appropriate statistical analysis for your data.
Problem
a. Sharing your affective visualization
This is a component you will complete in workshop on Thursday June 6. We will be taking attendance that day. If you attend class, you will receive full credit for this section.
You will present your affective visualization and give feedback to 2-3 other classmates in class via the Google form linked on Canvas.
b. Comparing visualizations
In 8-12 sentences, compare and contrast your affective visualization from Homework 3/workshop, the exploratory visualization you made for Homework 2, and the communicative/exploratory visualization you made for the midterm. Some prompts:
- How are the visualizations different from each other?
- What similarities do you see between all your visualizations?
- What patterns (e.g. differences in means/counts/proportions/medians, trends through time, relationships between variables) do you see in each visualization? Are these different between visualizations? If so, why? If not, why not?
Problem 6. Data analysis
Skills you will demonstrate
In this part of the final, you will analyze your data using the response and predictor variable(s) you have measured. You will demonstrate your ability to design and execute the appropriate analysis for your own data set.
Problem
a. Response and predictor variables
List your response and predictor variable(s) and describe what kind of data (for example: continuous, categorical, ordinal, binary, etc.) each variable is.
b. Articulate a question
Write your “research question”. For example, “what is the effect of ____ on ____?” or “what are the differences in ____ between ____?”
c. Describe what kind of test or model you would use
Describe why that test or model would be appropriate given your response and predictor variables.
You have to use one of the tests/models we learned about in this class. Make sure your response variables are appropriate for those tests!
d. Follow-ups
Describe what kind of effect size, post-hoc, correlation coefficient, prediction calculation, or any other follow-up you would do and why.
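As one illustration of an effect size follow-up, here is a minimal sketch of Cohen's d for two independent groups, using the pooled standard deviation. It is written in Python with only the standard library as an assumption (this prompt does not specify a language), and the data and variable names are hypothetical. Cohen's d is only one option; the right follow-up depends on the test you chose.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical data: hours of sleep on workshop days vs. other days
workshop_days = [6.0, 7.0, 8.0]
other_days = [4.0, 5.0, 6.0]

d = cohens_d(workshop_days, other_days)
print(f"Cohen's d = {d:.2f}")
```

The value is the difference in group means expressed in units of the pooled standard deviation, so it can be read alongside the test result rather than in place of it.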
e. Analyze your data
Do the analysis that you outlined in parts c and d. Include the code and output for all parts. Annotate your code.
f. Visualize your results
Create a visualization that reflects the analysis you ran. For example, if you ran a t-test, you should show means (and some metric of spread or uncertainty). If you ran a Kruskal-Wallis test, you should show medians (and some metric of spread or uncertainty). If you ran a linear model, you should show model predictions with 95% confidence intervals.
For your visualization, be sure to show the underlying data.
Finalize your figure, get rid of visual clutter, and use colors/shapes/other aesthetics as appropriate.
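The summary values that such a figure displays can be sketched as follows: a mean with its standard error for a comparison of means, and a median with quartiles for a comparison of medians. This is a sketch in Python with only the standard library (an assumption, since this prompt does not specify a language); the group names and values are hypothetical, and the plotting step is left to whichever plotting library you use.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev


def summarize(values):
    """Return the summary values a figure would draw on top of the raw data points."""
    n = len(values)
    q1, _, q3 = quantiles(values, n=4)  # quartiles, using the module's default method
    return {
        "n": n,
        "mean": mean(values),
        "se": stdev(values) / sqrt(n),  # standard error of the mean
        "median": median(values),
        "q1": q1,
        "q3": q3,
    }


# Hypothetical data: a response measured in two groups
groups = {"morning": [2, 4, 6, 8], "evening": [1, 3, 5, 7]}
summary = {name: summarize(values) for name, values in groups.items()}

for name, stats in summary.items():
    print(name, stats)
```

Plot the individual observations first and layer these summaries over them, so the reader sees both the underlying data and the statistic your test compares.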
g. Write a methods section
Describe
- what variables you collected
- if you assigned ratings or categories to variables (for example: productivity on a scale of 1-5, tiredness on a scale of 1-10), describe how you determined the rating or category
- how/when you collected your data
h. Write a results section
Describe
- what you found (with the appropriate summary of the statistical tests and sample size)
- your interpretation of your results (with reference to your figure)
Problem 7. Statistical critique
Skills you will demonstrate
By now, you have a sense of what tests are appropriate given a response and predictor variable(s). In this problem, you will critique the choice of statistical analysis that other people (the authors of the paper) have made.
You will be evaluated on the correctness of your critique.
a. Find the methods
Revisit the methods of the paper you chose. If there are multiple methods, choose one that is in the list of methods we cover in this class:
- t-test
- Analysis of variance (ANOVA)
- Mann-Whitney U
- Kruskal-Wallis
- Wilcoxon rank sum
- Linear model or linear regression
- Spearman correlation
- Pearson correlation
- Logistic regression
- Generalized linear mixed effect model
b. Method description
What is the response variable and what is the predictor? If it is a mixed effect model, what are the random effects?
c. Assumption check
What assumptions have to be met for using this method? If it is a generalized linear model, what type of error distribution is appropriate for the response variable?
d. Critique
- Was this an appropriate method for the authors to use? Why or why not? Justify your critique using the response and predictor variable(s) and the experimental design.
- If not, what would be a more appropriate method to use and why? Justify your suggestion using the response and predictor variable and the experimental design.
e. Communication
- How transparent were the authors about their statistical method? Did they report the components that you would expect for a thorough report (not just the p-value, but test statistic, sample size, degrees of freedom, etc.)? If not, what is missing?
- What follow-up tests/statistics (for example, an effect size, correlation, post-hoc, etc.) did the authors calculate or communicate about (e.g. model predictions/coefficients)? Why were these tests/statistics appropriate for the main statistical analysis? If they did not do any follow-up, what would you suggest given their original test?
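To make the reporting components above concrete, here is a minimal sketch of the pieces a thorough report of a two-sample comparison would contain: the test statistic, the degrees of freedom, and the sample sizes. It uses Welch's t-test as an example (an assumption; the paper you chose may use a different method), is written in Python with only the standard library, and uses hypothetical data. The p-value is omitted because the standard library has no Student's t distribution; a statistics package would supply it.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and its Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical data for two independent groups
treated = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [3.0, 4.0, 5.0, 6.0, 7.0]

t, df = welch_t(treated, control)
report = f"Welch's t-test: t({df:.1f}) = {t:.2f}, n = {len(treated)} and {len(control)}"
print(report)
```

When you read the paper, check whether each of these pieces (plus the p-value) is stated, and note which ones you would have to guess.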