Practice using R Markdown to implement reproducible data analysis in
R
Practice basic steps in loading, checking, and preparing a dataset
for analysis
Correctly apply and interpret a one-sample t-test and associated
confidence intervals
Explore patterns in historical datasets to gain insight into
nutrient decline
Background
This week, we will shift gears to examine the historical evidence for
nutrient decline in more detail. We will re-analyze historical data
(from either Mayer 1997 or Davis et al. 2004) to test whether a decline
in particular nutrients is evident in vegetables. In the process, we
will learn how to conduct and interpret a t-test and its associated
confidence intervals in R.
Part 1: Create an R Markdown file for your problem set
Make sure you are working in the R Project for this week in Posit
Cloud (called Lab_05_Historical-data)
This is important because if I have a question about what you did or
how your code is working, I need to be able to find/access your code on
Posit Cloud
Create a new R Markdown document using the green plus sign in upper
left
Title your document PS 5: Historical data
Navigate to File -> Save As
Save the file as 05_ps_Study_Nutrient.Rmd, replacing Study and
Nutrient with the name of your study (Davis or Mayer) and your
assigned nutrient (e.g. Calcium)
Write your student ID number (not your name) as the
author of the script
Use subheadings to organize your document into the sections shown
below
Use ## for main headings
Use ### for subheadings
Use code chunks to organize your code within each section (for those
sections needing code)
Make sure that every one of your code chunks is named in
the chunk header
Use R Markdown chunk options to make your output more readable. You
are welcome to suppress output for initial data processing steps, but
please show both code and output for all ‘deliverables’ outlined
below.
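As a minimal sketch of the chunk options mentioned above, the code below sets document-wide defaults with knitr. It is meant to sit in a setup chunk near the top of the .Rmd file. The chunk names in the comments (setup, load-data) are only illustrative, not required names.

```r
# Intended for a setup chunk near the top of the .Rmd file.
# The chunk header carries the chunk name, e.g. {r setup, include=FALSE}
library(knitr)

# Document-wide defaults: show code, hide package messages and warnings.
opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE)

# Options in an individual chunk header override these defaults.
# For example, a chunk headed {r load-data, include=FALSE} still runs,
# but neither its code nor its output appears in the knitted file.
```

Setting echo = TRUE as the default keeps code visible for the deliverables, while include=FALSE can be reserved for initial processing chunks whose output you choose to suppress.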
Part 2: Set expectations
Take a moment to record your expectations before you begin. Your
notes should include the following (3-4 sentences):
Do you think nutrient density for your assigned nutrient has
declined in vegetables over time? Why or why not?
Based on your previous answer, what do you expect to be the mean or
median value of the Response Ratio? (= new nutrient
value / old nutrient value)
Based on your previous knowledge, do you have any expectations about
which types of vegetable(s) will be the richest sources of your assigned
nutrient?
Part 3: Analyze historical data
Prepare your dataset
Load libraries you will need for this dataset:
tidyverse and DT
Prepare your assigned historical dataset for analysis (either
Davis_et_al_2004_clean.csv or
Mayer_1997_clean.csv)
Load the dataset
Use filter() to include only your assigned nutrient and
store it as a new data frame.
Use mutate() to create a new variable for the
response ratio. This should be calculated as the newer
value divided by the older value for the same crop.
Use arrange() to order your data frame from high to low
by nutrient concentration. Remember that the minus sign (-) can be used
to adjust the order of sorting.
You should arrange by the variable that represents the most recent
value for nutrient concentration
Check your data and make sure it loaded correctly and that the range
of values for the response ratio appears reasonable.
Note: If you are working with the Mayer 1997 data, please
also filter to only include vegetables! (not
fruits)
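The preparation steps above can be sketched as follows. This is an illustration on a small made-up data frame, not the real dataset: the column names (crop, nutrient, old_value, new_value) and all values are hypothetical, so check the actual column names in your CSV before adapting it. dplyr is one of the packages loaded by tidyverse.

```r
library(dplyr)

# Toy stand-in for the real file. In the problem set you would load your
# assigned CSV instead (for example with read_csv()).
# All column names and values below are hypothetical.
veg <- data.frame(
  crop      = c("Kale", "Carrot", "Spinach", "Kale", "Carrot"),
  nutrient  = c("Calcium", "Calcium", "Calcium", "Iron", "Iron"),
  old_value = c(200, 40, 100, 2.0, 0.8),
  new_value = c(150, 33, 99, 1.5, 0.3)
)

calcium <- veg %>%
  filter(nutrient == "Calcium") %>%                   # keep one nutrient
  mutate(response_ratio = new_value / old_value) %>%  # newer / older
  arrange(-new_value)                                 # high to low, most recent value

calcium

# Quick check that the range of response ratios looks reasonable.
summary(calcium$response_ratio)
```

For the Mayer 1997 data, an additional filter() condition on whichever column distinguishes vegetables from fruits would go in the same pipeline.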
Create a table
Use datatable() to display your filtered dataset,
arranged from high to low nutrient density/concentration by crop
Should be arranged by the most recent variable representing nutrient
concentration
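A minimal sketch of the table step, again using a made-up data frame with hypothetical column names. The rounding step is optional and only there to make the response ratio easier to read.

```r
library(DT)

# Hypothetical prepared data frame; column names and values are assumptions.
calcium <- data.frame(
  crop      = c("Carrot", "Kale", "Spinach"),
  old_value = c(40, 200, 100),
  new_value = c(33, 150, 99)
)
calcium$response_ratio <- calcium$new_value / calcium$old_value

# Order from high to low by the most recent nutrient value.
calcium <- calcium[order(-calcium$new_value), ]

# Build the interactive table and round the ratio to two decimals.
nutrient_table <- datatable(calcium, rownames = FALSE)
nutrient_table <- formatRound(nutrient_table, "response_ratio", digits = 2)
nutrient_table
```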
Test whether nutrients have declined
Your goal now is to conduct a test to determine whether the content of
your assigned nutrient in vegetables has declined.
See Cheat Sheets –> t-tests for more guidance on how to implement
this in R
Your problem set should include the following steps:
Check the normality assumption for the distribution of the response
ratio
This should include a visual inspection and a formal test
If the data are not normal, try a transformation and check
again
Conduct a standard t-test and a Wilcoxon signed rank test to find
out whether there is evidence of a decline in this nutrient in your
dataset
Hint: What should the response ratio be if
nutrients have not declined?
Please use a 95% confidence level, as is standard for the field
Please use a two-sided test, even though our question is mostly
one-sided (explanation
here)
Please use the raw data for both tests, not the transformed
data
Think about which of the two tests is most appropriate, given the
outcome of the normality assessment
Don’t worry if you see the warning: “cannot compute exact p-value”.
There are different ways to compute p-values: an ‘exact’ method that is
more computationally intensive, and an ‘approximate’ method that is
slightly less accurate but easier to compute. In practice, it rarely
matters which you choose so long as your dataset is sufficiently large
(and ours should be okay). If you like, you can turn this warning off by
setting exact = FALSE in wilcox.test().
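The steps above might look roughly like this in base R. This is a sketch under stated assumptions rather than the required solution: the response ratios are made-up values, the Shapiro-Wilk test is one common choice of formal normality test, and the log is one common choice of transformation. The Cheat Sheets remain the reference for the course's preferred approach.

```r
# Hypothetical response ratios (new / old), one value per crop.
response_ratio <- c(0.75, 0.99, 0.825, 0.62, 1.10,
                    0.88, 0.94, 0.71, 1.35, 0.80)

# 1. Normality: visual inspection, then a formal test.
hist(response_ratio, main = "Response ratio", xlab = "New / old")
qqnorm(response_ratio)
qqline(response_ratio)
sw_raw <- shapiro.test(response_ratio)  # H0: data are normally distributed
sw_raw

# If the raw ratios look non-normal, try a transformation and check again.
sw_log <- shapiro.test(log(response_ratio))
sw_log

# 2. Both tests on the raw ratios, two-sided, 95% confidence level.
# mu = 1 encodes "new value equals old value", i.e. no change.
t_res <- t.test(response_ratio, mu = 1,
                alternative = "two.sided", conf.level = 0.95)
w_res <- wilcox.test(response_ratio, mu = 1,
                     alternative = "two.sided", conf.int = TRUE,
                     conf.level = 0.95, exact = FALSE)

# 3. Pieces typically quoted when reporting the results.
t_res$estimate   # sample mean of the response ratio
t_res$conf.int   # 95% confidence interval for the mean
t_res$statistic  # t statistic
t_res$parameter  # degrees of freedom
t_res$p.value
w_res$estimate   # pseudomedian (returned because conf.int = TRUE)
w_res$conf.int
w_res$p.value
```

A confidence interval that excludes 1 corresponds to a two-sided test that rejects "no change" at the 5% level. Whether the interval lies below or above 1 then indicates the direction of the change.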
Interpret your results (2-3 sentences)
Did you find evidence for nutrient decline in your nutrient?
Write two sentences summarizing the results of your test, as you
would in the Results section of a scientific paper. Your summary should
incorporate the results of the test and the confidence interval. Report
whichever test you believe is more appropriate for your data (t-test or
Wilcoxon signed rank test).
Enter your results into the class spreadsheet
See comments on column headers for tips on where to find the
requested information.
What is the relevance of your results for our project with
Pasa?
Submit your problem set!
Knit your R Markdown file using the Knit button at the
top of the code editor. This is a good check on whether your analysis is
reproducible!
To access your file, navigate to the Files tab in the
lower right window. Find the .html file for your problem set and click
the box next to it. Navigate to More –>
Export to download the file. It will likely go to your
downloads folder.
Examine the file closely to make sure that it knitted correctly and
contains all parts of your problem set. If you need to make revisions,
you can simply revise your code and then knit it again. Submit the
.html file in the appropriate Moodle dropbox.