Learning objectives

Background

This week, we will continue describing the data from Pasa and add the data from the Bionutrient Institute. We will learn more advanced options in R Markdown and ggplot for data visualization and reproducible analysis. You will work with a combined dataset that includes the Pasa data you explored previously and the full Bionutrient Institute dataset.

Resources

Part 1: Create an R Markdown file for your problem set

Part 2: Set expectations

Take a moment to record your expectations before you begin. Your notes should include the following (3-5 sentences):

Part 3: Data preparation

Part 4: Exploratory graphs

A: Distribution of nutrient density

Create a well-formatted graph that shows the distribution of your assigned nutrient variable in the two datasets (Pasa and Bionutrient Institute).

Your graph should have the following characteristics:

  • Graph type that is appropriate to show the distribution of a single numerical variable
  • Compares Pasa to Bionutrient Institute data on the same scale. The source dataset (Pasa vs. Bionutrient Institute) is recorded in the variable group. The comparison can be accomplished in different ways:
    • Faceting to show Pasa vs. Bionutrient Institute in different panels (using facet_grid())
    • Adjusting other ggplot options for the particular graph (e.g., by mapping group to an aesthetic inside aes())
  • Uses geom_jitter() to show the data behind the distribution
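
As one possible starting point, the sketch below draws a boxplot per dataset on a shared y-axis and overlays the raw points with geom_jitter(). The data frame toy and the column name nutrient are simulated placeholders, not the real problem-set data; only the variable group comes from the assignment. Other graph types for a single numerical variable would also satisfy the requirements.

```r
library(ggplot2)

# Simulated stand-in data. `toy` and `nutrient` are hypothetical names;
# swap in the combined dataset and your assigned nutrient variable.
set.seed(1)
toy <- data.frame(
  group    = rep(c("Pasa", "Bionutrient Institute"), each = 50),
  nutrient = c(rnorm(50, mean = 10, sd = 2), rnorm(50, mean = 12, sd = 3))
)

p_dist <- ggplot(toy, aes(x = group, y = nutrient)) +
  # Hide the boxplot's own outlier points so they are not drawn twice
  geom_boxplot(outlier.shape = NA) +
  # Spread points horizontally only; height = 0 keeps the y values exact
  geom_jitter(width = 0.15, height = 0, alpha = 0.5) +
  labs(
    x = "Dataset",
    y = "Nutrient concentration (placeholder units)",
    title = "Distribution of nutrient density by dataset"
  )

p_dist
```

Because both datasets share one y-axis, differences in center and spread can be read directly from the graph.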

B: Relationship of nutrient density to management

Create a well-formatted graph that compares the distribution of your assigned nutrient variable across levels of ONE management factor of your choice.

Your graph should have the following characteristics:

  • Graph type that is appropriate to compare distribution of a numerical variable across groups
  • Compares Pasa to Bionutrient Institute data on the same scale. The source dataset (Pasa vs. Bionutrient Institute) is recorded in the variable group. The comparison can be accomplished in different ways:
    • Faceting to show Pasa vs. Bionutrient Institute in different panels (using facet_grid())
    • Adjusting other ggplot options for the particular graph (e.g., by mapping group to an aesthetic inside aes())
  • Uses geom_jitter() to show the data behind the distribution
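
The faceting option could look something like the sketch below. The data are simulated, and the management column tillage and its levels are hypothetical; check the data dictionary for the actual management variables. facet_grid() uses fixed scales by default, so the two panels share the same y-axis.

```r
library(ggplot2)

# Simulated stand-in data. `tillage` and `nutrient` are hypothetical
# column names; `group` is the dataset indicator named in the assignment.
set.seed(2)
toy <- data.frame(
  group    = rep(c("Pasa", "Bionutrient Institute"), each = 60),
  tillage  = rep(c("No-till", "Reduced till", "Conventional"), times = 40),
  nutrient = rnorm(120, mean = 10, sd = 2)
)

p_mgmt <- ggplot(toy, aes(x = tillage, y = nutrient)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.15, height = 0, alpha = 0.5) +
  # One panel per dataset, side by side, with a shared y-axis
  facet_grid(. ~ group) +
  labs(
    x = "Tillage practice (hypothetical factor)",
    y = "Nutrient concentration (placeholder units)"
  )

p_mgmt
```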

C: Relationship of nutrient density to soil status

Create a well-formatted graph that shows the relationship between your assigned nutrient variable and ONE metric of soil status of your choice. Measures of soil status include soil organic matter, soil respiration, and soil nutrient concentration. See the data dictionary for variable names and explanations.

Your graph should have the following characteristics:

  • Graph type that is appropriate to visualize the relationship between two numerical variables
  • Shows Pasa vs. Bionutrient Institute in different colors, on the same graph (using color =)
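
A minimal sketch of such a graph, assuming a scatterplot is the chosen graph type. The data are simulated, and soil_organic_matter and nutrient are placeholder column names; the real names are in the data dictionary.

```r
library(ggplot2)

# Simulated stand-in data with hypothetical column names.
set.seed(3)
toy <- data.frame(
  group               = rep(c("Pasa", "Bionutrient Institute"), each = 40),
  soil_organic_matter = runif(80, min = 1, max = 8)
)
toy$nutrient <- 5 + 0.8 * toy$soil_organic_matter + rnorm(80)

# Mapping group to color inside aes() puts both datasets on one graph
p_soil <- ggplot(toy, aes(x = soil_organic_matter, y = nutrient, color = group)) +
  geom_point(alpha = 0.7) +
  labs(
    x = "Soil organic matter (placeholder units)",
    y = "Nutrient concentration (placeholder units)",
    color = "Dataset"
  )

p_soil
```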

D: Relationship of nutrient density to crop variety

Create a well-formatted graph that compares the distribution of your assigned nutrient variable across crop varieties in the dataset (represented by the variable variety).

Your graph should have the following characteristics:

  • Graph type that is appropriate to compare distribution of a numerical variable across groups
  • Compares Pasa to Bionutrient Institute data on the same scale. The source dataset (Pasa vs. Bionutrient Institute) is recorded in the variable group. The comparison can be accomplished in different ways:
    • Faceting to show Pasa vs. Bionutrient Institute in different panels (using facet_grid())
    • Adjusting other ggplot options for the particular graph (e.g., by mapping group to an aesthetic inside aes())
  • Uses geom_jitter() to show the data behind the distribution
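
As an alternative to faceting, the aesthetics option might be sketched as follows: group is mapped to color so that the two datasets appear side by side within each variety. The data are simulated, and nutrient and the variety labels are placeholders. position_jitterdodge() is used here so that the jittered points line up with their dodged boxplots.

```r
library(ggplot2)

# Simulated stand-in data. `nutrient` and the variety labels are
# hypothetical; `group` and `variety` are named in the assignment.
set.seed(4)
toy <- data.frame(
  group    = rep(c("Pasa", "Bionutrient Institute"), each = 45),
  variety  = rep(c("Variety A", "Variety B", "Variety C"), times = 30),
  nutrient = rnorm(90, mean = 10, sd = 2)
)

p_var <- ggplot(toy, aes(x = variety, y = nutrient, color = group)) +
  # Boxplots are dodged by color automatically
  geom_boxplot(outlier.shape = NA) +
  # Jitter points within each dodged box rather than across the whole variety
  geom_jitter(
    position = position_jitterdodge(
      jitter.width = 0.15, jitter.height = 0, dodge.width = 0.75
    ),
    alpha = 0.5
  ) +
  labs(
    x = "Crop variety",
    y = "Nutrient concentration (placeholder units)",
    color = "Dataset"
  )

p_var
```

With many varieties, faceting by group may be easier to read than dodging, so it is worth trying both.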

Part 5: Compare expectations to data

Revisit the expectations you recorded at the beginning. Examine your graphs and consider them in light of your expectations (3-5 sentences).

Submit your problem set!

Knit your R Markdown file using the Knit button at the top of the code editor. This is a good check on whether your analysis is reproducible!

To access your file, navigate to the Files tab in the lower right window. Find the .html file for your problem set and check the box next to it. Navigate to More > Export to download the file. It will likely go to your downloads folder.

Examine the file closely to make sure that it knitted correctly and contains all parts of your problem set. If you need to make revisions, you can simply revise your code and then knit it again. Submit the .html file in the appropriate Moodle dropbox.

