6 CARE and FAIR in presenting data with dynamic documents
This lab will work through planning and envisioning CARE and FAIR approaches to data in Hawaiʻi as well as visualizing and numerically summarizing the data we will use in this course. You can download the template .qmd file for this lab here:
Note: the YAML header of this file is a little more complex than those we have seen before:
---
title: "CARE and FAIR in presenting data with dynamic documents"
format:
  pdf:
    include-in-header:
      text: |
        \usepackage{newunicodechar}
        \newunicodechar{ʻ}{`}
author:
  - your name here
  - your name here
  - etc
---
You can now see how to cleanly include multiple authors. We also see our first example of customizing PDF output with \usepackage, which is how we load a LaTeX package (the same idea as loading an R package), in this case to support typing the Unicode ʻokina directly.
The template will prompt you to work through the following sections.
6.1 CARE and FAIR considerations in Hawaiʻi
Write out your manaʻo on CARE and FAIR in Hawaiʻi generally using class materials, class readings, and most especially your previously assigned written reflections on those readings. This section might serve as background on the reasons for aspiring to CARE and FAIR principles and any challenges we might face in operationalizing them.
6.1.1 Ideas for making data from Hawaiʻi CARE and FAIR
Using the data we’re working with in this course as a specific example, propose some ideas for making the data comply with CARE and FAIR. Recall that the data sources are the OpenNahele database hosted on Data Dryad and the Hawaiʻi Climate Data Portal hosted through the University of Hawaiʻi. How could these data repositories, and the specific data sets/resources, become more CARE and FAIR?
6.2 Describing OpenNahele and Hawaiʻi Climate Data Portal Data
Let’s read in the data from a URL rather than downloading it repeatedly or referring to data stored in a different project on our local computers.
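As a minimal sketch of the idea, `read.csv()` accepts either a local file path or a URL string, so the same call works for both. The example below uses a temporary file as a stand-in for the remote source so that it runs offline; the column names, values, and the example address in the comment are all made up for illustration and are not the real data locations.

```r
# Stand-in for a remote file: write a tiny CSV to a temporary location.
# Column names and values are invented for illustration only.
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("PlotID,Island,Plot_Area",
    "1,A,1000",
    "2,B,500"),
  tmp
)

# In the lab, `src` would be the data URL as a string, for example
# "https://example.org/OpenNahele.csv" (a hypothetical address).
src <- tmp
dat <- read.csv(src, stringsAsFactors = FALSE)

str(dat)
```

Storing the address in one object such as `src` means the source only needs to be changed in one place if the data move.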
- join all data like we did in the R refresher lab (fine to copy paste over) for subsequent use
- create a plot-level data object like we did in the R refresher lab (fine to copy paste over) for subsequent use
- across the entire data set (not broken up by island or anything else) calculate 25%, 50%, and 75% quantiles for each numeric plot-level variable but not Year or Census or lon or lat. Hints:
  - you can use the quantile function, you will need to set na.rm = TRUE
  - you can make a data.frame by hand if that makes most sense
  - or you can use across in combination with reframe (which is the dplyr approach)
- use a ridgeline plot or stacked histograms to show distributions for all variables relating to species richness, basal area, and abundance by island
- for all figures and tables, include a caption and also refer to the figure/table in narrative text that explains why you are making the visualization/table
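The quantile hints above can be sketched as follows. This is only one possible approach, shown on a toy data frame: the columns `nspp` and `basal_area` and all values are invented, and the real plot-level object will have more variables. The dplyr version runs only if dplyr 1.1.0 or later is installed, since `reframe()` was introduced in that release.

```r
# Toy plot-level data; names and values are invented for illustration.
plots <- data.frame(
  Island     = c("A", "A", "B", "B"),
  Year       = c(2005, 2010, 2012, 2015),
  nspp       = c(3, 7, NA, 12),
  basal_area = c(1.5, 2.5, 4.0, NA)
)

# Numeric variables, minus the ones the lab says to skip
numeric_vars <- names(plots)[sapply(plots, is.numeric)]
vars  <- setdiff(numeric_vars, c("Year", "Census", "lon", "lat"))
probs <- c(0.25, 0.50, 0.75)

# Base R: build the data.frame by hand, one column per variable
q_base <- data.frame(
  prob = probs,
  sapply(plots[vars], quantile, probs = probs, na.rm = TRUE)
)
print(q_base)

# dplyr: across() inside reframe() returns one row per quantile
if (requireNamespace("dplyr", quietly = TRUE) &&
    utils::packageVersion("dplyr") >= "1.1.0") {
  q_dplyr <- dplyr::reframe(
    plots,
    prob = probs,
    dplyr::across(
      dplyr::all_of(vars),
      function(x) quantile(x, probs, na.rm = TRUE)
    )
  )
  print(q_dplyr)
}
```

Both versions give one row per quantile and one column per variable, which is a convenient shape for a captioned table.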
6.3 Relating tree diversity to environmental predictors
- create new plot-level columns for species richness, not proportions, that correct for area (should be, e.g., log(nspp) / log(Plot_Area))
- make a figure for the response of species richness to plot area (as we did in R refresher lab, fine to copy paste)
- make a figure showing, across all islands, proportion of plots with \(\geq 75\%\) native trees (as we did in R refresher lab, fine to copy paste)
- choose three other figures to make where you plot any kind of diversity-related response variable (e.g. species richness, native species richness) against an environmental predictor variable
- for all figures, include a caption and also refer to the figure in narrative text that explains why you are making the visualization
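A minimal sketch of the first few steps, again on invented data: it adds an area-corrected richness column using the log(nspp) / log(Plot_Area) hint, computes the share of plots per island with at least 75% native trees, and draws a simple base R figure. The column `prop_native` is a hypothetical name, not necessarily what the real plot-level object uses. The two `#|` lines are Quarto chunk options that take effect only inside a .qmd code chunk; in plain R they are ordinary comments.

```r
#| label: fig-richness-area
#| fig-cap: "Area-corrected species richness against plot area (toy data)."

# Toy plot-level data; all values are invented for illustration.
plots <- data.frame(
  Island      = c("A", "A", "B", "B"),
  nspp        = c(4, 9, 6, 15),
  Plot_Area   = c(100, 1000, 400, 1000),
  prop_native = c(0.80, 0.50, 0.90, 0.75)  # hypothetical column name
)

# Species richness corrected for plot area
plots$nspp_area <- log(plots$nspp) / log(plots$Plot_Area)

# Share of plots on each island with at least 75% native trees
native_share <- tapply(plots$prop_native >= 0.75, plots$Island, mean)
print(native_share)

# Response of corrected richness to plot area (log-scaled x axis)
plot(
  plots$Plot_Area, plots$nspp_area,
  log  = "x",
  xlab = "Plot area",
  ylab = "log(nspp) / log(Plot_Area)"
)
```

With the label set this way, the narrative text can point to the figure by writing `@fig-richness-area`, which Quarto renders as a numbered cross-reference alongside the caption.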