Welcome to the practical classes of the final investigation of ENVS278: Marine Ecology Field Studies! For those of you I haven’t met, my name is Ellie and I am a university teacher in marine biology and ecology here in the School of Earth and Environmental Sciences at Liverpool. My own research looks at how an Antarctic seabird, the snow petrel, uses sea ice habitats. For this, I use a combination of biotelemetry (through GPS tracking) and biogeochemistry (for investigating their diet). In general, my interests are in what drives how animals interact with their environment.
At Blakemere Moss, you collected…
We will have two four-hour computer practicals. From these sessions, you will achieve the following Learning Outcomes:
This block of the module will be assessed via a report…
One of the aims of these practical sessions is to familiarise you…
R and RStudio are among the most widely used software tools in ecology. They are free, open-source tools widely used for data analysis, statistics, and geospatial work.
If you are working on a university computer:
R and RStudio should already be installed, so you just need to check which version of R you have.
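To confirm, you can run a one-liner in the console (base R, no packages needed):

```r
# Print the version of R you are running - you want a reasonably
# recent version, e.g. 4.x
R.version.string
```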
If you are not working on a university computer:
To get started, first download R from the Comprehensive R Archive Network (CRAN), which provides the core programming language and statistical environment.
After installing R, download RStudio from Posit. RStudio is an integrated development environment (IDE) that makes it easier to write code, manage projects, visualize data, and install packages.
Together, R and RStudio provide a powerful platform for analysing environmental and spatial datasets used in GIS and environmental science.
TASK: Please read the (short) introductory chapter of R for Data Science - this should only take 5 - 10 minutes and will provide you with a good foundation for the rest of the practical.
There are also some excellent tutorials available on YouTube, and I’ve listed a few of my favourites below:
Specifically for spatial data, I also have a webpage where I collate useful resources (those of you on ENVS255 will be familiar with it):
In today’s session, we have 4 tasks to complete:
For the first section of today’s practical, you will be formatting your data from the field. On Canvas, you will find templates for each of your groups to input your data.
[Figure: the RStudio interface, with the console and other panes labelled]
In this practical, we’re going to be using R Projects. Using R Projects helps keep your work organised, reproducible, and easy to manage. By working within a project, all your scripts, data, and outputs live in one self-contained folder, so you don’t have to worry about file paths breaking when you move between computers or share your work. R Projects also make it easier to integrate with version control tools like Git, helping you track changes and collaborate with others. In general, they encourage a clean workflow where analyses are easier to understand and rerun!
How to set up a new R project:
Within your R project folder, create the following hierarchical file structure:
project/
├── scripts/
├── plots/
└── data/
    ├── counts/
    └── watches/
Your project folder should look something like this (with your chosen name):
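If you prefer, you can build this folder structure from the R console rather than in your computer’s file browser. A sketch, assuming your working directory is already set to your project folder (which it will be if you have opened the R Project):

```r
# Create the folder structure inside the current project folder;
# recursive = TRUE lets dir.create make data/ and its subfolders in one go
dir.create("scripts")
dir.create("plots")
dir.create("data/counts", recursive = TRUE)
dir.create("data/watches", recursive = TRUE)

# Check the folders now exist
list.dirs(".", recursive = TRUE)
```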
To create a new script, go to File > New file > R Script.
Throughout these practicals, we’re going to be emphasising the importance of annotating our code as we go.
# You can use the hash symbol (#) to tell R that a line is a comment rather than code, and use comments to add notes to your R script
Example title:
# ENVS278 Practical 1
# Black-headed Gull Behaviour
# yy/mm/dd
You install R packages using the syntax:
install.packages("package_name")
# install the package
install.packages("tidyverse")
If you want to install several packages at once, you can do so using the following syntax:
# install the packages 'cowplot' and 'ggpubr'
install.packages(c("cowplot", "ggpubr"))
Once a package is installed (which you should only have to do once), you load it into your library at the start of each R session where you need it. Note that you use quotation marks (“”) around the package name when installing a package, but not when loading it into your library.
You load the installed R packages to your library using the syntax:
library(package_name)
# load the package 'tidyverse' into your library
library(tidyverse)
The syntax for loading multiple libraries is not the same as installing multiple packages. If you’re comfortable with R, and would like to try loading multiple libraries at once, the code for this is in the Optional Extras section in next week’s practical.
If you’re getting started with R, the simplest way to load multiple packages is just to manually list them:
# load ggpubr and cowplot to your library
library(ggpubr)
library(cowplot)
NOTE: You should only need to install a package once, but you will have to add it to the library for each R session.
So your first properly annotated R script might look like this:
In the field last week you collected data that can be used to test for inter-rater reliability. This exercise tests the reliability of the ethogram you have built, and makes sure that all observers are following the same definition of each behaviour. That way, no unnecessary bias is introduced by human error.
Last week, each dyad in your group watched the same individual for a set period of time. In a perfect world, both pair members’ assessments of that individual’s behaviour would be identical, but this is rarely the case. The Cohen’s kappa test produces a measure of reliability between two observers on a scale from 0 to 1, where 1 is perfect agreement (i.e. both observers recorded exactly the same results) and 0 is agreement no better than chance.
Traditionally, a threshold Cohen’s kappa score of > 0.8 is used to indicate an acceptable level of agreement. This is the threshold you are aiming for today! On the Canvas page, you will find an example Cohen’s kappa dataset. As a group, format this for each dyad.
Before moving on to the next session, make sure all the dyads within your data collection group have achieved a Cohen’s kappa coefficient of > 0.8.
Load the package for calculating Cohen’s kappa:
# install package
install.packages("psych")
Open it in your library:
# load package
library(psych)
Then load in your data set for comparison:
# open the csv file with your own data (you will need to modify the name of the csv file below)
Kappa_data <- read.csv("intereliability_data/Kappa Example Dataset.csv")
# check that the data set is properly loaded and looks fine
View(Kappa_data)
# calculate Cohen's Kappa
cohen.kappa(Kappa_data) # in the output you are interested in the unweighted kappa row, and the estimate column
## Call: cohen.kappa1(x = x, w = w, n.obs = n.obs, alpha = alpha, levels = levels,
## w.exp = w.exp)
##
## Cohen Kappa and Weighted Kappa correlation coefficients and confidence boundaries
## lower estimate upper
## unweighted kappa 0.72 0.85 0.98
## weighted kappa 0.87 0.93 0.99
##
## Number of subjects = 61
Is the score > 0.8?
Repeat this for every dyad in your group.
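Rather than copy-pasting the block above once per dyad, you could loop over the files. This is just a sketch: the file names below are hypothetical placeholders, so swap in the names of your own dyads’ csv files.

```r
library(psych)

# Hypothetical file names - replace these with your own dyads' csv files
dyad_files <- c("intereliability_data/dyad_1.csv",
                "intereliability_data/dyad_2.csv")

for (f in dyad_files) {
  if (!file.exists(f)) next    # skip any file that isn't there yet
  dat <- read.csv(f)
  k <- cohen.kappa(dat)        # k$kappa holds the unweighted estimate
  cat(f, "unweighted kappa =", round(k$kappa, 2), "\n")
}
```

Each dyad’s unweighted kappa is printed on its own line, so you can see at a glance which pairs have cleared the 0.8 threshold.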
ggplot2 is one of the most popular packages for plotting data in R, offering a plethora of options for customisation. There are many tutorials out there, but I’ve found the best way of thinking about any ggplot plot is the ‘layer cake’ analogy:
install.packages("ggplot2")
library(ggplot2)
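The ‘layer cake’ idea is easiest to see by building a plot one layer at a time. A minimal sketch using R’s built-in `iris` dataset (any dataset would do):

```r
library(ggplot2)

# Layer 1: the data and the aesthetic mappings (a blank canvas)
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width))

# Layer 2: a geometry - here, points coloured by species
p <- p + geom_point(aes(colour = Species))

# Layer 3: axis labels
p <- p + labs(x = "Sepal length (cm)", y = "Sepal width (cm)")

# Layer 4: a theme
p <- p + theme_bw()

p  # printing the object draws the finished plot
```

You would normally chain these together with `+` in a single call, as in the count plot below, but each `+` really is stacking one more layer on the cake.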
The last task in today’s practical is to visualise your count data. There are many ways to visualise data through time, and here I’ve demonstrated one.
Load in your count data:
counts <- read.csv("count_data/count_summary.csv")
Plot it up. (This is just one example - think about how you could visualise your data and what you could highlight. What about the gap in observations?)
# Convert time column
counts$Time <- as.POSIXct(counts$Time, format = "%H:%M")
# Calculate mean
counts$Mean <- rowMeans(counts[, c("A","B","C","D","E")])
# Plot
ggplot(counts, aes(x = Time)) +
  geom_line(aes(y = A, colour = "A")) +
  geom_line(aes(y = B, colour = "B")) +
  geom_line(aes(y = C, colour = "C")) +
  geom_line(aes(y = D, colour = "D")) +
  geom_line(aes(y = E, colour = "E")) +
  geom_line(aes(y = Mean), linetype = "dashed", linewidth = 1) +
  labs(
    title = "Blakemere Moss Colony Counts",
    x = "Time",
    y = "Count",
    colour = "Observer"
  ) +
  theme_bw()
Next week, we’ll examine the time budgets of the gulls relating to the behaviours we observed, and make a map of the study site using R. We’ll also go into the report in more depth. Now enjoy the sun!