Lesson 5 - Biome evolution
Please download the data for this lesson at: Quercus_datasets
A central theme of biogeography is to reconstruct ancestral distributions of biotas over evolutionary time scales to understand the processes by which biotas arose and how they may evolve in the future. Understanding how biodiversity evolved can inform modeling of the past, present and future of that biodiversity and can aid in:
The data most commonly used for reconstructing the ancestral geographical distributions of organisms includes:
Another method for inferring biome evolution is provided by slicing a dated phylogenetic tree at successive time depths.
Davies and Buckley presented a novel framework that explored the diversity of lineages present at different time slices through the phylogenetic tree to gain insight into the evolutionary past of present-day diversity of mammals.
We develop a similar framework in the R package phyloregion to explore the origin of contemporary biogeographic regions in southern Africa, but we successively remove phylogenetic structure in the tree linking the woody plant species in the regional flora to identify the evolutionary depth at which biogeographic regions collapse into each other.
The phylogenetic depth at which biogeographic regions become indistinguishable may indicate the time at which they differentiated historically.
rm(list = ls())
library(terra)
library(phyloregion)
library(ape)
library(Matrix)
myfishnet <- function(mask, res) {
s <- rast(res = res, ext(mask))
crs(s) <- "epsg:4326"
m <- as.polygons(s, dissolve = FALSE)
m$grids <- paste0("v", seq_len(nrow(m)))
m <- m[, "grids"]
m[mask, ]
}
mylong2sparse <- function(x, grids = "grids", species = "species") {
x <- as.data.frame(x)
x <- x[, c(grids, species)]
names(x) <- c("grids", "species")
grids <- factor(as.character(x$grids))
species <- factor(as.character(x$species))
res <- sparseMatrix(as.integer(grids), as.integer(species),
x = 1, dimnames = list(levels(grids), levels(species)))
res
}
u <- "Quercus_datasets/"
g <- vect(paste0(u, "USA_vector_polygon/States_shapefile.shp"))
g <- g[!(g$State_Name %in% c("ALASKA", "HAWAII")), ]
tr <- read.tree(paste0(u,"phylogeny/quercus_phylogeny.tre"))
f <- list.files(paste0(u, "geodata/Quercus"),
full.names = TRUE) |> lapply(vect)
v <- vect(f)
v$binomial <- gsub(" ", "_", v$binomial)
my_grids <- myfishnet(mask = g, res = 1)
j <- relate(v, my_grids, "intersects", pairs = TRUE)
y <- my_grids$grids[j[, 2]]
spp <- v$binomial[j[, 1]]
r <- data.frame(species = spp, grids = y)
y <- mylong2sparse(r)
tmp <- data.frame(grids = row.names(y), abundance = rowSums(y),
richness = rowSums(y > 0))
z <- merge(my_grids, tmp, by = "grids")
mypolys2comm <- function(dat, res = 0.25, pol.grids = NULL, ...){
if (!is.null(pol.grids)) {
m <- pol.grids[, grep("grids", names(pol.grids)), drop = FALSE]
}
else {
m <- myfishnet(mask = ext(dat), res = res)
crs(m) <- "+proj=longlat +datum=WGS84"
}
pj <- "+proj=longlat +datum=WGS84"
m <- suppressWarnings(invisible(project(m, "epsg:4326")))
crs(m) <- pj
j <- relate(dat, m, "intersects", pairs = TRUE)
y <- m$grids[j[, 2]]
spp <- dat$binomial[j[, 1]]
r <- data.frame(species = spp, grids = y)
y <- mylong2sparse(r)
tmp <- data.frame(grids = row.names(y), abundance = rowSums(y),
richness = rowSums(y > 0))
z <- merge(m, tmp, by = "grids")
return(list(comm_dat = y, map = z))
}
z <- mypolys2comm(dat = v, pol.grids = my_grids)
mydat <- z$comm_dat
index <- intersect(tr$tip.label, colnames(mydat))
tr1 <- keep.tip(tr, index)
pbc <- phylobeta(mydat, tr, index.family = "sorensen")
The phyloregion
package provides functions for analysis of compositional turnover (beta diversity) based on widely used dissimilarity indices such as Simpson, Sorensen and Jaccard. The phyloregion’s functions beta_diss
and phylobeta
compute pairwise dissimilarity matrices for large sparse community matrices and phylogenetic trees for taxonomic and phylogenetic turnover respectively. The results are stored as distance objects for downstream analyses.
We will use the the function select_linkage
to contrast eight hierarchical clustering algorithms (including UPGMA) on the (phylogenetic) beta diversity matrix for degree of data distortion using cophenetic correlation coefficient. The
cophenetic correlation coefficient measures how the original pairwise distance matrix is represented by the dendrogram.
y <- select_linkage(pbc[[1]])
barplot(y, horiz = TRUE, las = 1)
The optimal_phyloregion
function computes the ‘elbow’ (also known as ‘knee’), i.e., the point of maximum curvature, to determine the optimal number of clusters that best describes the observed (phylogenetic) beta diversity matrix.
(d <- optimal_phyloregion(pbc[[1]], k=15))
plot(d$df$k, d$df$ev, ylab = "Explained variances",
xlab = "Number of clusters")
lines(d$df$k[order(d$df$k)], d$df$ev[order(d$df$k)], pch = 1)
points(d$optimal$k, d$optimal$ev, pch = 21, bg = "red", cex = 3)
points(d$optimal$k, d$optimal$ev, pch = 21, bg = "red", type = "h")
The function phyloregion
estimates evolutionary distinctiveness
of each phyloregion by computing the mean value of (phylogenetic) beta diversity between a focal phyloregion and all other phyloregions in the study area.
zz <- phyloregion(pbc[[1]], pol = my_grids)
```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) knitr::opts_chunk$set(class.source=’fold-show’) knitr::opts_chunk$set(collapse = TRUE) library...