
Step-by-Step Guide to Building a Species Distribution Model (SDM) in R

Introduction to Species Distribution Models (SDM)

Species Distribution Models (SDMs) predict where species are likely to occur based on environmental variables and species presence-absence data. These models are critical for ecological research, conservation planning, and understanding the impacts of environmental changes on biodiversity.

In this guide, we’ll walk through the groundwork for an SDM in R using simulated data. The code covers generating environmental data, assigning species presence-absence with a simple rule, and visualizing the resulting distribution.

Key Steps in Building an SDM in R

Step 1: Setting Up the Environment

Before diving into modeling, load the required R libraries. Here’s what we’ll use:

library(ggplot2)  # For visualization
library(dplyr)    # For data manipulation
set.seed(42)      # For reproducibility

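ggplot2 draws the plots in Step 4. dplyr is a common choice for data manipulation, although the short script in this guide relies only on base R for that. set.seed(42) is not a library: it fixes the random number generator so that the simulated data are the same on every run.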
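If either package is not installed, library() stops with an error. A minimal sketch for checking availability first, assuming the packages would be installed from CRAN with install.packages() (the variable names here are illustrative):

```r
# Packages this guide loads
required <- c("ggplot2", "dplyr")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "),
          ". Install them with install.packages().")
} else {
  message("All required packages are available.")
}
```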

Step 2: Simulating Data for SDM

To illustrate the workflow, we simulate a dataset of 100 points, each with a latitude, a longitude, and three environmental variables (temperature, precipitation, and elevation). Every variable is drawn independently from a uniform distribution:

n <- 100  # number of simulated sites

# Coordinates and environmental variables, each drawn independently
latitude <- runif(n, min = -90, max = 90)
longitude <- runif(n, min = -180, max = 180)
temperature <- runif(n, min = -10, max = 40)  # Celsius
precipitation <- runif(n, min = 0, max = 2000)  # mm/year
elevation <- runif(n, min = 0, max = 4000)  # meters

# 1 = present, 0 = absent, based on temperature and precipitation thresholds
presence <- ifelse(
  temperature > 10 & temperature < 30 & precipitation > 500 & precipitation < 1500,
  1,
  0
)

# Combine everything into one data frame
sdm_data <- data.frame(
  Latitude = latitude,
  Longitude = longitude,
  Temperature = temperature,
  Precipitation = precipitation,
  Elevation = elevation,
  Presence = presence
)

The variable presence is set to 1 when temperature is between 10 and 30 °C and precipitation is between 500 and 1,500 mm/year, and to 0 otherwise. Latitude, longitude, and elevation are simulated but play no part in this rule.
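Because each variable is uniform and independent, the expected share of presences follows directly from the thresholds: the temperature window covers 20 of 50 degrees (0.4) and the precipitation window covers 1,000 of 2,000 mm (0.5), so about 0.4 × 0.5 = 0.2 of the points should be presences. The sketch below re-simulates the two relevant variables under separate, illustrative names and compares the observed share with that figure. The sample size of 10,000 is an assumption chosen only to make the estimate stable:

```r
set.seed(42)
n_check <- 10000  # larger than the guide's 100 points, for a stable estimate

temp_check   <- runif(n_check, min = -10, max = 40)
precip_check <- runif(n_check, min = 0, max = 2000)

# Same threshold rule as in the guide
presence_check <- as.integer(
  temp_check > 10 & temp_check < 30 &
    precip_check > 500 & precip_check < 1500
)

# Share of each variable's range that falls inside its window
expected_prevalence <- (30 - 10) / (40 - (-10)) * (1500 - 500) / (2000 - 0)
observed_prevalence <- mean(presence_check)

cat("Expected:", expected_prevalence, "\n")
cat("Observed:", round(observed_prevalence, 3), "\n")
```

With only 100 points, the count of presences will scatter more widely around 20.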

Step 3: Exploring the Dataset

Before building the model, inspect the dataset:

print(head(sdm_data))
summary(sdm_data)

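This step gives a first look at the data: head() shows the first six rows, and summary() reports the minimum, quartiles, mean, and maximum of each variable, which helps catch obvious problems such as out-of-range values.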
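A few explicit checks can complement head() and summary(). The sketch below is one possible set, run on a stand-in data frame (check_data, a hypothetical name) that has the same columns as sdm_data so that the block runs on its own:

```r
set.seed(42)

# Stand-in for sdm_data with the same column names as the guide
check_data <- data.frame(
  Latitude      = runif(100, min = -90, max = 90),
  Longitude     = runif(100, min = -180, max = 180),
  Temperature   = runif(100, min = -10, max = 40),
  Precipitation = runif(100, min = 0, max = 2000),
  Elevation     = runif(100, min = 0, max = 4000)
)
check_data$Presence <- as.integer(
  check_data$Temperature > 10 & check_data$Temperature < 30 &
    check_data$Precipitation > 500 & check_data$Precipitation < 1500
)

# Number of missing values in each column
na_counts <- colSums(is.na(check_data))

# Are all coordinates within valid geographic bounds?
coords_ok <- all(abs(check_data$Latitude) <= 90) &&
  all(abs(check_data$Longitude) <= 180)

# Is presence coded only as 0/1, and how many of each are there?
presence_ok <- all(check_data$Presence %in% c(0, 1))
presence_counts <- table(check_data$Presence)

print(na_counts)
print(coords_ok)
print(presence_ok)
print(presence_counts)
```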

Step 4: Visualizing Species Distribution

1. Map of Species Presence

The first plot maps species presence-absence across geographical coordinates:

ggplot(data = sdm_data, aes(x = Longitude, y = Latitude)) +
  geom_point(aes(color = factor(Presence)), size = 3) +
  scale_color_manual(values = c("red", "blue"),
                     labels = c("Absent", "Present"),
                     name = "Species Presence") +
  labs(title = "Species Distribution Model (SDM)",
       x = "Longitude",
       y = "Latitude") +
  theme_minimal()

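This plot shows species presence (blue) and absence (red) by longitude and latitude. Because the simulated environmental variables are independent of the coordinates, the points show no spatial pattern here. With real data, this kind of map gives an intuitive view of the species’ geographical range.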
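To see spatial structure in a simulated map, the environment has to depend on location. The sketch below is one illustrative way to do that and is not part of the original script: it assumes a made-up temperature gradient (30 °C at the equator, cooling by 0.5 °C per degree of latitude, with random noise) and then applies the same threshold rule. All variable names are hypothetical:

```r
set.seed(42)
n_pts <- 100
lat <- runif(n_pts, min = -90, max = 90)

# Assumed gradient (illustrative only): 30 °C at the equator, cooling by
# 0.5 °C per degree of latitude, plus normally distributed noise
temp_by_lat <- 30 - 0.5 * abs(lat) + rnorm(n_pts, sd = 3)
precip <- runif(n_pts, min = 0, max = 2000)

# Same threshold rule as in the guide
presence_by_lat <- as.integer(
  temp_by_lat > 10 & temp_by_lat < 30 & precip > 500 & precip < 1500
)

# Share of presences in three bands of absolute latitude
lat_band <- cut(abs(lat), breaks = c(0, 30, 60, 90), include.lowest = TRUE)
prevalence_by_band <- tapply(presence_by_lat, lat_band, mean)
print(round(prevalence_by_band, 2))
```

Under this assumed gradient, presences concentrate at low latitudes, so a map of these points would show a visible band rather than a random scatter.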

2. Environmental Variables vs. Species Presence

The second plot explores how environmental factors influence species presence:

ggplot(data = sdm_data, aes(x = Temperature, y = Precipitation)) +
  geom_point(aes(color = factor(Presence)), size = 3) +
  scale_color_manual(values = c("red", "blue"),
                     labels = c("Absent", "Present"),
                     name = "Species Presence") +
  labs(title = "Environmental Variables vs Species Presence",
       x = "Temperature (°C)",
       y = "Precipitation (mm/year)") +
  theme_minimal()

In this scatter plot, all presences fall inside the rectangle bounded by 10 to 30 °C and 500 to 1,500 mm/year, which is exactly the rule used to simulate them.
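The threshold window can also be drawn on the plot itself. The sketch below is a self-contained variant of the scatter plot that adds a dashed outline with annotate("rect"). The data frame name plot_data and the styling of the outline are choices made for this example:

```r
library(ggplot2)

set.seed(42)
plot_data <- data.frame(
  Temperature   = runif(100, min = -10, max = 40),
  Precipitation = runif(100, min = 0, max = 2000)
)
plot_data$Presence <- as.integer(
  plot_data$Temperature > 10 & plot_data$Temperature < 30 &
    plot_data$Precipitation > 500 & plot_data$Precipitation < 1500
)

p_box <- ggplot(plot_data, aes(x = Temperature, y = Precipitation)) +
  geom_point(aes(color = factor(Presence)), size = 3) +
  # Dashed outline of the simulated threshold window
  annotate("rect", xmin = 10, xmax = 30, ymin = 500, ymax = 1500,
           fill = NA, color = "black", linetype = "dashed") +
  scale_color_manual(values = c("0" = "red", "1" = "blue"),
                     labels = c("Absent", "Present"),
                     name = "Species Presence") +
  labs(x = "Temperature (°C)", y = "Precipitation (mm/year)") +
  theme_minimal()

print(p_box)
```

Naming the color values by factor level ("0" and "1") keeps the mapping correct even if one class happens to be absent from the data.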

Step 5: Preparing for Real-World Applications

For real-world SDMs, data is often imported from external sources like Excel files or databases. R provides robust tools to handle such data:

library(readxl)
SDM_dataset <- read_excel("SDM_dataset.xlsx")
View(SDM_dataset)

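This code reads an Excel file into a data frame with read_excel() from the readxl package and opens it in the data viewer. The file name SDM_dataset.xlsx is a placeholder for your own file, and View() is intended for interactive sessions such as RStudio.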
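Whatever the source, it helps to confirm that an imported table has the columns the workflow expects before going further. The sketch below uses a CSV file and base R so that it runs without readxl. It writes a tiny file of made-up example values to a temporary path, reads it back, and validates it. The required column names are taken from sdm_data, and everything else is an assumption for illustration:

```r
# Columns the guide's workflow relies on (names taken from sdm_data)
required_cols <- c("Latitude", "Longitude", "Temperature",
                   "Precipitation", "Elevation", "Presence")

# Write a tiny made-up example file so the sketch runs on its own
example_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(Latitude = c(10.5, -33.2), Longitude = c(20.1, 151.0),
             Temperature = c(25, 18), Precipitation = c(900, 1200),
             Elevation = c(300, 50), Presence = c(1, 1)),
  example_path, row.names = FALSE
)

imported <- read.csv(example_path)

# Stop early if any expected column is absent
missing_cols <- setdiff(required_cols, names(imported))
if (length(missing_cols) > 0) {
  stop("Missing columns: ", paste(missing_cols, collapse = ", "))
}

# Keep only complete rows with a valid 0/1 presence code
imported_clean <- imported[complete.cases(imported) &
                             imported$Presence %in% c(0, 1), ]
str(imported_clean)
```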

Benefits of Using R for SDMs

  • Flexibility: R supports diverse modeling techniques (e.g., logistic regression, MaxEnt).
  • Visualization: ggplot2 enables customized, publication-ready visualizations.
  • Reproducibility: R scripts, combined with a fixed random seed, make results repeatable and the analysis transparent.
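As a concrete instance of the logistic regression mentioned in the list above, the sketch below fits a presence-absence model with glm() from base R. It is a minimal illustration and not part of the original script. Presence is drawn at random from a smooth suitability curve rather than from the hard thresholds of Step 2, because a deterministic rule can be separated almost perfectly, which makes logistic regression estimates unstable. The curve, its 0.9 peak, the sample size of 500, and all variable names are assumptions of this sketch:

```r
set.seed(42)
n_fit <- 500  # assumed sample size for this sketch

fit_data <- data.frame(
  Temperature   = runif(n_fit, min = -10, max = 40),
  Precipitation = runif(n_fit, min = 0, max = 2000)
)

# Assumed suitability surface: highest near 20 °C and 1000 mm/year,
# falling off smoothly; presence is then drawn at random from it
suitability <- 0.9 * exp(-((fit_data$Temperature - 20) / 10)^2 -
                           ((fit_data$Precipitation - 1000) / 500)^2)
fit_data$Presence <- rbinom(n_fit, size = 1, prob = suitability)

# Quadratic terms let the fitted response peak at intermediate values
sdm_fit <- glm(
  Presence ~ poly(Temperature, 2) + poly(Precipitation, 2),
  family = binomial,
  data = fit_data
)

# Predicted probability of presence at a favourable and an unfavourable site
new_sites <- data.frame(Temperature = c(20, -5), Precipitation = c(1000, 100))
predicted <- predict(sdm_fit, newdata = new_sites, type = "response")
print(round(predicted, 3))
```

The quadratic terms are a common way to represent a species that prefers intermediate conditions. A model with only linear terms would force the predicted probability to rise or fall monotonically along each variable.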

Conclusion

Building a Species Distribution Model in R is an insightful way to explore species-environment relationships. By simulating data, visualizing patterns, and interpreting results, you can uncover ecological insights that are essential for conservation and environmental management.

The provided R script offers a foundation for understanding SDMs and can be easily adapted for real-world datasets. Whether you’re a beginner or an experienced ecologist, this guide is a stepping stone toward mastering SDMs in R.
