The primary aim of this hands-on exercise is to generate wall-to-wall raster products describing forest structural complexity for a selected area of interest (AOI) using airborne laser scanning (ALS) data. Participants will derive core terrain and canopy surfaces—Digital Terrain Model (DTM), Digital Surface Model (DSM), and Canopy Height Model (CHM)—and compute a set of structural complexity metrics that characterise vertical variability, vertical diversity, and horizontal heterogeneity of forest canopies.
The entire processing workflow is implemented in R, providing a fully reproducible and open-source environment for processing, analysing, and visualising airborne LiDAR data for forest and ecological applications.
The exercise follows a sequential, grid-based workflow. First, raw ALS point cloud data are read, inspected, and projected to a common coordinate reference system. ALS point clouds are normalized and classified. Forest structural complexity is then quantified on a regular grid by summarising the normalised LiDAR point cloud within each pixel. Multiple complementary metrics are computed to describe different dimensions of canopy structure, including vertical variability, vertical layering and diversity, and spatial heterogeneity. Together, these metrics provide a spatially explicit representation of forest structural complexity across the study area.
packages <- c(
"lidR",
"RCSF",
"terra",
"sf",
"dplyr",
"ggplot2",
"plot3D"
)
# install only missing packages
missing_pkgs <- packages[!packages %in% installed.packages()[, "Package"]]
if (length(missing_pkgs) > 0) {
install.packages(missing_pkgs, dependencies = TRUE)
}
# load packages quietly
suppressMessages(
lapply(packages, library, character.only = TRUE)
)LiDAR point clouds are commonly distributed in several formats, including LAS, LAZ, PLY, and others. Here, users can upload LiDAR datasets by selecting the desired file path. Once loaded, key LiDAR features such as coordinate system, point density, number of points, memory usage, and acquisition type are printed to support informed processing decisions.
# ============================================================
# 1. LiDAR File Selection
# ============================================================
las_path <- file.choose() # Select LAS/LAZ file interactively
# ============================================================
# 2. LiDAR Data Import with Fallback Reader
# ============================================================
las <- tryCatch(
readALSLAS(las_path),
error = function(e) {
message("readALSLAS() failed. Using lidR::readLAS() instead.")
lidR::readLAS(las_path)
}
) # Read LiDAR file with robust error handling
# ============================================================
# 3. Input Data Validation
# ============================================================
if (is.empty(las)) stop("LAS file is empty") # Stop if no points are present
# ============================================================
# 4. Coordinate Reference System Assignment
# ============================================================
#lidR::projection(las) <- sf::st_crs(32633) # Assign projected CRS (UTM zone 33N)
# ============================================================
# 5. LiDAR Object Inspection
# ============================================================
# ============================================================
# 6. Point Density and Point Spacing (Global Metrics)
# ============================================================
# Point density (pts/m²) represents the average number of LiDAR points
# per unit area over the entire dataset. By default, it considers ALL
# returns stored in the LAS file (first, intermediate, and last returns).
# Point spacing (m or cm) is the theoretical average distance between
# neighboring LiDAR points, derived from total area and total point count.
# It is a global summary metric and does NOT describe local point spacing
# or pulse spacing.
las
# cat("------------\n")
# A <- area(las)
# cat("Point spacing (cm):", sqrt(A/npoints(las))*100, "\n")
# cat("Point density (pts/m^2):", npoints(las)/A, "\n")This step extracts a small circular subset of the LiDAR point cloud and displays it in 3D to visually illustrate canopy structure, height variation, and point distribution before further processing.
Ground classification separates terrain points from vegetation and objects, essential for terrain models and normalising LiDAR heights. Ground points are identified using the Cloth Simulation Filter (CSF: Zhang et al., 2016), which simulates a virtual cloth over the inverted point cloud. Parameters control cloth resolution, rigidity, and convergence, adapting to terrain roughness. The classified point cloud offers a reliable ground surface for terrain and canopy analysis.
Height normalisation is the process of transforming raw LiDAR elevation values into heights relative to the ground surface. By removing the influence of terrain variability, this step enables vegetation and object heights to be compared consistently across areas with different elevations and slopes.
Several methods can be used for height normalisation, depending on data characteristics and terrain complexity. Common approaches include:
In this workflow, a TIN-based method is adopted because it provides a balanced and flexible representation of the terrain surface, particularly in areas with variable topography. This approach effectively preserves local ground variations while ensuring robust normalisation for canopy structure analysis.
# ============================================================
# 1. Height Normalization of LiDAR Point Cloud
# ============================================================
nlas <- normalize_height(las, tin()) # Normalize heights using TIN-based ground model
# ============================================================
# 2. Filtering of Non-Physical Points
# ============================================================
nlas <- filter_poi(nlas, Z >= 0 & Z <= quantile(Z, 0.99, na.rm = TRUE))In this step, forest structural complexity is mapped across the entire area of interest using a regular grid (e.g., 3 m). Metrics are computed directly from the height-normalised LiDAR point cloud (heights above ground) by summarising points within each grid cell using the pixel_metrics() function from the lidR package. These metrics capture complementary aspects of vertical variability, vertical complexity, and canopy rugosity, which are key descriptors of forest structure beyond simple canopy height or cover.
Height Standard Deviation (HSD) quantifies vertical variability within each grid cell and reflects the spread of canopy heights. Higher values indicate vertically heterogeneous or multi-layered canopies, while low values indicate uniform or simple vertical structure.
\[ HSD = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left[z_i - \bar{z}\right]^2} \qquad \text{(Eq1)} \]
where \(z_i\) are normalised LiDAR heights and \(\bar{z}\) is their mean.
-Vertical Complexity Index (VCI)
The Vertical Complexity Index (VCI) describes the evenness of the vertical distribution of canopy elements across height layers. Following the Shannon entropy formulation, VCI ranges from 0 to 1, where higher values indicate a more evenly distributed (vertically complex) canopy and lower values indicate concentration of vegetation within few height layers.
\[ VCI = \frac{-\sum_{h=1}^{H} p_h \ln\left[p_h\right]}{\ln\left[H\right]} \qquad \text{(Eq2)} \]
where \(p_h\) is the proportion of LiDAR returns in height layer \(h\), and \(H\) is the total number of occupied height layers within the pixel.
-Canopy Relief Ratio (CRR)
The Canopy Relief Ratio (CRR) measures relative canopy rugosity, describing how canopy mass is distributed between lower and upper canopy layers. CRR is dimensionless and ranges between 0 and 1. Values closer to 1 indicate canopy mass concentrated near the top of the canopy (top-heavy or vertically complex structure), while lower values indicate bottom-heavy or vertically uniform canopies.
\[ CRR = \frac{\bar{z} - z_{\min}}{z_{\max} - z_{\min}} \qquad \text{(Eq3)} \]
where \(\bar{z}\) is the mean canopy height, \(z_{\min}\) is the minimum height (ground level), and \(z_{\max}\) is the maximum canopy height within the pixel.
VCI_fun <- function(Z, dz = 1, zmin = 0.5, n_min = 10) {
# keep canopy points only (above zmin) and finite values
Z <- Z[is.finite(Z) & Z > zmin]
if (length(Z) < n_min) return(NA_real_)
zmax <- max(Z, na.rm = TRUE)
# if canopy heights fall in a single layer, VCI = 0
# (need at least 2 bins => at least 2 breaks)
if (!is.finite(zmax) || zmax < (zmin + dz)) return(0)
# build breaks (ensure >= 2 breaks)
brks <- seq(zmin, ceiling(zmax), by = dz)
if (length(brks) < 2) return(0)
bins <- cut(Z, breaks = brks, include.lowest = TRUE, right = FALSE)
p <- prop.table(table(bins))
p <- p[p > 0]
H <- length(p) # number of occupied height layers
if (H <= 1) return(0) # no vertical diversity
# normalized Shannon entropy (0..1)
as.numeric(-sum(p * log(p)) / log(H))
}
CRR_fun <- function(Z, baseline = 0, n_min = 10) {
Z <- Z[is.finite(Z)]
if (length(Z) < n_min) return(NA_real_)
zmax <- max(Z, na.rm = TRUE)
if (!is.finite(zmax) || zmax <= baseline) return(NA_real_)
as.numeric((mean(Z, na.rm = TRUE) - baseline) / (zmax - baseline))
}
struct_metrics <- function(Z) {
list(
HSD = sd(Z, na.rm = TRUE),
VCI = VCI_fun(Z),
CRR = CRR_fun(Z)
)
}
struct_stack <- pixel_metrics(
nlas,
~struct_metrics(Z),
res = 3,
progress = FALSE
)
names(struct_stack)## [1] "HSD" "VCI" "CRR"
This visualization displays the Digital Terrain Model (DTM), Digital Surface Model (DSM), and Canopy Height Model (CHM) side by side, allowing visual comparison of terrain elevation, surface structure, and vegetation height patterns across the study area.In addition, it displays canopy cover (CC) mapped at three height thresholds (≥ 1 m, ≥ 2 m, ≥ 3 m) using two complementary definitions: CC (all returns): proportion of all LiDAR returns within each pixel above the height threshold (captures overall vegetation occupancy through the canopy). CC (first returns): proportion of first returns only above the threshold (emphasises the upper canopy surface and canopy closure from above).
This step exports the final raster outputs as GeoTIFFs, including the Digital Terrain Model (DTM), Digital Surface Model (DSM), Canopy Height Model (CHM), and canopy cover maps (≥ 1 m, ≥ 2 m, ≥ 3 m) derived from both all returns and first returns, for further analysis and visualization.
out_dir <- file.path(dirname(las_path), "ForestSAT2026_02_Exercise")
dir.create(out_dir, showWarnings=FALSE)
terra::writeRaster(struct_stack[["HSD"]], file.path(out_dir, "HSD.tif"),overwrite = TRUE)
terra::writeRaster(struct_stack[["VCI"]], file.path(out_dir, "VCI.tif"), overwrite =TRUE)
terra::writeRaster(struct_stack[["CRR"]], file.path(out_dir, "CRR.tif"), overwrite =TRUE)1.ALS enables explicit mapping of forest structural complexity, not just height or cover.. Metrics like height standard deviation (HSD), vertical complexity index (VCI), and canopy relief ratio (CRR) measure various aspects of canopy structure, offering a richer description than mean height alone.
2.Different complexity metrics capture complementary structural properties. HSD measures vertical variability, VCI reflects the diversity and evenness of height layers, and CRR indicates biomass position within the canopy; together, they distinguish simple from complex, heterogeneous forests.
3.Forest complexity maps are highly sensitive to LiDAR preprocessing and scaling choices.Height normalization, noise filtering, grid resolution, and outlier handling heavily impact complexity metrics, so visual inspection and parameter testing are crucial to confirm ecologically meaningful patterns.