In this exercise, we will generate a wall-to-wall canopy height map for a selected AOI by combining ICESat-2 reference canopy height observations (RH98) with AlphaEarth embedding predictors and terrain covariates from Google Earth Engine (GEE). The workflow demonstrates how sparse spaceborne LiDAR samples can be used to train a machine-learning model and extrapolate canopy height continuously across space.
The primary aim of this hands-on exercise is to produce a wall-to-wall canopy height raster for a selected Area of Interest (AOI) using ICESat-2 as reference data and AlphaEarth embeddings + terrain predictors for spatial extrapolation.
Participants will learn how to: - Use ICESat2VegR to work with ICESat-2 data (ATL03 + ATL08). - Create canopy height reference data by computing RH98 at 20 m segments. - Integrate ICESat-2 reference data with GEE predictors (AlphaEarth embeddings + terrain). - Train and validate a Random Forest model in R. - Produce a wall-to-wall canopy height prediction in GEE and export a GeoTIFF. - This end-to-end workflow is implemented in R and relies on open tools, enabling reproducible canopy height mapping for regional applications.
The exercise follows a sequential model-based mapping workflow. First, ICESat-2 observations are used to derive reference canopy height information (RH98) at 20 m segments by combining the ATL03 photon data with the ATL08 land and vegetation classifications. These segments represent sparse but accurate measurements of canopy height along satellite tracks. Next, environmental predictor variables are obtained from Google Earth Engine, including AlphaEarth embedding layers and terrain attributes. Predictor values are extracted at the ICESat-2 segment locations to build a training dataset linking canopy height observations with explanatory variables. A Random Forest regression model is then trained and validated using the sampled ICESat-2 segments. Finally, the trained model is applied to the predictor mosaic to generate a continuous, wall-to-wall canopy height map across the entire study area. Together, these steps demonstrate how sparse spaceborne LiDAR measurements can be combined with environmental predictors and machine learning to produce spatially continuous forest structure maps.
This document starts at Step 1 of your workflow:
In R Markdown,
rm(list = ls())can break the document if used mid-way.
Since this Rmd starts here, we can safely clean at the beginning.
suppressPackageStartupMessages({
library(reticulate)
library(rgee)
library(sf)
library(terra)
library(data.table)
library(randomForest)
library(dplyr)
library(ggplot2)
library(caret)
library(openxlsx)
library(ICESat2VegR)
library(htmlwidgets)
})
print("Verify the python configuration ...")## [1] "Verify the python configuration ..."
## python: C:/Users/calvi/OneDrive/Documentos/.virtualenvs/r-reticulate/Scripts/python.exe
## libpython: C:/Users/calvi/AppData/Local/r-reticulate/r-reticulate/pyenv/pyenv-win/versions/3.9.13/python39.dll
## pythonhome: C:/Users/calvi/OneDrive/Documentos/.virtualenvs/r-reticulate
## version: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]
## Architecture: 64bit
## numpy: C:/Users/calvi/OneDrive/Documentos/.virtualenvs/r-reticulate/Lib/site-packages/numpy
## numpy_version: 1.24.3
## ee: C:\Users\calvi\OneDrive\DOCUME~1\VIRTUA~1\R-RETI~1\lib\site-packages\ee
##
## NOTE: Python version was forced by VIRTUAL_ENV
# Load your helper functions (must define ee_build_AlphaEarth_embedding_terrain_stack)
#if (!file.exists(BASE_R_PATH)) stop("BASE_R_PATH not found: ", BASE_R_PATH)
source("C:\\Users\\calvi\\OneDrive\\Desktop\\Workshop_ICESat2VegR\\base.R")
if (!exists("ee_build_AlphaEarth_embedding_terrain_stack")) {
stop("Missing ee_build_AlphaEarth_embedding_terrain_stack() after sourcing base.R")
}
print("Initializing Google Earth Engine...")## [1] "Initializing Google Earth Engine..."
reticulate::py_run_string("import ee")
reticulate::py_run_string("ee.Authenticate()")
reticulate::py_run_string('ee.Initialize(project="ee-calvites1990")')
# Load AOI
boundary <- st_read("C:\\Users\\calvi\\OneDrive\\Desktop\\Workshop_ICESat2VegR\\01_InputData\\aoi_4326.geojson")## Reading layer `aoi_4326' from data source
## `C:\Users\calvi\OneDrive\Desktop\Workshop_ICESat2VegR\01_InputData\aoi_4326.geojson'
## using driver `GeoJSON'
## Simple feature collection with 1 feature and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -84.71409 ymin: 30.12047 xmax: -84.6435 ymax: 30.17786
## Geodetic CRS: WGS 84
# Clean AOI geometry for EE
boundary_raw2_valid <- st_make_valid(boundary)
boundary_union <- st_union(boundary_raw2_valid)
boundary_simple <- st_simplify(boundary_union, dTolerance = 0.001)
region <- sf_as_ee(boundary_simple)
boundary_ee <- sf_as_ee(boundary_simple)
geom <- ee$FeatureCollection(boundary_ee)$geometry()
# Output dir
outdir <- "C:\\Users\\calvi\\OneDrive\\Desktop\\Workshop_ICESat2VegR\\02_OutputData"
dir.create(file.path(outdir, "01_STATS"), showWarnings = FALSE, recursive = TRUE)
cat("AOI loaded and EE geometry built.\n")## AOI loaded and EE geometry built.
# Load existing ICESat-2 training segments
data_raw <- read_sf("C:\\Users\\calvi\\OneDrive\\Desktop\\Workshop_ICESat2VegR\\02_OutputData\\02_SEG\\ATL03_ATL08_20250826092226_20.geojson")
class(data_raw)## [1] "sf" "tbl_df" "tbl" "data.frame"
set.seed(1)
df <- dplyr::slice_sample(data_raw, n = 4000, replace = TRUE)
i <- 1
yr <- 2025
# Build predictor stack and extract predictors at 20 m segments
predictors2 <- ee_build_AlphaEarth_embedding_terrain_stack(geom, yr, yr)
sampling_df <- ICESat2VegR::seg_ancillary_extract(predictors2, terra::vect(df), 20)
# Quick check
head(sampling_df)## A00 A01 A02 A03 A04 A05
## <num> <num> <num> <num> <num> <num>
## 1: 0.07972318 -0.11909266 -0.1137409 -0.108512111 -0.18608228 -0.33685506
## 2: 0.14173010 -0.10340638 -0.2069358 0.024605921 -0.13588620 0.07111111
## 3: 0.09356401 -0.06299116 -0.1358862 -0.013840830 -0.21413303 -0.26795848
## 4: 0.10851211 -0.15378700 -0.1600000 0.048227605 -0.14173010 0.00615148
## 5: 0.13588620 -0.01777778 -0.1034064 0.032541330 -0.09842368 0.03543253
## 6: 0.09356401 -0.08421376 -0.1245675 0.007443291 -0.21413303 -0.29287197
## A06 A07 A08 A09 A10 A11
## <num> <num> <num> <num> <num> <num>
## 1: 0.05536332 -0.051733948 -0.11909266 0.035432526 0.12456747 -0.051733948
## 2: 0.01384083 0.084213764 0.04158401 -0.022206844 0.03254133 0.103406382
## 3: 0.07972318 0.022206844 -0.06299116 0.013840830 0.08421376 0.008858131
## 4: -0.07972318 -0.015747789 0.02712803 -0.071111111 -0.02220684 0.103406382
## 5: -0.03844675 -0.008858131 0.08421376 -0.048227605 -0.04158401 0.141730104
## 6: 0.05173395 -0.006151480 -0.07535563 0.003014225 0.10340638 -0.015747789
## A12 A13 A14 A15 A16 A17
## <num> <num> <num> <num> <num> <num>
## 1: -0.1085121 -0.07972318 -0.1245674740 -0.07111111 -0.1663360 -0.12456747
## 2: -0.1417301 0.17937716 0.0797231834 -0.04822760 -0.1793772 -0.10340638
## 3: -0.1476970 -0.02712803 -0.0005536332 -0.08421376 -0.2441522 -0.11374087
## 4: -0.1476970 0.07972318 0.0629911572 -0.05536332 -0.2141330 -0.08882737
## 5: -0.1190927 0.15378700 0.0482276048 -0.02977316 -0.1929104 -0.09842368
## 6: -0.1537870 -0.05911572 -0.0448442907 -0.04822760 -0.2288966 -0.15378700
## A18 A19 A20 A21 A22 A23
## <num> <num> <num> <num> <num> <num>
## 1: -0.153787005 -0.019930796 -0.10340638 0.113740869 -0.17937716 0.13016532
## 2: 0.186082276 -0.012056901 -0.04158401 -0.075355632 0.05173395 -0.05911572
## 3: -0.003936947 -0.029773164 -0.07972318 0.041584006 -0.11909266 0.02977316
## 4: 0.153787005 -0.029773164 -0.07111111 -0.024605921 0.01384083 -0.04822760
## 5: 0.108512111 -0.079723183 -0.14173010 -0.004982699 -0.01205690 0.00615148
## 6: -0.062991157 -0.008858131 -0.11909266 0.066989619 -0.15378700 0.04822760
## A24 A25 A26 A27 A28 A29
## <num> <num> <num> <num> <num> <num>
## 1: 0.04822760 0.2214533 -0.06299116 0.05173395 0.05173395 0.1537870
## 2: -0.04822760 0.2844444 -0.02460592 0.06698962 -0.02460592 0.1663360
## 3: 0.03844675 0.2599000 -0.02460592 0.05536332 0.01384083 0.1476970
## 4: -0.17937716 0.3188927 0.01574779 0.08882737 0.04822760 0.2141330
## 5: -0.19291042 0.2679585 -0.04158401 0.07535563 0.06299116 0.2679585
## 6: 0.02977316 0.2441522 -0.05173395 0.04484429 0.02977316 0.1929104
## A30 A31 A32 A33 A34 A35
## <num> <num> <num> <num> <num> <num>
## 1: 0.08882737 -0.06698962 -0.1793772 0.01384083 0.00615148 -0.17937716
## 2: 0.20693579 0.07972318 -0.1663360 0.03844675 -0.15378700 0.03844675
## 3: 0.17279508 -0.01039600 -0.1929104 0.02460592 -0.10340638 -0.07111111
## 4: 0.17937716 0.04158401 -0.1537870 0.07111111 -0.11374087 0.01039600
## 5: 0.18608228 0.09842368 -0.1190927 0.05911572 -0.14769704 0.10851211
## 6: 0.11909266 0.01574779 -0.1727951 0.05536332 -0.07535563 -0.09356401
## A36 A37 A38 A39 A40 A41
## <num> <num> <num> <num> <num> <num>
## 1: -0.214133026 -0.03543253 0.1301653 -0.30142253 -0.09356401 0.04484429
## 2: -0.199861592 -0.01777778 0.2214533 -0.04158401 -0.15378700 0.07111111
## 3: -0.179377163 -0.06299116 0.1301653 -0.25990004 -0.14769704 0.09842368
## 4: 0.004982699 -0.02460592 0.1727951 -0.03543253 -0.04484429 0.10851211
## 5: -0.051733948 0.03844675 0.1600000 -0.05173395 0.01777778 0.18608228
## 6: -0.172795079 -0.04158401 0.1137409 -0.27613995 -0.13588620 0.11374087
## A42 A43 A44 A45 A46 A47
## <num> <num> <num> <num> <num> <num>
## 1: 0.1663360 -0.088827374 0.1137409 -0.004982699 0.108512111 -0.2364629
## 2: 0.2141330 -0.002214533 0.1137409 0.055363322 -0.003014225 -0.1085121
## 3: 0.1727951 -0.035432526 0.1476970 0.006151480 0.113740869 -0.1793772
## 4: 0.2761399 0.075355632 0.1537870 0.055363322 -0.062991157 -0.1301653
## 5: 0.2288966 -0.006151480 0.2069358 0.051733948 -0.098423683 -0.1929104
## 6: 0.1417301 -0.032541330 0.1417301 0.004982699 0.119092657 -0.2364629
## A48 A49 A50 A51 A52 A53
## <num> <num> <num> <num> <num> <num>
## 1: 0.2141330 -0.01993080 -0.1537870 -0.1358862 -0.04822760 0.038446751
## 2: 0.1034064 -0.13016532 -0.1663360 -0.2519646 -0.04158401 0.062991157
## 3: 0.2679585 -0.11909266 -0.1929104 -0.2214533 -0.03254133 0.088827374
## 4: 0.1929104 -0.19291042 -0.2141330 -0.2441522 -0.05911572 0.022206844
## 5: 0.1301653 -0.23646290 -0.2069358 -0.1417301 -0.02712803 -0.002214533
## 6: 0.2519646 -0.07535563 -0.1998616 -0.1860823 -0.07535563 0.088827374
## A54 A55 A56 A57 A58 A59
## <num> <num> <num> <num> <num> <num>
## 1: 0.08882737 0.093564014 0.03254133 -0.003936947 0.06698962 -0.05173395
## 2: 0.16633602 0.088827374 -0.13588620 0.236462899 0.14769704 -0.17937716
## 3: 0.09842368 0.119092657 -0.01039600 0.088827374 0.07972318 -0.12456747
## 4: 0.19291042 0.003014225 -0.10851211 0.172795079 0.13588620 -0.19291042
## 5: 0.14769704 0.055363322 -0.13588620 0.172795079 0.20693579 -0.19986159
## 6: 0.08882737 0.088827374 0.01039600 0.041584006 0.10340638 -0.09842368
## A60 A61 A62 A63 aspect beam elevation
## <num> <num> <num> <num> <num> <char> <num>
## 1: 0.032541330 -0.11909266 -0.06299116 0.10851211 936.1740 gt1l 2.963774
## 2: 0.088827374 -0.05536332 0.11374087 0.03254133 2343.6733 gt1r 6.954786
## 3: -0.013840830 -0.17937716 -0.03543253 0.09356401 1385.0271 gt1r 3.566396
## 4: 0.075355632 -0.03844675 0.02460592 0.06698962 3296.8274 gt1l 12.364920
## 5: -0.002214533 -0.01777778 0.07111111 0.04822760 258.9939 gt1l 12.045389
## 6: -0.006151480 -0.16633602 -0.05911572 0.09356401 672.9902 gt1l 3.080051
## lat latitude lon longitude mean_solar n_canopy_total n_ground
## <num> <num> <num> <num> <num> <int> <int>
## 1: 30.15402 30.15402 -84.67834 -84.67831 -21.63491 11 0
## 2: 30.14054 30.14063 -84.68068 -84.68071 -21.64105 21 0
## 3: 30.14755 30.14746 -84.67996 -84.67995 -21.63772 27 0
## 4: 30.12294 30.12298 -84.68176 -84.68178 -21.65006 3 0
## 5: 30.12869 30.12861 -84.68122 -84.68115 -21.64730 1 0
## 6: 30.14360 30.14360 -84.67942 -84.67947 -21.64000 3 0
## n_mid_canopy n_top_canopy night_flag2 rh98 segment_id slope year
## <int> <int> <int> <num> <int> <num> <int>
## 1: 10 1 1 14.7535931 163810 2.439643 2025
## 2: 14 7 1 34.2744095 332590 4.166642 2025
## 3: 21 6 1 21.3566180 332552 10.272169 2025
## 4: 3 0 1 14.3197423 163983 5.123228 2025
## 5: 1 0 1 0.5029621 163952 2.330704 2025
## 6: 2 1 1 16.9825489 163868 4.446283 2025
## [1] "A00" "A01" "A02" "A03"
## [5] "A04" "A05" "A06" "A07"
## [9] "A08" "A09" "A10" "A11"
## [13] "A12" "A13" "A14" "A15"
## [17] "A16" "A17" "A18" "A19"
## [21] "A20" "A21" "A22" "A23"
## [25] "A24" "A25" "A26" "A27"
## [29] "A28" "A29" "A30" "A31"
## [33] "A32" "A33" "A34" "A35"
## [37] "A36" "A37" "A38" "A39"
## [41] "A40" "A41" "A42" "A43"
## [45] "A44" "A45" "A46" "A47"
## [49] "A48" "A49" "A50" "A51"
## [53] "A52" "A53" "A54" "A55"
## [57] "A56" "A57" "A58" "A59"
## [61] "A60" "A61" "A62" "A63"
## [65] "aspect" "beam" "elevation" "lat"
## [69] "latitude" "lon" "longitude" "mean_solar"
## [73] "n_canopy_total" "n_ground" "n_mid_canopy" "n_top_canopy"
## [77] "night_flag2" "rh98" "segment_id" "slope"
## [81] "year"
x <- sampling_df %>% dplyr::select(starts_with("A"), "slope", "aspect", "elevation") %>% data.frame()
y <- sampling_df %>% dplyr::select("rh98") %>% data.frame()
sel_rf_rfe <- ICESat2VegR::varSel(x, y$rh98)
imp <- sel_rf_rfe$importance %>%
mutate(selected = parameter %in% sel_rf_rfe$selvars) %>%
arrange(importance) %>%
mutate(parameter = factor(parameter, levels = parameter))
ggplot(imp, aes(x = importance, y = parameter, shape = selected)) +
geom_point(size = 3) +
labs(
x = paste0("Importance (", sel_rf_rfe$scaling, " scale)"),
y = NULL,
title = "Variable importance (scaled)",
subtitle = "Selected variables highlighted"
) +
theme_minimal(base_size = 12)best_metrics <- sel_rf_rfe$importance$parameter[sel_rf_rfe$importance$importance >= 0.2]
length(best_metrics)## [1] 21
x <- dplyr::select(sampling_df, dplyr::any_of(best_metrics))
y <- sampling_df$rh98
data_i <- data.frame(y, x, check.names = FALSE)
idx_train <- createDataPartition(y = data_i$y, p = 0.7, list = FALSE)
trainData <- data_i[idx_train, , drop = FALSE]
testData <- data_i[-idx_train, , drop = FALSE]
x_train <- dplyr::select(trainData, dplyr::any_of(best_metrics))
y_train <- trainData$y
x_test <- dplyr::select(testData, dplyr::any_of(best_metrics))
y_test <- testData$y
fit_rf <- ICESat2VegR::fit_model(
x = x_train,
y = y_train,
rf_args = list(ntree = 100)
)
rf_model <- fit_rf$model
pred_train <- predict(rf_model, newdata = x_train)
pred_test <- predict(rf_model, newdata = x_test)
results_train <- compute_stats(y_train, as.numeric(pred_train))
results_test <- compute_stats(y_test, as.numeric(pred_test))
dataset_train <- data.frame(part = "train", obs = y_train, pred = as.numeric(pred_train))
dataset_test <- data.frame(part = "test", obs = y_test, pred = as.numeric(pred_test))
preds <- rbind(dataset_train, dataset_test)
out_list <- list(
results_train = results_train,
results_test = results_test,
preds = preds
)
OUTDIR <- "C:/Users/calvi/OneDrive/Desktop/Workshop_ICESat2VegR/02_OutputData"
outdir_STATS <- file.path(OUTDIR, "01_STATS")
dir.create(outdir_STATS, showWarnings = FALSE, recursive = TRUE)
write.xlsx(
out_list,
file = file.path(outdir_STATS, "rf_accuracy.xlsx"),
overwrite = TRUE
)
cat("Accuracy exported to:", outdir_STATS, "\n")## Accuracy exported to: C:/Users/calvi/OneDrive/Desktop/Workshop_ICESat2VegR/02_OutputData/01_STATS
## r2 rmse rmse_pct
## 1 0.9556868 1.298237 6.234385
gee_rf_model <- suppressMessages(ICESat2VegR:::build_ee_forest(rf_model))
yr <- 2025
mosaic <- ee_build_AlphaEarth_embedding_terrain_stack(geom, yr, yr)
ch_map <- mosaic$classify(gee_rf_model)$rename("CH")$unmask(0)
mm <- ch_map$reduceRegion(
reducer = ee$Reducer$minMax(),
geometry = geom,
scale = 25,
maxPixels = 1e9
)$getInfo()
print(mm)## $CH_max
## [1] 34.65489
##
## $CH_min
## [1] 1.094095
ch_palette <- c(
"#440154FF","#482878FF","#3E4989FF","#31688EFF","#26828EFF",
"#1F9E89FF","#35B779FF","#6DCD59FF","#B4DE2CFF","#FDE725FF"
)
m <- ICESat2VegR::map_view(
layers = list(
list(
type = "ee_image",
x = ch_map,
bands = "CH",
aoi = geom,
group = "Canopy Height",
min_value = 0,
max_value = 40,
palette = ch_palette,
legend = list(title = "Canopy Height (m)", auto = "quantile")
)
),
base_tiles = "Carto.Light",
add_layers_control = TRUE,
add_opacity_controls = TRUE,
fit_to = geom
)
mOUTDIR <- "C:/Users/calvi/OneDrive/Desktop/Workshop_ICESat2VegR/02_OutputData"
MAP_HTML <- file.path(OUTDIR, "canopy_height_map.html")
htmlwidgets::saveWidget(m, MAP_HTML, selfcontained = TRUE)
cat("Interactive HTML saved:", MAP_HTML, "\n")## Interactive HTML saved: C:/Users/calvi/OneDrive/Desktop/Workshop_ICESat2VegR/02_OutputData/canopy_height_map.html
# map_download(
# ch_map,
# method = c("drive"))
#
cat("GeoTIFF export started to Google Drive: ...", "\n")## GeoTIFF export started to Google Drive: ...