Accessing ICESat-2 Data#
Learning Objectives
Use icepyx to search, download, and read ICESat-2 granules
Use sliderule to get GeoDataFrames of ICESat-2 data
Use h5coro to directly read ICESat-2 granules in an S3 bucket
Part 1: icepyx#
icepyx is a community and software library for searching, downloading, and reading ICESat-2 data. While opening data should be straightforward, there are some oddities in navigating the highly nested organization and hundreds of variables of the ICESat-2 data. icepyx provides tools to help with those oddities.
icepyx was started and initially developed by Jessica Scheick to provide easy programmatic access to ICESat-2 data (before earthaccess existed!) and facilitate collaborative development around ICESat-2 data products, including training, skill building, and support around practicing open science and contributing to open-source software. Thanks to contributions from countless community members, icepyx can (for ICESat-2 data):
search for available data granules (data files)
order and download data or access it directly in the cloud
order a subset of data: clipped in space, time, containing fewer variables, or a few other options provided by NSIDC
search through the available ICESat-2 data variables
read ICESat-2 data into xarray Datasets, including merging data from multiple files
Under the hood, icepyx relies on earthaccess to help handle authentication, especially for obtaining S3 tokens to access ICESat-2 data in the cloud. All this happens without the user needing to take any action other than supplying their Earthdata Login credentials using one of the methods described in the earthaccess tutorial.
Credit#
This part of the notebook is based on an icepyx Tutorial originally created by Rachel Wegener, Univ. Maryland and updated by Amy Steiker, NSIDC, and Jessica Scheick, Univ. of New Hampshire. It was updated in May 2024 to utilize (at a minimum) v1.0.0 of icepyx.
For the original notebook, which includes additional examples and information, see: https://book.cryointhecloud.com/tutorials/NASA-Earthdata-Cloud-Access/4.icepyx.html
For more information#
GitHub: icesat2py/icepyx
Documentation: https://icepyx.readthedocs.io/en/latest/
Prerequisites#
An Earthdata Login account.
A .netrc file in your home directory that contains your Earthdata Login credentials.
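Before running the cells below, it can help to confirm that the .netrc file actually holds an entry for Earthdata Login. The following is a minimal sketch using only the Python standard library; the helper name `has_earthdata_entry` is hypothetical, and the host `urs.earthdata.nasa.gov` is assumed to be the machine name used for Earthdata Login entries. To stay self-contained, it checks a throwaway file rather than your real credentials.

```python
import netrc
import os
import tempfile

# Assumed machine name for Earthdata Login entries in a .netrc file
EDL_HOST = "urs.earthdata.nasa.gov"


def has_earthdata_entry(netrc_path):
    """Return True if the netrc file has a login and password for EDL_HOST."""
    try:
        auth = netrc.netrc(netrc_path).authenticators(EDL_HOST)
    except (FileNotFoundError, netrc.NetrcParseError):
        # Missing or malformed file: treat as "no usable credentials"
        return False
    # authenticators() returns (login, account, password) or None
    return auth is not None and bool(auth[0]) and bool(auth[2])


# Demonstrate with a temporary file instead of the real ~/.netrc
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc")
    with open(demo_path, "w") as fh:
        fh.write(f"machine {EDL_HOST} login demo_user password demo_pass\n")
    demo_ok = has_earthdata_entry(demo_path)
    missing_ok = has_earthdata_entry(os.path.join(tmp, "does_not_exist"))

print(demo_ok, missing_ok)
```

To check your own setup, you could call the helper with `os.path.expanduser("~/.netrc")`.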
import icepyx as ipx
import json
import math
import warnings
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from shapely.geometry import shape, GeometryCollection
Example 1: Search and Download ATL08 Granule#
# Open a geojson of our area of interest
with open("./grandmesa.geojson") as f:
    features = json.load(f)["features"]
grandmesa = GeometryCollection([shape(feature["geometry"]).buffer(0) for feature in features])
grandmesa
# Use our search parameters to set up a search Query
short_name = 'ATL08'
spatial_extent = list(grandmesa.bounds)
date_range = ['2019-12-01','2019-12-12']
region = ipx.Query(short_name, spatial_extent, date_range)
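The spatial_extent passed to the query above is a four-element bounding box taken from the geometry's bounds. As a rough illustration of what that list contains, here is a sketch that derives the same kind of box from GeoJSON-style coordinates without shapely. The polygon is a hypothetical rectangle, not the real Grand Mesa outline, and the `bounding_box` helper is made up for this example; the [min lon, min lat, max lon, max lat] order mirrors what shapely's `bounds` returns.

```python
# Hypothetical GeoJSON-style polygon (lon, lat pairs); not the real Grand Mesa outline
polygon = {
    "type": "Polygon",
    "coordinates": [[
        [-108.3, 38.8], [-107.7, 38.8], [-107.7, 39.2],
        [-108.3, 39.2], [-108.3, 38.8],
    ]],
}


def bounding_box(geometry):
    """Return [min_lon, min_lat, max_lon, max_lat] for a GeoJSON Polygon."""
    # Flatten all rings (outer boundary and any holes) into one list of points
    ring_points = [pt for ring in geometry["coordinates"] for pt in ring]
    lons = [pt[0] for pt in ring_points]
    lats = [pt[1] for pt in ring_points]
    return [min(lons), min(lats), max(lons), max(lats)]


spatial_extent_demo = bounding_box(polygon)
print(spatial_extent_demo)
```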
# Display if any data files, or granules, matched our search
region.avail_granules(ids=True)
[['ATL08_20191211143520_11560506_006_01.h5']]
# We can also get the S3 urls
print(region.avail_granules(ids=True, cloud=True))
s3urls = region.avail_granules(ids=True, cloud=True)[1]
[['ATL08_20191211143520_11560506_006_01.h5'], ['s3://nsidc-cumulus-prod-protected/ATLAS/ATL08/006/2019/12/11/ATL08_20191211143520_11560506_006_01.h5']]
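The granule IDs returned above encode metadata in the file name. The sketch below pulls those fields apart with a regular expression. The field meanings (start time, reference ground track, cycle, region, release, revision) follow the ICESat-2 file naming convention as I understand it, so treat the labels as an assumption and check them against the product documentation; `parse_granule_name` is a hypothetical helper, not part of icepyx.

```python
import re
from datetime import datetime

# Pattern for names like ATL08_20191211143520_11560506_006_01.h5
GRANULE_RE = re.compile(
    r"(?P<product>ATL\d{2})_(?P<start>\d{14})_"
    r"(?P<rgt>\d{4})(?P<cycle>\d{2})(?P<region>\d{2})_"
    r"(?P<release>\d{3})_(?P<revision>\d{2})\.h5$"
)


def parse_granule_name(name):
    """Split a granule file name (or a path/URL ending in one) into fields."""
    match = GRANULE_RE.search(name)
    if match is None:
        raise ValueError(f"Not a recognized granule name: {name!r}")
    fields = match.groupdict()
    return {
        "product": fields["product"],
        "start": datetime.strptime(fields["start"], "%Y%m%d%H%M%S"),
        "rgt": int(fields["rgt"]),          # reference ground track
        "cycle": int(fields["cycle"]),
        "region": int(fields["region"]),
        "release": fields["release"],       # kept as a string, e.g. "006"
        "revision": int(fields["revision"]),
    }


info = parse_granule_name("ATL08_20191211143520_11560506_006_01.h5")
print(info)
```

Because the pattern is searched rather than matched from the start, the same helper also accepts the full S3 URL shown above.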
# Download the granules into a folder called '/tmp/grandmesa_ATL08'
region.download_granules('/tmp/grandmesa_ATL08')
Total number of data order requests is 1 for 1 granules.
Data request 1 of 1 is submitting to NSIDC
order ID: 5000005727224
Initial status of your order request at NSIDC is: processing
Your order status is still processing at NSIDC. Please continue waiting... this may take a few moments.
Your order is: complete
Beginning download of zipped output...
Data request 5000005727224 of 1 order(s) is downloaded.
Download complete
Example 2: Reading a Granule with icepyx#
To read a file with icepyx there are several steps:
Create a Read object. This sets up an initial connection to your file(s) and validates the metadata.
Tell the Read object what variables you would like to read.
Load your data!
Create a Read object#
# access the file you've downloaded
reader = ipx.Read('/tmp/grandmesa_ATL08')
reader
<icepyx.core.read.Read at 0x7f1eb3fb7050>
Explore your variables#
reader.vars.avail()
['ancillary_data/atlas_sdp_gps_epoch',
'ancillary_data/control',
'ancillary_data/data_end_utc',
'ancillary_data/data_start_utc',
'ancillary_data/end_cycle',
'ancillary_data/end_delta_time',
'ancillary_data/end_geoseg',
'ancillary_data/end_gpssow',
'ancillary_data/end_gpsweek',
'ancillary_data/end_orbit',
'ancillary_data/end_region',
'ancillary_data/end_rgt',
'ancillary_data/granule_end_utc',
'ancillary_data/granule_start_utc',
'ancillary_data/land/atl08_region',
'ancillary_data/land/bin_size_h',
'ancillary_data/land/bin_size_n',
'ancillary_data/land/bright_thresh',
'ancillary_data/land/ca_class',
'ancillary_data/land/can_noise_thresh',
'ancillary_data/land/can_stat_thresh',
'ancillary_data/land/canopy20m_thresh',
'ancillary_data/land/canopy_flag_switch',
'ancillary_data/land/canopy_seg',
'ancillary_data/land/class_thresh',
'ancillary_data/land/cloud_filter_switch',
'ancillary_data/land/del_amp',
'ancillary_data/land/del_mu',
'ancillary_data/land/del_sigma',
'ancillary_data/land/dem_filter_switch',
'ancillary_data/land/dem_removal_percent_limit',
'ancillary_data/land/dragann_switch',
'ancillary_data/land/dseg',
'ancillary_data/land/dseg_buf',
'ancillary_data/land/fnlgnd_filter_switch',
'ancillary_data/land/gnd_stat_thresh',
'ancillary_data/land/gthresh_factor',
'ancillary_data/land/h_canopy_perc',
'ancillary_data/land/iter_gnd',
'ancillary_data/land/iter_max',
'ancillary_data/land/lseg',
'ancillary_data/land/lseg_buf',
'ancillary_data/land/lw_filt_bnd',
'ancillary_data/land/lw_gnd_bnd',
'ancillary_data/land/lw_toc_bnd',
'ancillary_data/land/lw_toc_cut',
'ancillary_data/land/max_atl03files',
'ancillary_data/land/max_atl09files',
'ancillary_data/land/max_peaks',
'ancillary_data/land/max_try',
'ancillary_data/land/min_nphs',
'ancillary_data/land/n_dec_mode',
'ancillary_data/land/night_thresh',
'ancillary_data/land/noise_class',
'ancillary_data/land/outlier_filter_switch',
'ancillary_data/land/p_static',
'ancillary_data/land/ph_removal_percent_limit',
'ancillary_data/land/proc_geoseg',
'ancillary_data/land/psf',
'ancillary_data/land/ref_dem_limit',
'ancillary_data/land/ref_finalground_limit',
'ancillary_data/land/relief_hbot',
'ancillary_data/land/relief_htop',
'ancillary_data/land/shp_param',
'ancillary_data/land/sig_rsq_search',
'ancillary_data/land/sseg',
'ancillary_data/land/stat20m_thresh',
'ancillary_data/land/stat_thresh',
'ancillary_data/land/tc_thresh',
'ancillary_data/land/te_class',
'ancillary_data/land/terrain20m_thresh',
'ancillary_data/land/toc_class',
'ancillary_data/land/up_filt_bnd',
'ancillary_data/land/up_gnd_bnd',
'ancillary_data/land/up_toc_bnd',
'ancillary_data/land/up_toc_cut',
'ancillary_data/land/yapc_switch',
'ancillary_data/qa_at_interval',
'ancillary_data/release',
'ancillary_data/start_cycle',
'ancillary_data/start_delta_time',
'ancillary_data/start_geoseg',
'ancillary_data/start_gpssow',
'ancillary_data/start_gpsweek',
'ancillary_data/start_orbit',
'ancillary_data/start_region',
'ancillary_data/start_rgt',
'ancillary_data/version',
'ds_geosegments',
'ds_metrics',
'ds_surf_type',
'gt1l/land_segments/asr',
'gt1l/land_segments/atlas_pa',
'gt1l/land_segments/beam_azimuth',
'gt1l/land_segments/beam_coelev',
'gt1l/land_segments/brightness_flag',
'gt1l/land_segments/canopy/can_noise',
'gt1l/land_segments/canopy/canopy_h_metrics',
'gt1l/land_segments/canopy/canopy_h_metrics_abs',
'gt1l/land_segments/canopy/canopy_openness',
'gt1l/land_segments/canopy/canopy_rh_conf',
'gt1l/land_segments/canopy/centroid_height',
'gt1l/land_segments/canopy/h_canopy',
'gt1l/land_segments/canopy/h_canopy_20m',
'gt1l/land_segments/canopy/h_canopy_abs',
'gt1l/land_segments/canopy/h_canopy_quad',
'gt1l/land_segments/canopy/h_canopy_uncertainty',
'gt1l/land_segments/canopy/h_dif_canopy',
'gt1l/land_segments/canopy/h_max_canopy',
'gt1l/land_segments/canopy/h_max_canopy_abs',
'gt1l/land_segments/canopy/h_mean_canopy',
'gt1l/land_segments/canopy/h_mean_canopy_abs',
'gt1l/land_segments/canopy/h_median_canopy',
'gt1l/land_segments/canopy/h_median_canopy_abs',
'gt1l/land_segments/canopy/h_min_canopy',
'gt1l/land_segments/canopy/h_min_canopy_abs',
'gt1l/land_segments/canopy/n_ca_photons',
'gt1l/land_segments/canopy/n_toc_photons',
'gt1l/land_segments/canopy/photon_rate_can',
'gt1l/land_segments/canopy/photon_rate_can_nr',
'gt1l/land_segments/canopy/segment_cover',
'gt1l/land_segments/canopy/subset_can_flag',
'gt1l/land_segments/canopy/toc_roughness',
'gt1l/land_segments/cloud_flag_atm',
'gt1l/land_segments/cloud_fold_flag',
'gt1l/land_segments/delta_time',
'gt1l/land_segments/delta_time_beg',
'gt1l/land_segments/delta_time_end',
'gt1l/land_segments/dem_flag',
'gt1l/land_segments/dem_h',
'gt1l/land_segments/dem_removal_flag',
'gt1l/land_segments/h_dif_ref',
'gt1l/land_segments/last_seg_extend',
'gt1l/land_segments/latitude',
'gt1l/land_segments/latitude_20m',
'gt1l/land_segments/layer_flag',
'gt1l/land_segments/longitude',
'gt1l/land_segments/longitude_20m',
'gt1l/land_segments/msw_flag',
'gt1l/land_segments/n_seg_ph',
'gt1l/land_segments/night_flag',
'gt1l/land_segments/ph_ndx_beg',
'gt1l/land_segments/ph_removal_flag',
'gt1l/land_segments/psf_flag',
'gt1l/land_segments/rgt',
'gt1l/land_segments/sat_flag',
'gt1l/land_segments/segment_id_beg',
'gt1l/land_segments/segment_id_end',
'gt1l/land_segments/segment_landcover',
'gt1l/land_segments/segment_snowcover',
'gt1l/land_segments/segment_watermask',
'gt1l/land_segments/sigma_across',
'gt1l/land_segments/sigma_along',
'gt1l/land_segments/sigma_atlas_land',
'gt1l/land_segments/sigma_h',
'gt1l/land_segments/sigma_topo',
'gt1l/land_segments/snr',
'gt1l/land_segments/solar_azimuth',
'gt1l/land_segments/solar_elevation',
'gt1l/land_segments/surf_type',
'gt1l/land_segments/terrain/h_te_best_fit',
'gt1l/land_segments/terrain/h_te_best_fit_20m',
'gt1l/land_segments/terrain/h_te_interp',
'gt1l/land_segments/terrain/h_te_max',
'gt1l/land_segments/terrain/h_te_mean',
'gt1l/land_segments/terrain/h_te_median',
'gt1l/land_segments/terrain/h_te_min',
'gt1l/land_segments/terrain/h_te_mode',
'gt1l/land_segments/terrain/h_te_rh25',
'gt1l/land_segments/terrain/h_te_skew',
'gt1l/land_segments/terrain/h_te_std',
'gt1l/land_segments/terrain/h_te_uncertainty',
'gt1l/land_segments/terrain/n_te_photons',
'gt1l/land_segments/terrain/photon_rate_te',
'gt1l/land_segments/terrain/subset_te_flag',
'gt1l/land_segments/terrain/terrain_slope',
'gt1l/land_segments/terrain_flg',
'gt1l/land_segments/urban_flag',
'gt1l/signal_photons/classed_pc_flag',
'gt1l/signal_photons/classed_pc_indx',
'gt1l/signal_photons/d_flag',
'gt1l/signal_photons/delta_time',
'gt1l/signal_photons/ph_h',
'gt1l/signal_photons/ph_segment_id',
'gt1r/land_segments/asr',
'gt1r/land_segments/atlas_pa',
'gt1r/land_segments/beam_azimuth',
'gt1r/land_segments/beam_coelev',
'gt1r/land_segments/brightness_flag',
'gt1r/land_segments/canopy/can_noise',
'gt1r/land_segments/canopy/canopy_h_metrics',
'gt1r/land_segments/canopy/canopy_h_metrics_abs',
'gt1r/land_segments/canopy/canopy_openness',
'gt1r/land_segments/canopy/canopy_rh_conf',
'gt1r/land_segments/canopy/centroid_height',
'gt1r/land_segments/canopy/h_canopy',
'gt1r/land_segments/canopy/h_canopy_20m',
'gt1r/land_segments/canopy/h_canopy_abs',
'gt1r/land_segments/canopy/h_canopy_quad',
'gt1r/land_segments/canopy/h_canopy_uncertainty',
'gt1r/land_segments/canopy/h_dif_canopy',
'gt1r/land_segments/canopy/h_max_canopy',
'gt1r/land_segments/canopy/h_max_canopy_abs',
'gt1r/land_segments/canopy/h_mean_canopy',
'gt1r/land_segments/canopy/h_mean_canopy_abs',
'gt1r/land_segments/canopy/h_median_canopy',
'gt1r/land_segments/canopy/h_median_canopy_abs',
'gt1r/land_segments/canopy/h_min_canopy',
'gt1r/land_segments/canopy/h_min_canopy_abs',
'gt1r/land_segments/canopy/n_ca_photons',
'gt1r/land_segments/canopy/n_toc_photons',
'gt1r/land_segments/canopy/photon_rate_can',
'gt1r/land_segments/canopy/photon_rate_can_nr',
'gt1r/land_segments/canopy/segment_cover',
'gt1r/land_segments/canopy/subset_can_flag',
'gt1r/land_segments/canopy/toc_roughness',
'gt1r/land_segments/cloud_flag_atm',
'gt1r/land_segments/cloud_fold_flag',
'gt1r/land_segments/delta_time',
'gt1r/land_segments/delta_time_beg',
'gt1r/land_segments/delta_time_end',
'gt1r/land_segments/dem_flag',
'gt1r/land_segments/dem_h',
'gt1r/land_segments/dem_removal_flag',
'gt1r/land_segments/h_dif_ref',
'gt1r/land_segments/last_seg_extend',
'gt1r/land_segments/latitude',
'gt1r/land_segments/latitude_20m',
'gt1r/land_segments/layer_flag',
'gt1r/land_segments/longitude',
'gt1r/land_segments/longitude_20m',
'gt1r/land_segments/msw_flag',
'gt1r/land_segments/n_seg_ph',
'gt1r/land_segments/night_flag',
'gt1r/land_segments/ph_ndx_beg',
'gt1r/land_segments/ph_removal_flag',
'gt1r/land_segments/psf_flag',
'gt1r/land_segments/rgt',
'gt1r/land_segments/sat_flag',
'gt1r/land_segments/segment_id_beg',
'gt1r/land_segments/segment_id_end',
'gt1r/land_segments/segment_landcover',
'gt1r/land_segments/segment_snowcover',
'gt1r/land_segments/segment_watermask',
'gt1r/land_segments/sigma_across',
'gt1r/land_segments/sigma_along',
'gt1r/land_segments/sigma_atlas_land',
'gt1r/land_segments/sigma_h',
'gt1r/land_segments/sigma_topo',
'gt1r/land_segments/snr',
'gt1r/land_segments/solar_azimuth',
'gt1r/land_segments/solar_elevation',
'gt1r/land_segments/surf_type',
'gt1r/land_segments/terrain/h_te_best_fit',
'gt1r/land_segments/terrain/h_te_best_fit_20m',
'gt1r/land_segments/terrain/h_te_interp',
'gt1r/land_segments/terrain/h_te_max',
'gt1r/land_segments/terrain/h_te_mean',
'gt1r/land_segments/terrain/h_te_median',
'gt1r/land_segments/terrain/h_te_min',
'gt1r/land_segments/terrain/h_te_mode',
'gt1r/land_segments/terrain/h_te_rh25',
'gt1r/land_segments/terrain/h_te_skew',
'gt1r/land_segments/terrain/h_te_std',
'gt1r/land_segments/terrain/h_te_uncertainty',
'gt1r/land_segments/terrain/n_te_photons',
'gt1r/land_segments/terrain/photon_rate_te',
'gt1r/land_segments/terrain/subset_te_flag',
'gt1r/land_segments/terrain/terrain_slope',
'gt1r/land_segments/terrain_flg',
'gt1r/land_segments/urban_flag',
'gt1r/signal_photons/classed_pc_flag',
'gt1r/signal_photons/classed_pc_indx',
'gt1r/signal_photons/d_flag',
'gt1r/signal_photons/delta_time',
'gt1r/signal_photons/ph_h',
'gt1r/signal_photons/ph_segment_id',
'gt2l/land_segments/asr',
'gt2l/land_segments/atlas_pa',
'gt2l/land_segments/beam_azimuth',
'gt2l/land_segments/beam_coelev',
'gt2l/land_segments/brightness_flag',
'gt2l/land_segments/canopy/can_noise',
'gt2l/land_segments/canopy/canopy_h_metrics',
'gt2l/land_segments/canopy/canopy_h_metrics_abs',
'gt2l/land_segments/canopy/canopy_openness',
'gt2l/land_segments/canopy/canopy_rh_conf',
'gt2l/land_segments/canopy/centroid_height',
'gt2l/land_segments/canopy/h_canopy',
'gt2l/land_segments/canopy/h_canopy_20m',
'gt2l/land_segments/canopy/h_canopy_abs',
'gt2l/land_segments/canopy/h_canopy_quad',
'gt2l/land_segments/canopy/h_canopy_uncertainty',
'gt2l/land_segments/canopy/h_dif_canopy',
'gt2l/land_segments/canopy/h_max_canopy',
'gt2l/land_segments/canopy/h_max_canopy_abs',
'gt2l/land_segments/canopy/h_mean_canopy',
'gt2l/land_segments/canopy/h_mean_canopy_abs',
'gt2l/land_segments/canopy/h_median_canopy',
'gt2l/land_segments/canopy/h_median_canopy_abs',
'gt2l/land_segments/canopy/h_min_canopy',
'gt2l/land_segments/canopy/h_min_canopy_abs',
'gt2l/land_segments/canopy/n_ca_photons',
'gt2l/land_segments/canopy/n_toc_photons',
'gt2l/land_segments/canopy/photon_rate_can',
'gt2l/land_segments/canopy/photon_rate_can_nr',
'gt2l/land_segments/canopy/segment_cover',
'gt2l/land_segments/canopy/subset_can_flag',
'gt2l/land_segments/canopy/toc_roughness',
'gt2l/land_segments/cloud_flag_atm',
'gt2l/land_segments/cloud_fold_flag',
'gt2l/land_segments/delta_time',
'gt2l/land_segments/delta_time_beg',
'gt2l/land_segments/delta_time_end',
'gt2l/land_segments/dem_flag',
'gt2l/land_segments/dem_h',
'gt2l/land_segments/dem_removal_flag',
'gt2l/land_segments/h_dif_ref',
'gt2l/land_segments/last_seg_extend',
'gt2l/land_segments/latitude',
'gt2l/land_segments/latitude_20m',
'gt2l/land_segments/layer_flag',
'gt2l/land_segments/longitude',
'gt2l/land_segments/longitude_20m',
'gt2l/land_segments/msw_flag',
'gt2l/land_segments/n_seg_ph',
'gt2l/land_segments/night_flag',
'gt2l/land_segments/ph_ndx_beg',
'gt2l/land_segments/ph_removal_flag',
'gt2l/land_segments/psf_flag',
'gt2l/land_segments/rgt',
'gt2l/land_segments/sat_flag',
'gt2l/land_segments/segment_id_beg',
'gt2l/land_segments/segment_id_end',
'gt2l/land_segments/segment_landcover',
'gt2l/land_segments/segment_snowcover',
'gt2l/land_segments/segment_watermask',
'gt2l/land_segments/sigma_across',
'gt2l/land_segments/sigma_along',
'gt2l/land_segments/sigma_atlas_land',
'gt2l/land_segments/sigma_h',
'gt2l/land_segments/sigma_topo',
'gt2l/land_segments/snr',
'gt2l/land_segments/solar_azimuth',
'gt2l/land_segments/solar_elevation',
'gt2l/land_segments/surf_type',
'gt2l/land_segments/terrain/h_te_best_fit',
'gt2l/land_segments/terrain/h_te_best_fit_20m',
'gt2l/land_segments/terrain/h_te_interp',
'gt2l/land_segments/terrain/h_te_max',
'gt2l/land_segments/terrain/h_te_mean',
'gt2l/land_segments/terrain/h_te_median',
'gt2l/land_segments/terrain/h_te_min',
'gt2l/land_segments/terrain/h_te_mode',
'gt2l/land_segments/terrain/h_te_rh25',
'gt2l/land_segments/terrain/h_te_skew',
'gt2l/land_segments/terrain/h_te_std',
'gt2l/land_segments/terrain/h_te_uncertainty',
'gt2l/land_segments/terrain/n_te_photons',
'gt2l/land_segments/terrain/photon_rate_te',
'gt2l/land_segments/terrain/subset_te_flag',
'gt2l/land_segments/terrain/terrain_slope',
'gt2l/land_segments/terrain_flg',
'gt2l/land_segments/urban_flag',
'gt2l/signal_photons/classed_pc_flag',
'gt2l/signal_photons/classed_pc_indx',
'gt2l/signal_photons/d_flag',
'gt2l/signal_photons/delta_time',
'gt2l/signal_photons/ph_h',
'gt2l/signal_photons/ph_segment_id',
'gt2r/land_segments/asr',
'gt2r/land_segments/atlas_pa',
'gt2r/land_segments/beam_azimuth',
'gt2r/land_segments/beam_coelev',
'gt2r/land_segments/brightness_flag',
'gt2r/land_segments/canopy/can_noise',
'gt2r/land_segments/canopy/canopy_h_metrics',
'gt2r/land_segments/canopy/canopy_h_metrics_abs',
'gt2r/land_segments/canopy/canopy_openness',
'gt2r/land_segments/canopy/canopy_rh_conf',
'gt2r/land_segments/canopy/centroid_height',
'gt2r/land_segments/canopy/h_canopy',
'gt2r/land_segments/canopy/h_canopy_20m',
'gt2r/land_segments/canopy/h_canopy_abs',
'gt2r/land_segments/canopy/h_canopy_quad',
'gt2r/land_segments/canopy/h_canopy_uncertainty',
'gt2r/land_segments/canopy/h_dif_canopy',
'gt2r/land_segments/canopy/h_max_canopy',
'gt2r/land_segments/canopy/h_max_canopy_abs',
'gt2r/land_segments/canopy/h_mean_canopy',
'gt2r/land_segments/canopy/h_mean_canopy_abs',
'gt2r/land_segments/canopy/h_median_canopy',
'gt2r/land_segments/canopy/h_median_canopy_abs',
'gt2r/land_segments/canopy/h_min_canopy',
'gt2r/land_segments/canopy/h_min_canopy_abs',
'gt2r/land_segments/canopy/n_ca_photons',
'gt2r/land_segments/canopy/n_toc_photons',
'gt2r/land_segments/canopy/photon_rate_can',
'gt2r/land_segments/canopy/photon_rate_can_nr',
'gt2r/land_segments/canopy/segment_cover',
'gt2r/land_segments/canopy/subset_can_flag',
'gt2r/land_segments/canopy/toc_roughness',
'gt2r/land_segments/cloud_flag_atm',
'gt2r/land_segments/cloud_fold_flag',
'gt2r/land_segments/delta_time',
'gt2r/land_segments/delta_time_beg',
'gt2r/land_segments/delta_time_end',
'gt2r/land_segments/dem_flag',
'gt2r/land_segments/dem_h',
'gt2r/land_segments/dem_removal_flag',
'gt2r/land_segments/h_dif_ref',
'gt2r/land_segments/last_seg_extend',
'gt2r/land_segments/latitude',
'gt2r/land_segments/latitude_20m',
'gt2r/land_segments/layer_flag',
'gt2r/land_segments/longitude',
'gt2r/land_segments/longitude_20m',
'gt2r/land_segments/msw_flag',
'gt2r/land_segments/n_seg_ph',
'gt2r/land_segments/night_flag',
'gt2r/land_segments/ph_ndx_beg',
'gt2r/land_segments/ph_removal_flag',
'gt2r/land_segments/psf_flag',
'gt2r/land_segments/rgt',
'gt2r/land_segments/sat_flag',
'gt2r/land_segments/segment_id_beg',
'gt2r/land_segments/segment_id_end',
'gt2r/land_segments/segment_landcover',
'gt2r/land_segments/segment_snowcover',
'gt2r/land_segments/segment_watermask',
'gt2r/land_segments/sigma_across',
'gt2r/land_segments/sigma_along',
'gt2r/land_segments/sigma_atlas_land',
'gt2r/land_segments/sigma_h',
'gt2r/land_segments/sigma_topo',
'gt2r/land_segments/snr',
'gt2r/land_segments/solar_azimuth',
'gt2r/land_segments/solar_elevation',
'gt2r/land_segments/surf_type',
'gt2r/land_segments/terrain/h_te_best_fit',
'gt2r/land_segments/terrain/h_te_best_fit_20m',
'gt2r/land_segments/terrain/h_te_interp',
'gt2r/land_segments/terrain/h_te_max',
'gt2r/land_segments/terrain/h_te_mean',
'gt2r/land_segments/terrain/h_te_median',
'gt2r/land_segments/terrain/h_te_min',
'gt2r/land_segments/terrain/h_te_mode',
'gt2r/land_segments/terrain/h_te_rh25',
'gt2r/land_segments/terrain/h_te_skew',
'gt2r/land_segments/terrain/h_te_std',
'gt2r/land_segments/terrain/h_te_uncertainty',
'gt2r/land_segments/terrain/n_te_photons',
'gt2r/land_segments/terrain/photon_rate_te',
'gt2r/land_segments/terrain/subset_te_flag',
'gt2r/land_segments/terrain/terrain_slope',
'gt2r/land_segments/terrain_flg',
'gt2r/land_segments/urban_flag',
'gt2r/signal_photons/classed_pc_flag',
'gt2r/signal_photons/classed_pc_indx',
'gt2r/signal_photons/d_flag',
'gt2r/signal_photons/delta_time',
'gt2r/signal_photons/ph_h',
'gt2r/signal_photons/ph_segment_id',
'gt3l/land_segments/asr',
'gt3l/land_segments/atlas_pa',
'gt3l/land_segments/beam_azimuth',
'gt3l/land_segments/beam_coelev',
'gt3l/land_segments/brightness_flag',
'gt3l/land_segments/canopy/can_noise',
'gt3l/land_segments/canopy/canopy_h_metrics',
'gt3l/land_segments/canopy/canopy_h_metrics_abs',
'gt3l/land_segments/canopy/canopy_openness',
'gt3l/land_segments/canopy/canopy_rh_conf',
'gt3l/land_segments/canopy/centroid_height',
'gt3l/land_segments/canopy/h_canopy',
'gt3l/land_segments/canopy/h_canopy_20m',
'gt3l/land_segments/canopy/h_canopy_abs',
'gt3l/land_segments/canopy/h_canopy_quad',
'gt3l/land_segments/canopy/h_canopy_uncertainty',
'gt3l/land_segments/canopy/h_dif_canopy',
'gt3l/land_segments/canopy/h_max_canopy',
'gt3l/land_segments/canopy/h_max_canopy_abs',
'gt3l/land_segments/canopy/h_mean_canopy',
'gt3l/land_segments/canopy/h_mean_canopy_abs',
'gt3l/land_segments/canopy/h_median_canopy',
'gt3l/land_segments/canopy/h_median_canopy_abs',
'gt3l/land_segments/canopy/h_min_canopy',
'gt3l/land_segments/canopy/h_min_canopy_abs',
'gt3l/land_segments/canopy/n_ca_photons',
'gt3l/land_segments/canopy/n_toc_photons',
'gt3l/land_segments/canopy/photon_rate_can',
'gt3l/land_segments/canopy/photon_rate_can_nr',
'gt3l/land_segments/canopy/segment_cover',
'gt3l/land_segments/canopy/subset_can_flag',
'gt3l/land_segments/canopy/toc_roughness',
'gt3l/land_segments/cloud_flag_atm',
'gt3l/land_segments/cloud_fold_flag',
'gt3l/land_segments/delta_time',
'gt3l/land_segments/delta_time_beg',
'gt3l/land_segments/delta_time_end',
'gt3l/land_segments/dem_flag',
'gt3l/land_segments/dem_h',
'gt3l/land_segments/dem_removal_flag',
'gt3l/land_segments/h_dif_ref',
'gt3l/land_segments/last_seg_extend',
'gt3l/land_segments/latitude',
'gt3l/land_segments/latitude_20m',
'gt3l/land_segments/layer_flag',
'gt3l/land_segments/longitude',
'gt3l/land_segments/longitude_20m',
'gt3l/land_segments/msw_flag',
'gt3l/land_segments/n_seg_ph',
'gt3l/land_segments/night_flag',
'gt3l/land_segments/ph_ndx_beg',
'gt3l/land_segments/ph_removal_flag',
'gt3l/land_segments/psf_flag',
'gt3l/land_segments/rgt',
'gt3l/land_segments/sat_flag',
'gt3l/land_segments/segment_id_beg',
'gt3l/land_segments/segment_id_end',
'gt3l/land_segments/segment_landcover',
'gt3l/land_segments/segment_snowcover',
'gt3l/land_segments/segment_watermask',
'gt3l/land_segments/sigma_across',
'gt3l/land_segments/sigma_along',
'gt3l/land_segments/sigma_atlas_land',
'gt3l/land_segments/sigma_h',
'gt3l/land_segments/sigma_topo',
'gt3l/land_segments/snr',
'gt3l/land_segments/solar_azimuth',
'gt3l/land_segments/solar_elevation',
'gt3l/land_segments/surf_type',
'gt3l/land_segments/terrain/h_te_best_fit',
'gt3l/land_segments/terrain/h_te_best_fit_20m',
'gt3l/land_segments/terrain/h_te_interp',
'gt3l/land_segments/terrain/h_te_max',
'gt3l/land_segments/terrain/h_te_mean',
'gt3l/land_segments/terrain/h_te_median',
'gt3l/land_segments/terrain/h_te_min',
'gt3l/land_segments/terrain/h_te_mode',
'gt3l/land_segments/terrain/h_te_rh25',
'gt3l/land_segments/terrain/h_te_skew',
'gt3l/land_segments/terrain/h_te_std',
'gt3l/land_segments/terrain/h_te_uncertainty',
'gt3l/land_segments/terrain/n_te_photons',
'gt3l/land_segments/terrain/photon_rate_te',
'gt3l/land_segments/terrain/subset_te_flag',
'gt3l/land_segments/terrain/terrain_slope',
'gt3l/land_segments/terrain_flg',
'gt3l/land_segments/urban_flag',
'gt3l/signal_photons/classed_pc_flag',
'gt3l/signal_photons/classed_pc_indx',
'gt3l/signal_photons/d_flag',
'gt3l/signal_photons/delta_time',
'gt3l/signal_photons/ph_h',
'gt3l/signal_photons/ph_segment_id',
'gt3r/land_segments/asr',
'gt3r/land_segments/atlas_pa',
'gt3r/land_segments/beam_azimuth',
'gt3r/land_segments/beam_coelev',
'gt3r/land_segments/brightness_flag',
'gt3r/land_segments/canopy/can_noise',
'gt3r/land_segments/canopy/canopy_h_metrics',
'gt3r/land_segments/canopy/canopy_h_metrics_abs',
'gt3r/land_segments/canopy/canopy_openness',
'gt3r/land_segments/canopy/canopy_rh_conf',
'gt3r/land_segments/canopy/centroid_height',
'gt3r/land_segments/canopy/h_canopy',
'gt3r/land_segments/canopy/h_canopy_20m',
'gt3r/land_segments/canopy/h_canopy_abs',
'gt3r/land_segments/canopy/h_canopy_quad',
'gt3r/land_segments/canopy/h_canopy_uncertainty',
'gt3r/land_segments/canopy/h_dif_canopy',
'gt3r/land_segments/canopy/h_max_canopy',
'gt3r/land_segments/canopy/h_max_canopy_abs',
'gt3r/land_segments/canopy/h_mean_canopy',
'gt3r/land_segments/canopy/h_mean_canopy_abs',
'gt3r/land_segments/canopy/h_median_canopy',
'gt3r/land_segments/canopy/h_median_canopy_abs',
'gt3r/land_segments/canopy/h_min_canopy',
'gt3r/land_segments/canopy/h_min_canopy_abs',
'gt3r/land_segments/canopy/n_ca_photons',
'gt3r/land_segments/canopy/n_toc_photons',
'gt3r/land_segments/canopy/photon_rate_can',
'gt3r/land_segments/canopy/photon_rate_can_nr',
'gt3r/land_segments/canopy/segment_cover',
'gt3r/land_segments/canopy/subset_can_flag',
'gt3r/land_segments/canopy/toc_roughness',
'gt3r/land_segments/cloud_flag_atm',
'gt3r/land_segments/cloud_fold_flag',
'gt3r/land_segments/delta_time',
'gt3r/land_segments/delta_time_beg',
'gt3r/land_segments/delta_time_end',
'gt3r/land_segments/dem_flag',
'gt3r/land_segments/dem_h',
'gt3r/land_segments/dem_removal_flag',
'gt3r/land_segments/h_dif_ref',
'gt3r/land_segments/last_seg_extend',
'gt3r/land_segments/latitude',
'gt3r/land_segments/latitude_20m',
'gt3r/land_segments/layer_flag',
'gt3r/land_segments/longitude',
'gt3r/land_segments/longitude_20m',
'gt3r/land_segments/msw_flag',
'gt3r/land_segments/n_seg_ph',
'gt3r/land_segments/night_flag',
'gt3r/land_segments/ph_ndx_beg',
'gt3r/land_segments/ph_removal_flag',
'gt3r/land_segments/psf_flag',
'gt3r/land_segments/rgt',
'gt3r/land_segments/sat_flag',
'gt3r/land_segments/segment_id_beg',
'gt3r/land_segments/segment_id_end',
'gt3r/land_segments/segment_landcover',
'gt3r/land_segments/segment_snowcover',
'gt3r/land_segments/segment_watermask',
'gt3r/land_segments/sigma_across',
'gt3r/land_segments/sigma_along',
'gt3r/land_segments/sigma_atlas_land',
'gt3r/land_segments/sigma_h',
'gt3r/land_segments/sigma_topo',
'gt3r/land_segments/snr',
'gt3r/land_segments/solar_azimuth',
'gt3r/land_segments/solar_elevation',
'gt3r/land_segments/surf_type',
'gt3r/land_segments/terrain/h_te_best_fit',
'gt3r/land_segments/terrain/h_te_best_fit_20m',
'gt3r/land_segments/terrain/h_te_interp',
'gt3r/land_segments/terrain/h_te_max',
'gt3r/land_segments/terrain/h_te_mean',
'gt3r/land_segments/terrain/h_te_median',
'gt3r/land_segments/terrain/h_te_min',
'gt3r/land_segments/terrain/h_te_mode',
'gt3r/land_segments/terrain/h_te_rh25',
'gt3r/land_segments/terrain/h_te_skew',
'gt3r/land_segments/terrain/h_te_std',
'gt3r/land_segments/terrain/h_te_uncertainty',
'gt3r/land_segments/terrain/n_te_photons',
'gt3r/land_segments/terrain/photon_rate_te',
'gt3r/land_segments/terrain/subset_te_flag',
'gt3r/land_segments/terrain/terrain_slope',
'gt3r/land_segments/terrain_flg',
'gt3r/land_segments/urban_flag',
'gt3r/signal_photons/classed_pc_flag',
'gt3r/signal_photons/classed_pc_indx',
'gt3r/signal_photons/d_flag',
'gt3r/signal_photons/delta_time',
'gt3r/signal_photons/ph_h',
'gt3r/signal_photons/ph_segment_id',
'orbit_info/bounding_polygon_lat1',
'orbit_info/bounding_polygon_lon1',
'orbit_info/crossing_time',
'orbit_info/cycle_number',
'orbit_info/lan',
'orbit_info/orbit_number',
'orbit_info/rgt',
'orbit_info/sc_orient',
'orbit_info/sc_orient_time',
'quality_assessment/qa_granule_fail_reason',
'quality_assessment/qa_granule_pass_fail']
That's a lot of variables!
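A long list of variable paths like the one above is easier to scan once it is grouped by its top-level HDF5 group or filtered by variable name. The sketch below does both with plain Python on a handful of paths copied from that output; the helper names `group_by_top_level` and `find_variable` are hypothetical and are not icepyx functions.

```python
from collections import defaultdict

# A few paths copied from the variable listing above
paths = [
    "ancillary_data/release",
    "gt1l/land_segments/canopy/h_canopy",
    "gt1l/land_segments/latitude",
    "gt1l/land_segments/terrain/h_te_mean",
    "gt2r/land_segments/canopy/h_canopy",
    "orbit_info/rgt",
]


def group_by_top_level(var_paths):
    """Map each top-level group (e.g. a beam like 'gt1l') to its sub-paths."""
    groups = defaultdict(list)
    for path in var_paths:
        top, _, rest = path.partition("/")
        groups[top].append(rest)
    return dict(groups)


def find_variable(var_paths, name):
    """Return every full path whose final component equals `name`."""
    return [p for p in var_paths if p.rsplit("/", 1)[-1] == name]


grouped = group_by_top_level(paths)
h_canopy_paths = find_variable(paths, "h_canopy")
print(sorted(grouped))
print(h_canopy_paths)
```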
One key feature of icepyx is the ability to browse the variables available in the dataset. There are typically hundreds of variables in a single dataset, so that is a lot to sort through! Let’s take a moment to get oriented to the organization of ATL08 variables by first reviewing a few important pieces of the algorithm.
To create higher-level variables like canopy or terrain height, the ATL08 algorithm goes through a series of steps:
Separate signal photons from noise photons
Classify each of the signal photons as either terrain, canopy, or canopy top
Remove the ground elevation, so that the heights are relative to the ground
Group the signal photons into 100 m segments. If there are a sufficient number of photons in that group, calculate statistics for terrain and canopy (e.g., mean height, max height, standard deviation)
Fig. 4. An example of the classified photons produced from the ATL08 algorithm. Ground photons (red dots) are labeled as all photons falling within a point spread function distance of the estimated ground surface. The top of canopy photons (green dots) are photons that fall within a buffer distance from the upper canopy surface, and the photons that lie between the top of canopy surface and ground surface are labeled as canopy photons (blue dots). (Neuenschwander & Pitts, 2019)
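To make the last step of the list above concrete, here is a toy sketch of grouping already-classified photons into 100 m along-track segments and computing canopy statistics per segment. It is not the ATL08 algorithm: the photon values are invented, the minimum-photon threshold `MIN_PHOTONS` is a hypothetical number chosen for the example, and only the mean and maximum are computed.

```python
from collections import defaultdict
from statistics import mean

SEGMENT_LENGTH_M = 100.0
MIN_PHOTONS = 3  # hypothetical threshold, chosen only for this example

# Invented photons: (along-track distance in m, height above ground in m, label)
photons = [
    (5.0, 0.1, "terrain"), (20.0, 8.0, "canopy"), (40.0, 12.0, "canopy_top"),
    (60.0, 6.0, "canopy"), (80.0, -0.1, "terrain"),
    (110.0, 0.0, "terrain"), (150.0, 4.0, "canopy"),
    (210.0, 10.0, "canopy"), (230.0, 14.0, "canopy"),
    (250.0, 18.0, "canopy_top"), (290.0, 0.2, "terrain"),
]


def canopy_stats_by_segment(photons, seg_len, min_photons):
    """Group canopy photons into fixed-length segments and summarize each one."""
    by_segment = defaultdict(list)
    for distance, height, label in photons:
        if label in ("canopy", "canopy_top"):
            by_segment[int(distance // seg_len)].append(height)

    stats = {}
    for seg, heights in sorted(by_segment.items()):
        if len(heights) < min_photons:
            continue  # too few photons: no statistics for this segment
        stats[seg] = {"n": len(heights), "mean": mean(heights), "max": max(heights)}
    return stats


segment_stats = canopy_stats_by_segment(photons, SEGMENT_LENGTH_M, MIN_PHOTONS)
print(segment_stats)
```

The middle segment (100 to 200 m) holds a single canopy photon, so it is skipped, which mirrors the idea that statistics are only reported when a segment has enough photons.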
Load your variables#
reader.vars.append(var_list=['h_canopy', 'latitude', 'longitude'])
ds = reader.load()
ds
<xarray.Dataset> Size: 171kB
Dimensions:              (gran_idx: 1, photon_idx: 1852, spot: 6)
Coordinates:
  * gran_idx             (gran_idx) float64 8B 1.156e+05
  * photon_idx           (photon_idx) int64 15kB 0 1 2 3 ... 1848 1849 1850 1851
  * spot                 (spot) uint8 6B 1 2 3 4 5 6
    source_file          (gran_idx) <U70 280B '/tmp/grandmesa_ATL08/processed...
    delta_time           (photon_idx) datetime64[ns] 15kB 2019-12-11T14:40:39...
Data variables:
    sc_orient            (gran_idx) int8 1B 1
    cycle_number         (gran_idx) int8 1B 5
    rgt                  (gran_idx, spot, photon_idx) float32 44kB nan ... nan
    atlas_sdp_gps_epoch  (gran_idx) datetime64[ns] 8B 2018-01-01T00:00:18
    data_start_utc       (gran_idx) datetime64[ns] 8B 2019-12-11T14:35:19.988979
    data_end_utc         (gran_idx) datetime64[ns] 8B 2019-12-11T14:43:50.730291
    latitude             (spot, gran_idx, photon_idx) float32 44kB nan ... nan
    longitude            (spot, gran_idx, photon_idx) float32 44kB nan ... nan
    gt                   (gran_idx, spot) object 48B 'gt3r' 'gt3l' ... 'gt1l'
    h_canopy             (photon_idx) float32 7kB 6.826 8.899 ... 15.78 31.38
Attributes:
    data_product:  ATL08
    Description:   Contains data categorized as land at 100 meter intervals.
    data_rate:     Data are stored as aggregates of 100 meters.
ds.plot.scatter(x="longitude", y="latitude", hue="h_canopy")
<matplotlib.collections.PathCollection at 0x7f1eb1061e50>
Example 3: Reading a granule with h5py?#
import h5py
import numpy as np
f = h5py.File("/tmp/grandmesa_ATL08/processed_ATL08_20191211143520_11560506_006_01.h5", mode='r')
f["/"].keys()
<KeysViewHDF5 ['METADATA', 'ancillary_data', 'ds_geosegments', 'ds_metrics', 'ds_surf_type', 'gt1l', 'gt1r', 'gt2l', 'gt2r', 'gt3l', 'gt3r', 'orbit_info', 'quality_assessment']>
h_canopy = np.array(f["/gt1l/land_segments/canopy/h_canopy"])
h_canopy
array([6.82592773e+00, 8.89868164e+00, 5.43261719e+00, 4.63061523e+00,
4.54296875e+00, 5.24804688e+00, 4.41186523e+00, 3.89916992e+00,
9.48120117e+00, 1.07646484e+01, 2.37392578e+01, 1.00693359e+01,
3.86791992e+00, 1.35986328e+00, 6.63916016e+00, 6.70996094e+00,
6.27221680e+00, 3.40282347e+38, 5.68359375e+00, 3.40282347e+38,
1.24887695e+01, 8.06347656e+00, 7.88720703e+00, 3.40747070e+00,
3.18945312e+00, 2.81152344e+00, 9.56201172e+00, 1.50603027e+01,
7.54663086e+00, 1.22834473e+01, 9.54809570e+00, 4.61621094e+00,
3.53637695e+00, 3.76904297e+00, 7.18701172e+00, 1.60043945e+01,
1.96469727e+01, 3.40282347e+38, 2.13913574e+01, 1.94482422e+01,
1.48708496e+01, 1.69958496e+01, 1.52304688e+01, 1.79970703e+01,
1.14118652e+01, 9.28051758e+00, 7.92846680e+00, 1.42802734e+01,
1.34736328e+01, 3.66406250e+00, 3.72558594e+00, 1.34113770e+01,
1.81997070e+01, 6.85571289e+00, 7.87841797e+00, 2.78906250e+00,
3.04589844e+00, 5.70214844e+00, 2.24389648e+00, 4.83789062e+00,
8.34985352e+00, 1.06958008e+01, 2.31843262e+01, 2.94582520e+01,
2.58112793e+01, 2.17145996e+01, 1.85002441e+01, 1.64414062e+01,
1.54970703e+01, 1.10722656e+01, 1.29038086e+01, 3.40282347e+38,
1.31000977e+01, 1.50708008e+01, 1.09299316e+01, 2.11633301e+01,
1.93339844e+01, 1.94951172e+01, 2.04709473e+01, 2.24191895e+01,
2.33315430e+01, 1.86020508e+01, 1.41843262e+01, 1.59016113e+01,
6.48730469e+00, 3.40282347e+38, 3.37695312e+00, 2.93798828e+00,
3.09497070e+00, 2.54467773e+00, 3.11889648e+00, 3.60913086e+00,
5.75415039e+00, 3.56518555e+00, 1.16108398e+01, 9.29272461e+00,
2.53686523e+00, 6.21337891e+00, 4.37084961e+00, 1.81105957e+01,
1.57587891e+01, 8.64038086e+00, 2.61328125e+00, 4.00317383e+00,
6.99169922e+00, 2.76660156e+00, 4.61401367e+00, 4.38916016e+00,
2.79565430e+00, 3.03881836e+00, 3.55322266e+00, 5.37475586e+00,
3.75170898e+00, 1.90629883e+01, 1.36696777e+01, 1.68891602e+01,
1.16149902e+01, 3.40282347e+38, 2.11640625e+01, 2.49699707e+01,
2.40285645e+01, 3.40282347e+38, 3.40282347e+38, 2.71843262e+01,
2.46640625e+01, 2.46933594e+01, 2.64416504e+01, 2.84414062e+01,
1.87243652e+01, 1.20036621e+01, 1.25017090e+01, 9.19995117e+00,
4.33105469e+00, 5.97875977e+00, 1.38188477e+01, 5.21801758e+00,
6.54418945e+00, 6.48632812e+00, 4.31347656e+00, 3.22729492e+00,
5.57275391e+00, 1.82556152e+01, 1.83542480e+01, 7.46899414e+00,
2.15588379e+01, 2.01042480e+01, 1.99946289e+01, 1.31286621e+01,
7.18554688e+00, 5.58422852e+00, 2.84724121e+01, 1.86489258e+01,
2.08405762e+01, 8.15869141e+00, 6.35327148e+00, 3.90673828e+00,
5.31713867e+00, 8.25903320e+00, 8.90551758e+00, 2.17082520e+01,
2.61423340e+01, 2.84587402e+01, 2.55974121e+01, 2.72182617e+01,
2.36420898e+01, 4.55541992e+00, 3.40282347e+38, 9.37011719e+00,
1.40803223e+01, 1.57014160e+01, 1.59594727e+01, 3.40282347e+38,
2.20192871e+01, 6.75805664e+00, 4.81054688e+00, 3.48828125e+00,
3.68432617e+00, 3.29418945e+00, 4.44946289e+00, 2.75512695e+00,
9.43627930e+00, 1.07866211e+01, 4.53369141e+00, 3.40282347e+38,
1.00212402e+01, 1.74226074e+01, 1.19001465e+01, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 1.67258301e+01, 3.40282347e+38,
1.42563477e+01, 1.44150391e+01, 1.20014648e+01, 3.40282347e+38,
9.43774414e+00, 9.73803711e+00, 3.40282347e+38, 1.24138184e+01,
1.20412598e+01, 3.40282347e+38, 1.15126953e+01, 6.58203125e+00,
8.82128906e+00, 3.22558594e+00, 1.13186035e+01, 6.20532227e+00,
2.64965820e+00, 9.68017578e+00, 7.39550781e+00, 1.27314453e+01,
1.74250488e+01, 1.58566895e+01, 2.07294922e+01, 8.82788086e+00,
8.45214844e+00, 1.06098633e+01, 5.94067383e+00, 7.09985352e+00,
2.08530273e+01, 1.72912598e+01, 1.64267578e+01, 1.67080078e+01,
1.84438477e+01, 2.04689941e+01, 2.22739258e+01, 5.44067383e+00,
3.40282347e+38, 3.40282347e+38, 3.79223633e+00, 7.29321289e+00,
4.28979492e+00, 9.01245117e+00, 6.86083984e+00, 9.46655273e+00,
5.01586914e+00, 9.97583008e+00, 5.12207031e+00, 5.49365234e+00,
8.35400391e+00, 3.40282347e+38, 3.40282347e+38, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 3.40282347e+38, 3.40282347e+38,
2.58269043e+01, 3.40282347e+38, 3.40282347e+38, 2.18022461e+01,
5.05737305e+00, 3.40282347e+38, 2.26381836e+01, 2.05205078e+01,
2.48500977e+01, 1.47624512e+01, 2.56430664e+01, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 3.40282347e+38, 3.40282347e+38,
3.40282347e+38, 2.75573730e+01, 3.40282347e+38, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 3.40282347e+38, 2.09458008e+01,
1.48725586e+01, 1.52705078e+01, 1.58483887e+01, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 3.40282347e+38, 3.40282347e+38,
5.34301758e+00, 1.09184570e+01, 3.40282347e+38, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 1.49362793e+01, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 3.40282347e+38, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 3.40282347e+38, 3.40282347e+38,
1.65438232e+01, 1.66979980e+01, 3.40282347e+38, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 3.40282347e+38, 3.40282347e+38,
3.40282347e+38, 3.40282347e+38, 3.40282347e+38, 3.40282347e+38],
dtype=float32)
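The repeated value 3.40282347e+38 in this output is the largest number a float32 can hold, and it appears here to be acting as a fill value for segments with no valid canopy height. Unlike the icepyx reader, h5py returns the raw numbers, so the fill values need to be masked before computing statistics. Below is a minimal sketch of one way to do that, using a few values copied from the output; treating the float32 maximum as the fill value is an assumption you should confirm against the variable's `_FillValue` attribute in the file.

```python
import numpy as np

# Assumption: missing values are stored as the largest float32,
# which prints as 3.40282347e+38 in the h5py output.
FILL_VALUE = np.finfo(np.float32).max

# A few values copied from the h_canopy output.
h_canopy_sample = np.array(
    [6.27221680e+00, 3.40282347e+38, 5.68359375e+00, 3.40282347e+38],
    dtype=np.float32,
)

# Replace fill values with NaN so that NaN-aware statistics skip them.
h_canopy_clean = np.where(h_canopy_sample == FILL_VALUE, np.nan, h_canopy_sample)

n_valid = int(np.count_nonzero(~np.isnan(h_canopy_clean)))
mean_valid = float(np.nanmean(h_canopy_clean))
print(n_valid, mean_valid)
```

Without the mask, a plain `np.mean` would be dominated by the fill values and return a meaningless number.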
Part 2: SlideRule#
SlideRule is a collaborative effort between NASA Goddard Space Flight Center (GSFC) and the University of Washington, funded by the ICESat-2 program. It provides an on-demand science data processing service for ICESat-2 and GEDI data that runs on Amazon Web Services (AWS) and responds to REST-like API calls to process and return science results. This science-data-as-a-service model is a new way for researchers to work with and analyze data, giving them low-latency access to custom-generated, high-level data products.
SlideRule users provide specific parameters at the time of the request to compute products that fit their science needs. SlideRule then uses cloud-optimized versions of computational algorithms and a scalable cluster of EC2 instances to process data efficiently. All data is then returned to the user as a geopandas
GeoDataFrame.
For more information#
Website: https://slideruleearth.io
Documentation: https://slideruleearth.io/web/rtd/
GitHub: SlideRuleEarth/sliderule
Examples: SlideRuleEarth/sliderule-python
Contact: support@mail.slideruleearth.io
# To use the latest version of the sliderule client, run this cell.
# It will install the sliderule Python client into your current conda environment.
# You will then need to restart your kernel to have the changes take effect.
%pip install --quiet "sliderule>=4.6"
Note: you may need to restart the kernel to use updated packages.
Example 1: Just Get Me Some Data#
# (1) Import the client
from sliderule import sliderule, icesat2
# (2) Initialize the client
sliderule.init("slideruleearth.io");
# (3) Define an area of interest
region = sliderule.toregion("grandmesa.geojson");
# (4) Specify the processing parameters
parms = {
"poly": region["poly"],
"srt": icesat2.SRT_LAND,
"len": 20.0,
"res": 100.0
}
# (5) Make the processing request
gdf = icesat2.atl06p(parms)
Display the results#
gdf
region | h_sigma | rms_misfit | spot | pflags | rgt | y_atc | w_surface_window_final | gt | x_atc | h_mean | segment_id | dh_fit_dx | cycle | n_fit_photons | geometry | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
time | ||||||||||||||||
2018-10-16 10:49:21.763047168 | 6 | 0.059213 | 0.414242 | 3 | 0 | 272 | 41194.648438 | 3.146032e+00 | 40 | 15710061.0 | 1797.827692 | 784344 | 0.121020 | 1 | 49 | POINT (-108.09813 39.15732) |
2018-10-16 10:49:21.773934080 | 6 | 0.171685 | 0.679926 | 6 | 0 | 272 | 44567.667969 | 3.917953e+00 | 10 | 15712286.0 | 2205.110892 | 784455 | 0.151762 | 1 | 17 | POINT (-108.06191 39.13431) |
2018-10-16 10:49:21.893560064 | 6 | 0.530147 | 1.702927 | 6 | 0 | 272 | 44545.421875 | 1.265432e+01 | 10 | 15713088.0 | 2260.114661 | 784495 | 0.182269 | 1 | 11 | POINT (-108.0631 39.12714) |
2018-10-16 10:49:22.009937920 | 6 | 0.035914 | 0.287156 | 1 | 0 | 272 | 37942.792969 | 1.717312e+01 | 60 | 15711985.0 | 1813.438733 | 784440 | 0.059123 | 1 | 64 | POINT (-108.13779 39.14302) |
2018-10-16 10:49:22.116894976 | 6 | 0.000000 | 0.000000 | 6 | 1 | 272 | 44486.722656 | 3.000000e+01 | 10 | 15714591.0 | 2549.139896 | 784570 | -0.791383 | 1 | 23 | POINT (-108.06554 39.11371) |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2024-05-05 21:42:08.529739776 | 2 | 6.008234 | 308.324646 | 1 | 0 | 737 | -1859.211670 | 3.588987e+20 | 10 | 4352867.5 | 1948.284281 | 217054 | 2.667867 | 23 | 2689 | POINT (-108.09843 39.12156) |
2024-05-05 21:42:08.594035200 | 2 | 5.008744 | 204.956665 | 4 | 0 | 737 | 1433.140137 | 3.588987e+20 | 40 | 4355633.5 | 1780.562661 | 217192 | -0.860279 | 23 | 1675 | POINT (-108.13939 39.14353) |
2024-05-05 21:42:08.608098560 | 2 | 0.000000 | 0.000000 | 4 | 1 | 737 | 1433.046631 | 1.914062e+01 | 40 | 4355734.0 | 1777.159349 | 217197 | -0.750229 | 23 | 1698 | POINT (-108.1395 39.14443) |
2024-05-05 21:42:08.664648704 | 2 | 4.936912 | 201.764343 | 4 | 0 | 737 | 1432.968018 | 3.588987e+20 | 40 | 4356135.0 | 2167.969007 | 217217 | -1.484943 | 23 | 1671 | POINT (-108.13994 39.14802) |
2024-05-05 21:42:08.881896704 | 2 | 4.585921 | 200.818405 | 1 | 0 | 737 | -1858.465698 | 3.588987e+20 | 10 | 4355373.0 | 1866.818607 | 217179 | 1.150830 | 23 | 1918 | POINT (-108.1012 39.14401) |
15904 rows × 16 columns
Plot the results#
import matplotlib.pyplot as plt
region_lon = [e["lon"] for e in region["poly"]]
region_lat = [e["lat"] for e in region["poly"]]
f, ax = plt.subplots()
ax.set_title("ATL06-SR Points")
ax.set_aspect('equal')
gdf.plot(ax=ax, column='h_mean', cmap='inferno', s=0.1)
ax.plot(region_lon, region_lat, linewidth=1, color='g');
plt.show()
Explanation of what happened#
(1) Import the client#
from sliderule import sliderule, icesat2
The SlideRule Python client is broken up into different modules:
sliderule: core general functionality
icesat2: ICESat-2 on-demand, subsetting, raster sampling products
gedi: GEDI subsetting, and raster sampling products
h5: direct HDF5 data access
earthdata: CMR, CMR-STAC, TNM helper functions (use earthaccess instead)
io: reading and writing results to/from local files
ipysliderule: toolbox for building SlideRule interfaces in a Jupyter notebook
(2) Initialize the client#
sliderule.init("slideruleearth.io");
Configure the client settings:
url: address of sliderule service (default = “slideruleearth.io”)
verbose: display messages from server (default = False)
loglevel: criticality of log messages to display (default = logging.INFO)
organization: selection of cluster, used for private clusters (default = “sliderule”)
desired_nodes: number of nodes to run in a private cluster (default = None)
time_to_live: how long to deploy a private cluster (default = 60 minutes)
bypass_dns: query the provisioning system for IP address and don’t use DNS lookup hostname (default = False)
plugins: check if plugin is present (default = [])
trust_env: use netrc file for authentication (default = False)
log_handler: attach handler to client logging (default = None)
rethrow: immediately rethrow any caught exception inside of the client (default = None)
(3) Define an area of interest#
region = sliderule.toregion("grandmesa.geojson");
SlideRule uses an area of interest for determining which dataset resources to process and to then subset those resources to provide data only inside the area of interest. The sliderule.toregion
function converts multiple input types into a format understood by SlideRule. The supported input types are: GeoJSON, shapefile, GeoDataFrame, list of coordinates, and a dictionary of coordinates.
The resources (e.g. granules) to process can always be supplied in any of the processing APIs. But if they are not supplied (which is typical), then to determine which resources to process, the SlideRule server-side code uses the area of interest to make requests to NASA’s Common Metadata Repository (CMR) legacy and STAC interfaces, along with USGS’s The National Map interface. The server code automatically determines which interfaces should be queried and the parameters of the query needed for properly filtering results.
In rare cases where the area of interest is very complex (e.g. a group of islands, or a polygon with an extremely high vertex count), the user can ask the server to rasterize the area of interest and use it as a mask for determining which data to process. See https://slideruleearth.io/web/rtd/user_guide/SlideRule.html#geojson for more details.
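The plotting code in Example 1 reads `region["poly"]` as a list of points with `"lon"` and `"lat"` keys. To make that shape concrete, here is a minimal standard-library sketch that builds such a list from a GeoJSON polygon. The helper name `geojson_to_poly` and the coordinates are made up for illustration; this is not how `sliderule.toregion` is implemented, and the real function handles many more input types and options.

```python
import json

def geojson_to_poly(geojson_text):
    """Hypothetical helper: return the outer ring of the first feature's
    Polygon as a list of {"lon": ..., "lat": ...} dictionaries."""
    data = json.loads(geojson_text)
    geometry = data["features"][0]["geometry"]
    if geometry["type"] != "Polygon":
        raise ValueError("this sketch only handles a single Polygon")
    outer_ring = geometry["coordinates"][0]
    # GeoJSON positions are ordered [longitude, latitude, (optional altitude)].
    return [{"lon": point[0], "lat": point[1]} for point in outer_ring]

# Made-up rectangle for illustration only (not the contents of grandmesa.geojson).
example_geojson = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[
                [-108.3, 38.8], [-107.8, 38.8], [-107.8, 39.2],
                [-108.3, 39.2], [-108.3, 38.8],
            ]],
        },
    }],
})

poly = geojson_to_poly(example_geojson)
print(poly[0], len(poly))
```

Note that the ring is closed: the last point repeats the first, as GeoJSON requires.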
(4) Specify the processing parameters#
parms = {
"poly": region["poly"],
"srt": icesat2.SRT_LAND,
"len": 20.0,
"res": 100.0
}
There is a multitude of processing parameters that are available to each API. The ones used here are:
poly: area of interest
srt: surface reference type; if set to -1 (or icesat2.SRT_DYNAMIC), then all surface types are used
len: length of the extent (or variable-length segment) of along-track photon clouds to use in processing each posting
res: the step size between postings
See user’s guide for additional parameters: https://slideruleearth.io/web/rtd/index.html
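To build some intuition for how `len` and `res` interact, the sketch below lays out extents along a short stretch of track. It assumes, purely for illustration, that each extent starts at its posting and that postings begin at zero; the function name `extent_bounds` is hypothetical, and the actual server-side placement may differ.

```python
def extent_bounds(track_length_m, length_m, step_m):
    """Return (start, end) along-track bounds in meters for each posting.

    Assumption for illustration: extents start at 0 and each extent
    begins at its posting; only extents that fit entirely are kept.
    """
    bounds = []
    start = 0.0
    while start + length_m <= track_length_m:
        bounds.append((start, start + length_m))
        start += step_m
    return bounds

# len < res: extents are separated by gaps (as in Example 1).
sparse = extent_bounds(300.0, length_m=20.0, step_m=100.0)

# len > res: consecutive extents overlap and share photons.
overlapping = extent_bounds(300.0, length_m=40.0, step_m=20.0)

print(sparse)
print(overlapping[:2], len(overlapping))
```

With `len` smaller than `res`, some photons along the track are never used; with `len` larger than `res`, neighboring results are not independent because they share photons.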
(5) Make the processing request#
gdf = icesat2.atl06p(parms)
Under the hood, this makes an HTTP request to the SlideRule service running in AWS to run the ATL06 surface-finding algorithm on ATL03 photons and produce elevations, and then collects the results into a geopandas GeoDataFrame.
The different ICESat-2 APIs available are:
atl03sp: subset and filter ATL03 photons; provide custom YAPC and ATL08 classifications
atl03v: fast segment level subsetting of ATL03 photons
atl06s: subset the ATL06 land elevation product
atl06p: dynamically generate ATL06 surface elevation product
atl08p: dynamically generate the ATL08 vegetation density product (PhoREAL)
atl13p: subset the ATL13 coastal water product
Example 2: Sample GEDI Elevation Product at ICESat-2 Dynamically Generated Postings#
from sliderule import sliderule, icesat2, gedi
sliderule.init("slideruleearth.io", verbose=True);
Setting URL to slideruleearth.io
Login status to slideruleearth.io/sliderule: failure
parms = {
"poly": sliderule.toregion('grandmesa.geojson')['poly'],
"t0": '2019-11-14T00:00:00Z',
"t1": '2019-11-15T00:00:00Z',
"srt": icesat2.SRT_LAND,
"len": 100,
"res": 100,
"pass_invalid": False,
"atl08_class": ["atl08_ground", "atl08_canopy", "atl08_top_of_canopy"],
"atl08_fields": ["h_dif_ref"],
"phoreal": {"binsize": 1.0, "geoloc": "center", "use_abs_h": False, "send_waveform": False},
"samples": {"gedi": {"asset": "gedil3-elevation"}}
};
atl08 = icesat2.atl08p(parms)
request <AppServer.10153> retrieved 1 resources from CMR
proxy request <AppServer.10153> querying resources for gedi
proxy request <AppServer.10153> returned 0 resources for gedi
Starting proxy for atl08 to process 1 resource(s) with 1 thread(s)
request <AppServer.10171> processing initialized on ATL03_20191114034331_07370502_006_01.h5 ...
Successfully completed processing resource [1 out of 1]: ATL03_20191114034331_07370502_006_01.h5
atl08
gt | h_min_canopy | rgt | veg_ph_count | landcover | x_atc | h_mean_canopy | segment_id | h_te_median | canopy_h_metrics | ... | gnd_ph_count | h_max_canopy | cycle | canopy_openness | geometry | h_dif_ref | gedi.value | gedi.file_id | gedi.flags | gedi.time | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
time | |||||||||||||||||||||
2019-11-14 03:46:36.935118336 | 10 | 0.501465 | 737 | 27 | 30 | -6.882999e+10 | 1.156670 | 215507 | 1958.305542 | (1.479736328125, 1.479736328125, 1.47973632812... | ... | 75 | 2.261719 | 5 | 0.443189 | POINT (-108.12262 38.83912) | 0.963623 | 1777.066040 | 0 | 0 | 1.326586e+12 |
2019-11-14 03:46:36.949218304 | 10 | 0.515503 | 737 | 59 | 30 | -6.882593e+10 | 1.304013 | 215512 | 1964.416748 | (1.4913330078125, 1.4913330078125, 1.491333007... | ... | 54 | 3.137817 | 5 | 0.636816 | POINT (-108.12272 38.84002) | -3.216064 | 1925.270142 | 0 | 0 | 1.326586e+12 |
2019-11-14 03:46:36.963318272 | 10 | 0.515259 | 737 | 54 | 30 | -6.882186e+10 | 1.834195 | 215517 | 1976.178833 | (1.5015869140625, 1.5015869140625, 1.501586914... | ... | 49 | 4.442627 | 5 | 0.892432 | POINT (-108.12283 38.84092) | -5.043213 | 1925.270142 | 0 | 0 | 1.326586e+12 |
2019-11-14 03:46:36.977417984 | 10 | 0.804443 | 737 | 68 | 30 | -6.881773e+10 | 2.465483 | 215522 | 1991.423218 | (1.7940673828125, 1.7940673828125, 1.794067382... | ... | 35 | 6.167480 | 5 | 1.023235 | POINT (-108.12295 38.84182) | 2.426270 | 1925.270142 | 0 | 0 | 1.326586e+12 |
2019-11-14 03:46:36.980918016 | 30 | 0.508423 | 737 | 95 | 20 | -6.927786e+10 | 2.816030 | 215529 | 1822.414673 | (1.5048828125, 1.5048828125, 1.5048828125, 1.5... | ... | 19 | 7.769043 | 5 | 1.389679 | POINT (-108.08651 38.84583) | 5.912720 | 1779.992554 | 0 | 0 | 1.326586e+12 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2019-11-14 03:46:42.283918336 | 60 | 0.520142 | 737 | 38 | 40 | -6.851285e+10 | 2.928200 | 217296 | 1718.457764 | (1.51220703125, 1.51220703125, 1.51220703125, ... | ... | 310 | 7.713501 | 5 | 2.058075 | POINT (-108.08783 39.16624) | -0.977173 | 1781.542358 | 0 | 0 | 1.326586e+12 |
2019-11-14 03:46:42.293068288 | 20 | 0.504028 | 737 | 130 | 30 | -6.805046e+10 | 3.273977 | 217288 | 1786.887939 | (1.5029296875, 1.5029296875, 1.5029296875, 1.5... | ... | 192 | 8.225342 | 5 | 2.291607 | POINT (-108.16125 39.1593) | -3.731567 | 1797.069336 | 0 | 0 | 1.326586e+12 |
2019-11-14 03:46:42.293818368 | 40 | 0.000000 | 737 | 0 | 40 | -6.828026e+10 | NaN | 217294 | 1709.912354 | (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, ... | ... | 272 | 0.000000 | 5 | NaN | POINT (-108.12461 39.16313) | -0.086914 | 1720.139282 | 0 | 0 | 1.326586e+12 |
2019-11-14 03:46:42.295218176 | 60 | 0.548218 | 737 | 11 | 40 | -6.851079e+10 | 4.089300 | 217301 | 1716.896362 | (0.8890380859375, 0.8890380859375, 0.889038085... | ... | 191 | 10.701416 | 5 | 3.587460 | POINT (-108.08792 39.16696) | -2.204712 | 1781.542358 | 0 | 0 | 1.326586e+12 |
2019-11-14 03:46:42.304318208 | 20 | 0.511963 | 737 | 13 | 20 | -6.804797e+10 | 0.724375 | 217293 | 1784.938721 | (1.41064453125, 1.41064453125, 1.41064453125, ... | ... | 161 | 1.410645 | 5 | 0.226650 | POINT (-108.16134 39.16002) | -1.957886 | 1797.069336 | 0 | 0 | 1.326586e+12 |
2110 rows × 25 columns
Plot the results#
import matplotlib.pyplot as plt
import numpy as np
plt.figure(figsize=[8,6])
d0=np.min(atl08['x_atc'])
plt.plot(atl08['x_atc']-d0, atl08['h_te_median'], 'o', markersize=1, color='green', label='h_te_median')
plt.plot(atl08['x_atc']-d0, atl08['gedi.value'], 'o', markersize=1, color='gray', label='gedi elevation')
hl=plt.legend(loc=3, frameon=False, markerscale=5)
plt.gca().set_ylim([1500, 3500])
(1500.0, 3500.0)
plt.show()
Explanation of what’s new#
The on-demand ATL08 product (different from the ICESat-2 Standard Data Product) was generated and streamed back to the user. The ATL08 on-demand product uses the University of Texas at Austin’s PhoREAL algorithm, which was integrated into SlideRule to generate customizable vegetation metrics using ATL03 photon data.
A time range was specified in the request limiting the results to data collected only between the start and stop times supplied.
The "atl08_class" parameter specified that only photons in ATL03 that were classified as "atl08_ground", "atl08_canopy", or "atl08_top_of_canopy" in the ATL08 standard data product are to be supplied to the PhoREAL algorithm and used in the results.
The "atl08_fields" parameter specifies that the "h_dif_ref" variable from the ATL08 standard data product is to be associated with each result returned by SlideRule. SlideRule attempts to find the value of the variable closest in time to the dynamically generated result.
The "phoreal" parameter provides the processing parameters for the PhoREAL algorithm.
The "samples" parameter provides a list of raster datasets that SlideRule should sample at each generated result. So for each 100m segment that PhoREAL processes, the server-side code will also sample the gedil3-elevation product at the latitude and longitude of that segment and return the value with the results.
For a list of raster datasets that are available to sample in SlideRule, see: https://slideruleearth.io/web/rtd/user_guide/GeoRaster.html#asset-directory
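The `t0` and `t1` values in this request are UTC timestamps written as `YYYY-MM-DDTHH:MM:SSZ`. If you want to generate a time window programmatically rather than typing the strings, one possible approach with the standard library is sketched below. The helper name `time_window` is hypothetical, and the format string simply mirrors the strings used in the example request.

```python
from datetime import datetime, timedelta, timezone

def time_window(start, days=1):
    """Hypothetical helper: build t0/t1 strings in the same format as
    the example request, covering `days` days from `start` (UTC)."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    stop = start + timedelta(days=days)
    return {"t0": start.strftime(fmt), "t1": stop.strftime(fmt)}

window = time_window(datetime(2019, 11, 14, tzinfo=timezone.utc))
print(window)
```

The resulting dictionary could then be merged into the request parameters, for example with `parms.update(window)`.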
Example 3: Produce GeoParquet of Coastal Photons#
from sliderule import sliderule, icesat2
import geopandas as gpd
sliderule.init(verbose=True)
Setting URL to slideruleearth.io
Login status to slideruleearth.io/sliderule: failure
True
region = sliderule.toregion("bathy.geojson");
# ATL03 subsetting request parameters
parms = {
"poly": region['poly'],
"srt": icesat2.SRT_DYNAMIC,
"len": 100,
"res": 100,
"pass_invalid": True,
"output": {
"asset":"sliderule-stage",
"format": "parquet",
"as_geo": True,
"open_on_complete": False
}
}
atl03_url = icesat2.atl03sp(parms, resources=['ATL03_20230213042035_08341807_006_02.h5'])
Starting proxy for atl03s to process 1 resource(s) with 1 thread(s)
request <AppServer.10175> processing initialized on ATL03_20230213042035_08341807_006_02.h5 ...
request <AppServer.10175> processing of ATL03_20230213042035_08341807_006_02.h5 complete (366784/0/0)
Initiated upload of results to S3, bucket = sliderule-public, key = sliderule.00000015394A5FCC.geoparquet
Upload to S3 completed, bucket = sliderule-public, key = sliderule.00000015394A5FCC.geoparquet, size = 10133778
atl03_url
's3://sliderule-public/sliderule.00000015394A5FCC.geoparquet'
# Recent issues with pandas and geopandas have made direct reads temperamental
# atl03 = gpd.pd.read_parquet(atl03_url)
import boto3
atl03_url_tokens = atl03_url.split('/')
s3_client = boto3.client('s3')
s3_client.download_file(atl03_url_tokens[2], atl03_url_tokens[3], "/tmp/" + atl03_url_tokens[3])
atl03 = gpd.read_parquet("/tmp/" + atl03_url_tokens[3])
atl03.keys()
Index(['extent_id', 'x_atc', 'landcover', 'y_atc', 'atl03_cnf', 'atl08_class',
'snowcover', 'quality_ph', 'yapc_score', 'relief', 'height', 'cycle',
'pair', 'sc_orient', 'rgt', 'track', 'background_rate', 'segment_id',
'segment_dist', 'solar_elevation', 'region', 'geometry'],
dtype='object')
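The download step splits the returned URL on `/` and picks out the bucket and key by position, which works as long as the URL has exactly the form `s3://bucket/key`. A slightly more defensive alternative, sketched here with the standard library, also handles keys that contain slashes. The helper name `split_s3_url` is hypothetical.

```python
from urllib.parse import urlparse

def split_s3_url(url):
    """Hypothetical helper: split an s3:// URL into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {url}")
    # netloc is the bucket; the path (minus its leading slash) is the key.
    return parsed.netloc, parsed.path.lstrip("/")

bucket, key = split_s3_url(
    "s3://sliderule-public/sliderule.00000015394A5FCC.geoparquet"
)
print(bucket, key)
```

The two values can be passed to `s3_client.download_file` in place of the indexed tokens.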
Plot the results#
import matplotlib.pyplot as plt
import numpy as np
df = atl03
df = df[df["pair"] == icesat2.LEFT_PAIR]
df = df[df["track"] == 3]
plt.figure(figsize=[8,6])
plt.plot(df['x_atc']+df['segment_dist'], df['height'], 'o', markersize=1, color='blue')
[<matplotlib.lines.Line2D at 0x7f1e9f79b4d0>]
plt.show()
Part 3: H5Coro - The HDF5 Cloud-Optimized Read-Only Python Package#
h5coro
is a pure Python implementation of a subset of the HDF5 specification that has been optimized for reading data out of S3.
The project has its roots in SlideRule, where a new C++ implementation of the HDF5 specification was developed for performant read access to Earth science datasets stored in AWS S3. Over time, users of SlideRule began requesting the ability to performantly read HDF5 and NetCDF files out of S3 from their own Python scripts. The result is h5coro
: the re-implementation in Python of the core HDF5 reading logic that exists in SlideRule. Since then, h5coro
has become its own project, which will continue to grow and diverge in functionality from its parent implementation.
h5coro
is optimized for reading HDF5 data in high-latency high-throughput environments. It accomplishes this through a few key design decisions:
All reads are concurrent. Each dataset and/or attribute read by h5coro is performed in its own thread.
Intelligent range gets are used to read as many dataset chunks as possible in each read operation. This drastically reduces the number of HTTP requests to S3 and means there is no longer a need to re-chunk the data (it actually works better on smaller chunk sizes due to the granularity of the request).
Block caching is used to minimize the number of GET requests made to S3. S3 has a large first-byte latency (we’ve measured it at ~60ms on our systems), which means there is a large penalty for each read operation performed. h5coro performs all reads to S3 as large block reads and then maintains data in a local cache for access to smaller amounts of data within those blocks.
The system is serverless and does not depend on any external services to read the data. This means it scales naturally as the user application scales, and it reduces overall system complexity.
No metadata repository is needed. The structure of the file is cached as it is read, so that successive reads of other datasets in the same file do not have to re-read and re-build the directory structure of the file.
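To illustrate the block-caching idea in isolation, here is a toy cache that serves small reads out of large, aligned blocks and counts how many "requests" reach the backing store. This is only a sketch of the general technique, not h5coro's actual implementation; the class name `BlockCache`, the `fetch_range` callable, and the block size are all made up for the example, and an in-memory byte string stands in for S3.

```python
class BlockCache:
    """Toy block cache: every backing read fetches one whole aligned block."""

    def __init__(self, fetch_range, block_size):
        self.fetch_range = fetch_range  # callable(offset, size) -> bytes
        self.block_size = block_size
        self.blocks = {}                # block index -> cached bytes
        self.requests = 0               # number of backing reads issued

    def read(self, offset, size):
        """Return `size` bytes (size >= 1) starting at `offset`."""
        first = offset // self.block_size
        last = (offset + size - 1) // self.block_size
        chunks = []
        for index in range(first, last + 1):
            if index not in self.blocks:
                self.blocks[index] = self.fetch_range(
                    index * self.block_size, self.block_size
                )
                self.requests += 1
            chunks.append(self.blocks[index])
        data = b"".join(chunks)
        start = offset - first * self.block_size
        return data[start:start + size]

# An in-memory stand-in for a remote object (4096 bytes).
backing = bytes(range(256)) * 16

def fetch(offset, size):
    return backing[offset:offset + size]

cache = BlockCache(fetch, block_size=1024)
a = cache.read(10, 4)     # first read fetches block 0
b = cache.read(500, 8)    # served from the cached block 0
c = cache.read(1020, 8)   # spans blocks 0 and 1, so block 1 is fetched
print(cache.requests)
```

Three reads cost only two backing requests here. With a first-byte latency of tens of milliseconds per request, avoiding round trips this way matters more than the extra bytes transferred.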
For more information:#
GitHub: SlideRuleEarth/h5coro
# To use the latest version of h5coro, run this cell.
# It will install the h5coro Python package into your current conda environment.
# You will then need to restart your kernel to have the changes take effect.
%pip install --quiet "h5coro>=0.0.7"
Note: you may need to restart the kernel to use updated packages.
Example 1: Read ATL03 variables for bathymetry#
# (1) Import modules
from h5coro import h5coro, s3driver
import earthaccess
# (2) Authenticate to Earthdata Login
auth = earthaccess.login()
s3_creds = auth.get_s3_credentials(daac="NSIDC")
# (3) Initialize h5coro object
granule = "nsidc-cumulus-prod-protected/ATLAS/ATL03/006/2023/02/13/ATL03_20230213042035_08341807_006_02.h5"
h5obj = h5coro.H5Coro(granule, s3driver.S3Driver, errorChecking=True, verbose=False, credentials=s3_creds, multiProcess=False)
# (4) Read the data
variables = ["/gt3l/heights/h_ph", "/gt3l/heights/dist_ph_along", "/gt3l/geolocation/segment_dist_x", "/gt3l/geolocation/segment_ph_cnt"]
promise = h5obj.readDatasets(variables, block=True, enableAttributes=False)
for variable in promise:
    print(f'{variable}: {promise[variable][0:10]}')
gt3l/heights/h_ph: [-47.941536 -51.9231 -48.09843 -47.873924 -48.12945 -48.118694
-48.308052 -48.208042 -47.802708 -48.004234]
gt3l/heights/dist_ph_along: [0.7542868 0.76623714 1.4717534 2.187351 2.1880984 2.9048157
2.905563 3.621905 4.337497 5.0545855 ]
gt3l/geolocation/segment_dist_x: [17068770.48934802 17068790.54479094 17068810.60023396 17068830.65567708
17068850.7111203 17068870.76656362 17068890.82200704 17068910.87745056
17068930.93289417 17068950.98833789]
gt3l/geolocation/segment_ph_cnt: [37 25 44 39 22 40 37 42 35 38]
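These four variables come at two different rates: `h_ph` and `dist_ph_along` have one value per photon, while `segment_dist_x` and `segment_ph_cnt` have one value per geolocation segment. A common way to combine them is to repeat each segment's starting distance once per photon in that segment and then add the photon's distance along the segment. The sketch below shows the idea on a small made-up sample; the numbers are illustrative only and the variable semantics should be checked against the ATL03 data dictionary.

```python
import numpy as np

# Made-up sample: three segments containing 2, 1, and 3 photons.
segment_dist_x = np.array([17068770.49, 17068790.54, 17068810.60])
segment_ph_cnt = np.array([2, 1, 3])
dist_ph_along = np.array([0.75, 1.47, 2.19, 0.30, 0.90, 1.60], dtype=np.float32)

# Expand the segment-rate distance to photon rate: one entry per photon.
photon_segment_start = np.repeat(segment_dist_x, segment_ph_cnt)

# Along-track distance of each photon.
x_atc = photon_segment_start + dist_ph_along
print(x_atc)
```

This only works if the photon-rate arrays have exactly `segment_ph_cnt.sum()` entries, which is a useful sanity check on real data.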
Explanation of what happened#
(1) Import the necessary packages to use h5coro#
h5coro
relies on earthaccess
for authenticating to Earthdata Login. The modules a user might want to import are:
s3driver: for reading data out of an S3 bucket
filedriver: for reading data out of a local file
webdriver: for reading data directly over HTTPS (including objects in S3 buckets)
logger: for configuring the logging in h5coro
(2) Authenticate to Earthdata Login#
On my system, I have a .netrc file set up with the following line:
machine urs.earthdata.nasa.gov login <my_user_name> password <my_password>
(3) Create an h5coro object for the granule that you want to read#
h5coro
is object oriented, so all context information associated with the provided granule is stored in the object. Note that the full path to the granule is needed, including the s3 bucket.
(4) Read the data#
h5coro
implements an asynchronous I/O interface, meaning that when the readDatasets
function is called, it makes a read “request” in the background and returns immediately back to the caller. The caller receives something called a “promise” (or “future”) which is a promise that data will be there in the future at some point. You then can do other things while you wait, and when you finally need the data, you have to “block” or wait for it to be available.
In this example, I set the “block” parameter to True so that it would wait right away. But in more sophisticated examples, other work could have been done by the notebook while waiting for the results of the read.
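If promises and futures are new to you, the same pattern can be tried out with Python's standard library. The sketch below is an analogue using `concurrent.futures`, not h5coro's own API: the function `slow_read` is a made-up stand-in for a high-latency read, and the sleep simulates waiting on S3.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_read(name):
    """Made-up stand-in for a high-latency read of one dataset."""
    time.sleep(0.1)
    return f"data for {name}"

with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns a future immediately; the read runs in the background.
    future = pool.submit(slow_read, "/gt3l/heights/h_ph")

    # Other work can happen here while the read is in flight.
    other_work = sum(range(1000))

    # result() blocks until the data is available.
    result = future.result()

print(other_work, result)
```

Calling `result()` right after `submit()` would be the equivalent of passing `block=True`: the caller simply waits for the data straight away.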