
Introduction
Satellite imagery, especially from platforms like Sentinel and Landsat, provides valuable data for monitoring Earth’s surface. In this tutorial, we’ll walk through the process of creating a time series of cloud-free imagery from the HLS (Harmonized Landsat and Sentinel-2) dataset, and then export that time series to NetCDF format for further analysis.
Setting up Configuration in Sentinel Hub
- Log in to Sentinel Hub: Go to Sentinel Hub and log in with your credentials. If you don’t have credentials yet, sign up first.
- Configuration Utility: Navigate to the Configuration Utility to set up the layers. The following screen will appear. Click New Layer to create one.

- Once you click New Layer, the following screen will open. Set the ID to HLS and select Harmonized Landsat Sentinel as the Source. Getting both of these right is important, because the Python code below is configured to request the layer by the ID HLS.
After that, click the edit button under Data Processing, which is where a custom script can be set.
- Once you click the edit button, the screen below will open. Paste the custom script and click Set Custom Script. The script is used to filter and process the data on the fly on the Sentinel Hub side.

- Get Instance ID: After configuring the layers, obtain the instance ID from the configuration utility. This ID will be used in the code to access the data.

Setting up the Code
Now, let’s understand and set up the Python code provided below.
- Install Required Packages: Note that base64 and json are part of the Python standard library, so only the third-party packages need installing:
conda install -c conda-forge requests pyproj rasterio xarray rioxarray numpy matplotlib netcdf4 shapely
- Understanding the Code: The code uses the Sentinel Hub Web Feature Service (WFS) to query and download HLS data. It converts the geographic coordinates of the area of interest from WGS84 to UTM, retrieves cloud-free imagery, and exports the data to NetCDF format.
- Replace Configuration Values: Replace instance_id with the instance ID obtained from Sentinel Hub, and adjust the bbox_wgs coordinates to define the bounding box of your area of interest.
- Run the Code: Execute the Python script, and it will download the HLS time series, filter out cloudy images, and export the results to NetCDF format.
Complete Code:
```python
import requests
from pyproj import Transformer
import base64
import xarray as xr
import rioxarray
from rasterio.io import MemoryFile
from shapely.geometry import Polygon, mapping
import json

# Define the geometry (WGS84 polygon of the area of interest)
geometry_json = '''{
    "type": "Polygon",
    "coordinates": [[
        [150.1126865, -29.2255068], [150.1039762, -29.2243777], [150.103701199999989, -29.2245019],
        [150.1035071, -29.2245142], [150.0918797, -29.2230173], [150.0917524, -29.222976],
        [150.091697399999987, -29.2229369], [150.0916207, -29.2228069], [150.0916454, -29.2225624],
        [150.0934881, -29.2114467], [150.0936721, -29.2104342], [150.0937548, -29.2102981],
        [150.0939214, -29.2102237], [150.094009699999987, -29.2102251], [150.0941322, -29.2102609],
        [150.0947937, -29.2108058], [150.0948923, -29.2109237], [150.094995899999986, -29.2110963],
        [150.0950344, -29.2111974], [150.095121299999988, -29.2117], [150.0951761, -29.2118721],
        [150.0953447, -29.2122385], [150.095551300000011, -29.2125781], [150.0959033, -29.2130318],
        [150.0962366, -29.2134046], [150.096890800000011, -29.2139772], [150.0971956, -29.2141825],
        [150.0974879, -29.214272], [150.0977446, -29.2142941], [150.0979903, -29.214285],
        [150.098398800000012, -29.2141977], [150.0990562, -29.2139084], [150.099899, -29.2138841],
        [150.100123700000012, -29.2138351], [150.100327600000014, -29.2137415], [150.1012307, -29.2135528],
        [150.1015282, -29.2136282], [150.101817399999987, -29.2137781], [150.1025439, -29.214377],
        [150.102902199999988, -29.214617], [150.1031384, -29.2147156], [150.103347, -29.2147458],
        [150.103499199999987, -29.2147401], [150.1037307, -29.21469], [150.1044364, -29.2144067],
        [150.105561, -29.2141905], [150.105702900000011, -29.2142384], [150.1058581, -29.2143345],
        [150.1065556, -29.2149181], [150.106759899999986, -29.2149985], [150.1069929, -29.2150357],
        [150.1071355, -29.2150346], [150.107311399999986, -29.214998], [150.1081442, -29.2146057],
        [150.1083688, -29.2145261], [150.108617, -29.214474], [150.10909620000001, -29.2144519],
        [150.1107245, -29.2144974], [150.1113245, -29.2144695], [150.111787099999987, -29.2143629],
        [150.1127861, -29.2139822], [150.1131816, -29.2139067], [150.1134385, -29.2139007],
        [150.1136579, -29.2139268], [150.1138369, -29.2139933], [150.1139475, -29.2140779],
        [150.114047699999986, -29.2142552], [150.1140412, -29.2145935], [150.1138338, -29.2162949],
        [150.113470299999989, -29.2186687], [150.1135133, -29.2187373], [150.1137865, -29.2188267],
        [150.1139101, -29.2189072], [150.113972, -29.2189902], [150.1140024, -29.2191226],
        [150.113, -29.2252668], [150.112932, -29.2254101], [150.1128458, -29.2254765],
        [150.1126865, -29.2255068]
    ]]
}'''
geometry = json.loads(geometry_json)

# Define bounding box in WGS84 (min lon, min lat, max lon, max lat)
bbox_wgs = [150.0916207, -29.2255068, 150.1140477, -29.2102237]

# Transformation from WGS84 to UTM zone 56S (EPSG:32756).
# EPSG:4326 uses (lat, lon) axis order, hence the swapped arguments.
transformer = Transformer.from_crs(4326, 32756)
bbox_utm = list(transformer.transform(bbox_wgs[1], bbox_wgs[0])) + \
    list(transformer.transform(bbox_wgs[3], bbox_wgs[2]))

# Round all values in bbox_utm to the nearest 30 m (the HLS pixel size)
bbox_utm = [round(x / 30) * 30 for x in bbox_utm]

# Expand bbox by 30 m in each direction
bbox_utm = [bbox_utm[0] - 30, bbox_utm[1] - 30, bbox_utm[2] + 30, bbox_utm[3] + 30]

# Convert geometry from WGS84 to UTM
coords_wgs84 = geometry['coordinates'][0]
coords_utm = [list(transformer.transform(coord[1], coord[0])) for coord in coords_wgs84]

# Create a shapely polygon with a negative buffer of 30 m
polygon = Polygon(coords_utm).buffer(-30)
geom_utm = mapping(polygon)

# Sentinel Hub instance ID (replace with the ID from your Configuration Utility)
instance_id = 'your_instance_id'

# Define WFS parameters
feature_offset = 0
bbox = str(bbox_utm)[1:-1]
date_start = '2022-06-01'
date_finish = '2022-07-01'
wfs_url = f'https://services.sentinel-hub.com/ogc/wfs/{instance_id}?REQUEST=GetFeature&FEATURE_OFFSET={feature_offset}&srsName=EPSG:32756&TYPENAMES=DSS21&BBOX={bbox}&TIME={date_start}/{date_finish}&OUTPUTFORMAT=application/json'

# Fetch WFS feature metadata, 100 features per page
more_data = True
dates = []
while more_data:
    response = requests.get(wfs_url).json()
    features = response['features']
    for feature in features:
        dates.append({
            'date': feature['properties']['date'],
            'crs': feature['properties']['crs'],
            'cloud': feature['properties']['cloudCoverPercentage'],
            'id': feature['properties']['id'],
        })
    if len(features) < 100:
        more_data = False
    else:
        feature_offset += 100
        wfs_url = f'https://services.sentinel-hub.com/ogc/wfs/{instance_id}?REQUEST=GetFeature&FEATURE_OFFSET={feature_offset}&srsName=EPSG:32756&TYPENAMES=DSS21&BBOX={bbox}&TIME={date_start}/{date_finish}&OUTPUTFORMAT=application/json'

print(dates[0])

# Define bands and metadata
bands = [
    {'name': 'Blue', 'factor': 0.001, 'constellation': 'both', 'id': 0},
    {'name': 'Green', 'factor': 0.001, 'constellation': 'both', 'id': 1},
    {'name': 'Red', 'factor': 0.001, 'constellation': 'both', 'id': 2},
    {'name': 'RedEdge1', 'factor': 0.0025, 'constellation': 'sentinel', 'id': 3},
    {'name': 'RedEdge2', 'factor': 0.0025, 'constellation': 'sentinel', 'id': 4},
    {'name': 'RedEdge3', 'factor': 0.0025, 'constellation': 'sentinel', 'id': 5},
    {'name': 'NIR_Narrow', 'factor': 0.0025, 'constellation': 'both', 'id': 6},
    {'name': 'SWIR1', 'factor': 0.0025, 'constellation': 'both', 'id': 7},
    {'name': 'SWIR2', 'factor': 0.0025, 'constellation': 'both', 'id': 8},
    {'name': 'ThermalInfrared1', 'factor': 0.1, 'constellation': 'landsat', 'id': 9},
    {'name': 'ThermalInfrared2', 'factor': 0.1, 'constellation': 'landsat', 'id': 10},
]
metadata = {
    'width': None, 'height': None, 'bounds': None,
    'crs': 'EPSG:32756', 'transform': None, 'count': None,
}

xds = xr.Dataset({}, coords={'x': [], 'y': [], 'time': []})

# Loop through dates and bands to fetch and process data
for date_info in dates[:10]:
    # Check the QA band first to determine cloud cover before downloading other bands
    evalscript = '''//VERSION=3
function setup() {
    return {
        input: [{
            bands: ["QA", "dataMask"],
        }],
        output: {
            bands: 1,
            sampleType: "UINT8"
        }
    }
}

// Right-shift num by the given number of bits
function rs(num, bits) {
    var divisor = Math.pow(2, bits);
    var result = Math.floor(num / divisor);
    return result;
}

function evaluatePixel(sample) {
    var cloud = (rs(sample.QA, 1) & 1)
    var adj_cld_shadow = (rs(sample.QA, 2) & 1)
    var cloud_shadow = (rs(sample.QA, 3) & 1)
    var snow_ice = (rs(sample.QA, 4) & 1)
    var water = (rs(sample.QA, 5) & 1)
    if (sample.dataMask == 0) { return [1] }
    if (cloud == 1) { return [1] }
    else if (cloud_shadow == 1) { return [1] }
    else if (adj_cld_shadow == 1) { return [1] }
    else if (snow_ice == 1) { return [2] }
    else if (water == 1) { return [2] }
    else { return [2] }
}'''
    evalscript = base64.b64encode(evalscript.encode('ascii')).decode('ascii')
    wcs_url = (f'https://services-uswest2.sentinel-hub.com/ogc/wcs/{instance_id}'
               f'?SERVICE=WCS&REQUEST=GetCoverage&VERSION=1.1.2&COVERAGE=HLS'
               f'&CRS=EPSG:32756&STYLES=&RESX=30.0&RESY=30.0&FORMAT=image/tiff'
               f'&TIME={date_info["date"]}/{date_info["date"]}&BBOX={bbox}'
               f'&EVALSCRIPT={evalscript}&SHOWLOGO=false')

    # Download data; if the status is 400, print the error message
    response = requests.get(wcs_url)
    if response.status_code == 400:
        print(response.text)

    with MemoryFile(response.content) as memfile:
        with memfile.open() as dataset:
            if metadata['width'] is None:
                metadata['width'] = dataset.width
                metadata['height'] = dataset.height
                metadata['bounds'] = dataset.bounds
                metadata['transform'] = dataset.transform
            data = dataset.read()
            temp_array = rioxarray.open_rasterio(dataset, engine='rasterio', masked=True)
            clipped = temp_array.rio.clip([geom_utm], all_touched=False)
            array = xr.DataArray(clipped[0], dims=['y', 'x'], name='QA')
            array.rio.write_crs('EPSG:32756', inplace=True)
            array.attrs['id'] = date_info['id']
            array = array.expand_dims('time')
            array['time'] = [date_info['date']]
            xds = xr.merge([xds, array])
            good_pixels = int(clipped.where(clipped == 2).count(dim=['x', 'y'])[0])
            bad_pixels = int(clipped.where(clipped == 1).count(dim=['x', 'y'])[0])
            good_percent = round((good_pixels / (good_pixels + bad_pixels)) * 100, 2)

    # Skip this date if fewer than 97% of the clipped pixels are cloud-free
    if good_percent < 97:
        print('Skipping')
        continue

    for band_info in bands:
        evalscript_band = f'''//VERSION=3
let f = {band_info['factor']};
function setup() {{
    return {{
        input: [{{
            bands: ["{band_info['name']}"],
        }}],
        output: {{
            bands: 1,
            sampleType: "UINT8"
        }}
    }};
}}
function evaluatePixel(sample) {{
    return [sample.{band_info['name']} / f];
}}'''
        # Check constellation (HLS L30 products are Landsat, S30 are Sentinel-2)
        constellation = 'landsat' if date_info['id'].split('.')[1] == 'L30' else 'sentinel'
        if band_info['constellation'] == 'both':
            evalscript_band = base64.b64encode(evalscript_band.encode('ascii')).decode('ascii')
            wcs_url_band = (f'https://services-uswest2.sentinel-hub.com/ogc/wcs/{instance_id}'
                            f'?SERVICE=WCS&REQUEST=GetCoverage&VERSION=1.1.2&COVERAGE=HLS'
                            f'&CRS=EPSG:32756&STYLES=&RESX=30.0&RESY=30.0&FORMAT=image/tiff'
                            f'&TIME={date_info["date"]}/{date_info["date"]}&BBOX={bbox}'
                            f'&EVALSCRIPT={evalscript_band}&SHOWLOGO=false')
            response_band = requests.get(wcs_url_band)
            if response_band.status_code == 400:
                print(response_band.text)
            with MemoryFile(response_band.content) as memfile_band:
                with memfile_band.open() as dataset_band:
                    data_band = dataset_band.read()
                    temp_array_band = rioxarray.open_rasterio(dataset_band, engine='rasterio', masked=True)
                    clipped_band = temp_array_band.rio.clip([geom_utm], all_touched=False)
                    array_band = xr.DataArray(clipped_band[0], dims=['y', 'x'], name=band_info['name'])
                    array_band.rio.write_crs('EPSG:32756', inplace=True)
                    array_band.attrs['id'] = date_info['id']
                    array_band = array_band.expand_dims('time')
                    array_band['time'] = [date_info['date']]
                    xds = xr.merge([xds, array_band])

# Export with the projection preserved
def export_to_netcdf(dataset, filename, encoding=None):
    if encoding is not None:
        dataset = dataset.copy()
        for var_name, var in dataset.data_vars.items():
            if var_name in encoding:
                var.encoding = encoding[var_name]
    # Write the CRS, then write to a NetCDF file
    dataset.rio.write_crs("EPSG:32756", inplace=True)
    dataset.to_netcdf(filename)

# Define encoding for NetCDF (uint8 with zlib compression)
encoding_netcdf = {
    'QA': {'dtype': 'uint8', 'zlib': True, 'complevel': 9, '_FillValue': 0},
    'Blue': {'dtype': 'uint8', 'zlib': True, 'complevel': 9, '_FillValue': 0},
    'Green': {'dtype': 'uint8', 'zlib': True, 'complevel': 9, '_FillValue': 0},
    'Red': {'dtype': 'uint8', 'zlib': True, 'complevel': 9, '_FillValue': 0},
    'NIR_Narrow': {'dtype': 'uint8', 'zlib': True, 'complevel': 9, '_FillValue': 0},
    'SWIR1': {'dtype': 'uint8', 'zlib': True, 'complevel': 9, '_FillValue': 0},
    'SWIR2': {'dtype': 'uint8', 'zlib': True, 'complevel': 9, '_FillValue': 0},
}

# Export to NetCDF
export_to_netcdf(xds, 'processed_data.nc', encoding=encoding_netcdf)
```
Let’s break down the provided code step by step:
1. Geometry and Bounding Box Setup
This section defines the polygon geometry and bounding box, and converts the coordinates from WGS84 to UTM.
```python
# Define the polygon geometry and bounding box in WGS84 coordinates
geometry = {"type": "Polygon", "coordinates": [[[ ... ]]]}
bbox_wgs = [150.0916207, -29.2255068, 150.1140477, -29.2102237]

# Convert the WGS84 bounding box to UTM, snap it to the 30 m grid,
# and pad it by one pixel on each side
transformer = Transformer.from_crs(4326, 32756)
bbox_utm = list(transformer.transform(bbox_wgs[1], bbox_wgs[0])) + \
    list(transformer.transform(bbox_wgs[3], bbox_wgs[2]))
bbox_utm = [round(x / 30) * 30 for x in bbox_utm]
bbox_utm = [bbox_utm[0] - 30, bbox_utm[1] - 30, bbox_utm[2] + 30, bbox_utm[3] + 30]

# Convert the geometry from WGS84 to UTM and create a Shapely polygon,
# shrunk by 30 m to stay inside the field boundary
coords_wgs84 = geometry['coordinates'][0]
coords_utm = [list(transformer.transform(coord[1], coord[0])) for coord in coords_wgs84]
polygon = Polygon(coords_utm).buffer(-30)
geom_utm = mapping(polygon)
```
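The 30 m snapping above is worth spelling out: HLS pixels sit on a 30 m grid, so rounding the UTM bounding box to multiples of 30 and then padding by one pixel keeps every request pixel-aligned. A minimal sketch of that logic, using made-up UTM coordinates rather than the tutorial's real bounding box:

```python
def snap_bbox_to_grid(bbox_utm, pixel=30):
    """Round each bbox coordinate to the nearest multiple of `pixel`,
    then pad the box by one pixel on every side."""
    snapped = [round(v / pixel) * pixel for v in bbox_utm]
    return [snapped[0] - pixel, snapped[1] - pixel,
            snapped[2] + pixel, snapped[3] + pixel]

# Hypothetical UTM values (easting/northing in metres)
print(snap_bbox_to_grid([216613.0, 6765247.0, 218851.0, 6766958.0]))
# → [216570, 6765210, 218880, 6766980]
```

Every edge of the returned box is a multiple of 30 and encloses the original extent, which is exactly what the WCS requests later rely on.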
2. Setting Up Sentinel Hub Configuration
This section initializes the Sentinel Hub configuration, including the instance ID, feature offset, time range, and WFS (Web Feature Service) URL.
```python
# Set the Sentinel Hub instance ID
instance_id = 'your_instance_id'

# Set the initial feature offset, time range, bbox string, and Sentinel Hub WFS URL
feature_offset = 0
bbox = str(bbox_utm)[1:-1]
date_start = '2022-06-01'
date_finish = '2022-07-01'
wfs_url = f'https://services.sentinel-hub.com/ogc/wfs/{instance_id}?REQUEST=GetFeature&FEATURE_OFFSET={feature_offset}&srsName=EPSG:32756&TYPENAMES=DSS21&BBOX={bbox}&TIME={date_start}/{date_finish}&OUTPUTFORMAT=application/json'
```
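Because the WFS URL has to be rebuilt for each page of results, it can help to factor the construction into a small helper. A sketch of that idea — `build_wfs_url` is a hypothetical helper of our own, not part of any Sentinel Hub SDK, and it simply mirrors the query parameters of the request string above:

```python
def build_wfs_url(instance_id, bbox, date_start, date_finish,
                  feature_offset=0, epsg=32756):
    """Assemble a Sentinel Hub WFS GetFeature URL for one page of results."""
    return (
        f'https://services.sentinel-hub.com/ogc/wfs/{instance_id}'
        f'?REQUEST=GetFeature&FEATURE_OFFSET={feature_offset}'
        f'&srsName=EPSG:{epsg}&TYPENAMES=DSS21&BBOX={bbox}'
        f'&TIME={date_start}/{date_finish}'
        f'&OUTPUTFORMAT=application/json'
    )

url = build_wfs_url('your_instance_id', '216570, 6765210, 218880, 6766980',
                    '2022-06-01', '2022-07-01')
```

The pagination loop can then call `build_wfs_url(..., feature_offset=offset)` instead of re-typing the whole f-string, which keeps the srsName consistent across pages.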
3. Retrieve HLS Data and Process
This section retrieves HLS (Harmonized Landsat and Sentinel-2) feature metadata by making requests to the Sentinel Hub WFS, collecting the date, CRS (Coordinate Reference System), cloud cover percentage, and feature ID of each acquisition.
```python
# Loop to fetch HLS feature metadata via the Sentinel Hub WFS, 100 features per page
more_data = True
dates = []
while more_data:
    response = requests.get(wfs_url).json()
    features = response['features']
    for feature in features:
        dates.append({
            'date': feature['properties']['date'],
            'crs': feature['properties']['crs'],
            'cloud': feature['properties']['cloudCoverPercentage'],
            'id': feature['properties']['id'],
        })
    if len(features) < 100:
        more_data = False
    else:
        feature_offset += 100
        wfs_url = f'https://services.sentinel-hub.com/ogc/wfs/{instance_id}?REQUEST=GetFeature&FEATURE_OFFSET={feature_offset}&srsName=EPSG:32756&TYPENAMES=DSS21&BBOX={bbox}&TIME={date_start}/{date_finish}&OUTPUTFORMAT=application/json'

# Define bands and metadata
bands = [...]
metadata = {'width': None, 'height': None, 'bounds': None,
            'crs': 'EPSG:32756', 'transform': None, 'count': None}
xds = xr.Dataset({}, coords={'x': [], 'y': [], 'time': []})
```
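Since the WFS response already reports scene-level cloud cover, the `dates` list can be pre-filtered before any pixels are downloaded, saving WCS requests for scenes that are obviously too cloudy. A sketch with fabricated records in the same shape the loop above collects:

```python
def filter_by_cloud(dates, max_cloud=30):
    """Keep only acquisitions whose scene-level cloud cover percentage
    is at or below the threshold."""
    return [d for d in dates if d['cloud'] <= max_cloud]

# Fabricated example records (the IDs here are illustrative, not real products)
sample = [
    {'date': '2022-06-03', 'cloud': 12.5, 'id': 'HLS.S30.EXAMPLE.A'},
    {'date': '2022-06-05', 'cloud': 88.0, 'id': 'HLS.L30.EXAMPLE.B'},
]
print([d['date'] for d in filter_by_cloud(sample)])
# → ['2022-06-03']
```

The scene-level percentage is coarser than the per-pixel QA check performed later, so this works best as a first pass, with the 97% clear-pixel test still applied to the clipped area.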
4. Download and Process Data for Each Date
This section downloads HLS data for each date, including checking the QA (Quality Assurance) band for cloud cover, evaluating cloud mask, and fetching specific bands.
```python
# Loop through selected dates and download HLS data
for date_info in dates[:10]:
    # Check the QA band to determine cloud cover
    evalscript = '...'  # QA band evaluation script
    evalscript = base64.b64encode(evalscript.encode('ascii')).decode('ascii')

    # Construct and send a WCS request to get the cloud mask
    wcs_url = f'https://services-uswest2.sentinel-hub.com/ogc/wcs/{instance_id}?SERVICE=WCS&REQUEST=GetCoverage&VERSION=1.1.2&COVERAGE=HLS&CRS=EPSG:32756&STYLES=&RESX=30.0&RESY=30.0&FORMAT=image/tiff&TIME={date_info["date"]}/{date_info["date"]}&BBOX={bbox}&EVALSCRIPT={evalscript}&SHOWLOGO=false'
    response = requests.get(wcs_url)
    # Process the cloud mask data
    # ...

    # Skip this date if the proportion of cloud-free pixels is unacceptable
    if good_percent < 97:
        print('Skipping')
        continue

    # Loop through the bands and download the imagery
    for band_info in bands:
        evalscript = '...'  # band evaluation script
        evalscript = base64.b64encode(evalscript.encode('ascii')).decode('ascii')

        # Construct and send a WCS request to get the band data
        wcs_url = f'https://services-uswest2.sentinel-hub.com/ogc/wcs/{instance_id}?SERVICE=WCS&REQUEST=GetCoverage&VERSION=1.1.2&COVERAGE=HLS&CRS=EPSG:32756&STYLES=&RESX=30.0&RESY=30.0&FORMAT=image/tiff&TIME={date_info["date"]}/{date_info["date"]}&BBOX={bbox}&EVALSCRIPT={evalscript}&SHOWLOGO=false'
        response = requests.get(wcs_url)
        # Process the band data
        # ...
```
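The `rs` helper in the QA evalscript (see the full listing) is just a right shift, so the same Fmask decoding can be mirrored in Python for offline sanity checks. The bit positions follow the evalscript: bit 1 cloud, bit 2 adjacent cloud/shadow, bit 3 cloud shadow, bit 4 snow/ice, bit 5 water:

```python
def decode_hls_qa(qa):
    """Unpack the Fmask flags the evalscript reads from an HLS QA byte."""
    return {
        'cloud':          (qa >> 1) & 1,
        'adj_cld_shadow': (qa >> 2) & 1,
        'cloud_shadow':   (qa >> 3) & 1,
        'snow_ice':       (qa >> 4) & 1,
        'water':          (qa >> 5) & 1,
    }

def is_clear(qa):
    """Mirror the evalscript's logic: cloud, cloud shadow, or adjacent
    cloud/shadow marks a pixel bad; snow/ice and water still count as usable."""
    flags = decode_hls_qa(qa)
    return not (flags['cloud'] or flags['cloud_shadow'] or flags['adj_cld_shadow'])
```

This is handy for spot-checking downloaded QA values against what the mask returned, since the evalscript itself only ever leaves the server as base64.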
5. NetCDF Export
This section defines a function to export the processed data to NetCDF format, specifying encoding for each variable and exporting the data to NetCDF files.
```python
# Define the NetCDF export function
def export_to_netcdf(dataset, filename, encoding=None):
    # ...

# Define encoding for each variable
encoding_netcdf = {...}

# Export to NetCDF with compression encoding
export_to_netcdf(xds, 'processed_data.nc', encoding=encoding_netcdf)
```
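The uint8 encoding keeps the files small because each band was already quantized server-side: the band evalscripts divide the physical value by the band's scale factor before emitting a UINT8 sample. A sketch of that round-trip, using factor values from the bands table (the rounding-and-clamping behaviour here is our own simplification of what the service does):

```python
def encode_value(value, factor):
    """Quantize a physical value to a uint8 level, value / factor,
    clamped to the 0-255 range (0 is reserved as the fill value)."""
    level = round(value / factor)
    return max(0, min(255, level))

def decode_value(level, factor):
    """Recover an approximate physical value from a stored uint8 level."""
    return level * factor

# A reflectance of 0.123 with the Blue band's factor of 0.001:
print(encode_value(0.123, 0.001))   # → 123
# A 30.0-unit thermal value with factor 0.1 saturates the uint8 range:
print(encode_value(30.0, 0.1))      # → 255
```

This also explains the `_FillValue: 0` choice in the encoding dictionary: level 0 marks masked pixels, at the cost of losing the ability to represent a true zero.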
This code fetches HLS data from Sentinel Hub, applies cloud masking, and downloads specific bands for a defined region and time period. The resulting data is then stored in NetCDF format for further analysis. Make sure to replace 'your_instance_id' with your actual Sentinel Hub instance ID before running the code.
Conclusion
Congratulations! You’ve successfully created a time series of cloud-free HLS imagery and exported it to NetCDF. This workflow is valuable for various applications, including environmental monitoring and land cover analysis.
Feel free to customize the code further based on your specific needs and explore additional functionalities offered by Sentinel Hub.
I hope this tutorial gives you a good foundation. If you’d like a tutorial on another GIS topic, or if you have any queries, please send an email to contact@spatial-dev.guru.
