%matplotlib widget
# Set up matplotlib interactive "magic"
Solution: Raster data needs reprojection
📂 Open the source data (in EPSG:3413
)
import rioxarray
import matplotlib.pyplot as plt
import numpy as np
from rasterio.enums import Resampling
from cartopy import crs as ccrs
= rioxarray.open_rasterio('../dem_with_altered_metadata.tif')
epsg_3413 epsg_3413
<xarray.DataArray (band: 1, y: 561, x: 301)> [168861 values with dtype=float32] Coordinates: * band (band) int64 1 * x (x) float64 -8e+05 -7.95e+05 -7.9e+05 ... 6.95e+05 7e+05 * y (y) float64 -6e+05 -6.05e+05 -6.1e+05 ... -3.395e+06 -3.4e+06 spatial_ref int64 0 Attributes: AREA_OR_POINT: Area TIFFTAG_XRESOLUTION: 1 TIFFTAG_YRESOLUTION: 1 scale_factor: 1.0 add_offset: 0.0 long_name: band_data
Plot the source data
plt.figure()= plt.imshow(epsg_3413[0])
plot
'EPSG:3413')
plot.axes.set_title(
plt.colorbar() plt.show()
🌐 Reproject the dataset
= epsg_3413.rio.reproject(
epsg_4326 "EPSG:4326",
=Resampling.bilinear,
resampling )
Plot it again!
=(10, 2))
plt.figure(figsize
= plt.imshow(epsg_4326[0])
plot 'EPSG:4326')
plot.axes.set_title(
plt.colorbar() plt.show()
/home/mfisher/.miniconda3/envs/geotools/lib/python3.11/site-packages/matplotlib/image.py:505: RuntimeWarning: overflow encountered in multiply
A_resampled *= ((a_max - a_min) / frac)
⁉️ What’s going on?
This result is clearly wrong. What is this shape? Why is the data “gone”, and what does the yellow color represent?
Let’s take a look at some of these values to get a better idea of what we are seeing.
print(f"===Pre-reprojection===")
print(f"Shape: {epsg_3413[0].shape}")
print(f"Top-left corner value: {epsg_3413[0][0][0].data}")
print(f"Metadata NoData value: {epsg_3413[0].rio.nodata}")
print()
print(f"===Post-reprojection===")
print(f"Shape: {epsg_4326[0].shape}")
print(f"Top-left corner value: {epsg_4326[0][0][0].data}")
print(f"Top-center value: {epsg_4326[0][0][int(epsg_4326[0].shape[1] / 2)].data}")
print(f"Metadata NoData value: {epsg_4326[0].rio.nodata}")
===Pre-reprojection===
Shape: (561, 301)
Top-left corner value: 9999.0
Metadata NoData value: None
===Post-reprojection===
Shape: (240, 955)
Top-left corner value: 3.4028234663852886e+38
Top-center value: 9999.0
Metadata NoData value: 3.4028234663852886e+38
/home/mfisher/.miniconda3/envs/geotools/lib/python3.11/site-packages/rioxarray/raster_writer.py:130: UserWarning: The nodata value (3.402823466e+38) has been automatically changed to (3.4028234663852886e+38) to match the dtype of the data.
warnings.warn(
We are seeing the shape of the extent of the original data in purple. The original rectangular shape had to be warped to align to the geographic projection grid, creating this “funnel” shape. Outside that area, the yellow color is a very large number being used as a NoData value.
It turns out that your colleagues’ modification to this dataset was to set the NoData pixels to 9999
, but they forgot to correctly specify that 9999
is this dataset’s NoData flag value in the metadata.
Above, we can see that 9999
is being used as the NoData value for the source dataset by inspecting the top-left pixel. We can also see that the NoData value is not set in the metadata. After reprojecting we have a new NoData value (the maximum possible Float32
value) being added and being set in the metadata.
= 9999 expected_nodata
🥸 Masking the NoData values
We expect the NoData value to be 9999
, so let’s mask it out.
= epsg_4326[0]
epsg_4326_da = epsg_4326_da.where(epsg_4326_da != expected_nodata) epsg_4326_da_masked_9999
Plot!
=(10, 2))
plt.figure(figsize= plt.imshow(epsg_4326_da_masked_9999, interpolation='nearest')
plot 'EPSG:4326')
plot.axes.set_title(
plot.axes.grid()
plt.colorbar() plt.show()
⁉️ What’s going on?
We’ve successfully masked out the original NoData values, but that only applied to the NoData values used in the original dataset. This plot is showing two NoData values: 9999
from the original dataset, and the maximum Float32
value for pixels outside of the original dataset.
We can mask those out, too:
= epsg_4326_da_masked_9999.where(epsg_4326_da_masked_9999 <= expected_nodata) epsg_4326_da_masked_all_nodata
=(10, 2))
plt.figure(figsize= plt.imshow(epsg_4326_da_masked_all_nodata)
plot 'EPSG:4326')
plot.axes.set_title(
plot.axes.grid()
plt.colorbar() plt.show()
At a glance, things look OK, but on closer inspection this has uncovered another issue! Zoom in on the coastline of Greenland. The edge pixels should not have higher values than interior pixels; the ice sheet is thinner at the edges. These edge pixels were interpolated during the original reprojection because the NoData value was not appropriately set – normally, NoData pixels are not used during interpolation. Looks like we need to go back even further to fix this, rather than masking the problem.
🎉 Reproject the dataset; 2nd try
First, correctly set the NoData value on the source and mask out the nodata values. Then reproject.
# Replace the nodata value
= epsg_3413.where(epsg_3413 != expected_nodata)
epsg_3413_fixed = epsg_3413_fixed.rio.write_nodata(expected_nodata, encoded=True)
epsg_3413_fixed print(f"NoData encoded as: {epsg_3413_fixed.rio.encoded_nodata}")
= epsg_3413_fixed.rio.reproject(
epsg_4326_fixed "EPSG:4326",
=Resampling.bilinear,
resampling )
NoData encoded as: 9999.0
=(10, 2))
plt.figure(figsize= plt.imshow(epsg_4326_fixed[0])
plot 'EPSG:4326')
plot.axes.set_title(
plot.axes.grid()
plt.colorbar() plt.show()