Geospatial data transformations

QGreenland Researcher Workshop 2023

Geospatial data transformations

  • Operations that change the data in some way.
  • Sometimes these changes can be dramatic. They can result in the loss of information in the source data.
  • Provenance is important! Always back up source data.

Motivation

Datasets need to be compatible with each other before analysis.

  • All datasets should be in the same CRS (datum and projection)
  • Rasters should be “co-registered”: matching grids (resolution and orientation)
  • Optional: Don’t keep data you don’t need
  • Datasets may need conversion to a different data model (raster <-> vector)

What tool should I use?

The best one for the job. Explore the alternatives available in the ecosystem you want to work in!

  • GUI-based GIS tools (QGIS!) are useful for visualization of data, especially in comparison with other layers like a basemap.
  • Command-line tools are especially useful for getting a quick answer.
  • Language-specific (e.g., Python) tools are good for automations or research code.

Resampling

Generation of a new sample from existing data.

Raster Resampling

gdalwarp

Raster grids will not always align, even in the same projection & datum.
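A minimal sketch of what "aligned" means for two north-up grids in the same CRS: equal pixel sizes, and origins that differ by a whole number of pixels. The `Grid` class, `grids_align` function and all coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Grid:
    """A north-up raster grid: upper-left corner plus pixel size."""
    x_origin: float  # x coordinate of the upper-left corner
    y_origin: float  # y coordinate of the upper-left corner
    x_res: float     # pixel width in CRS units
    y_res: float     # pixel height in CRS units (positive)


def grids_align(a: Grid, b: Grid, tol: float = 1e-9) -> bool:
    """Return True if the pixel edges of a and b fall on the same lattice."""
    if not (math.isclose(a.x_res, b.x_res, abs_tol=tol)
            and math.isclose(a.y_res, b.y_res, abs_tol=tol)):
        return False
    # Origins must differ by a whole number of pixels in each direction.
    dx = (a.x_origin - b.x_origin) / a.x_res
    dy = (a.y_origin - b.y_origin) / a.y_res
    return abs(dx - round(dx)) < tol and abs(dy - round(dy)) < tol


# Hypothetical grids with 100-unit pixels.
base = Grid(x_origin=-1000.0, y_origin=2000.0, x_res=100.0, y_res=100.0)
shifted_whole = Grid(x_origin=-700.0, y_origin=1500.0, x_res=100.0, y_res=100.0)
shifted_half = Grid(x_origin=-950.0, y_origin=2000.0, x_res=100.0, y_res=100.0)

print(grids_align(base, shifted_whole))  # offset by whole pixels
print(grids_align(base, shifted_half))   # offset by half a pixel
```

When grids do not align, gdalwarp's `-tr` (target resolution) and `-te` (target extent) options are one way to resample a raster onto a chosen grid.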

Resampling: interpolation
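A small sketch comparing two common interpolation choices when a new sample falls between source cells: nearest neighbour and bilinear. This is plain Python for illustration, not GDAL's implementation; the function names and the 2x2 source grid are hypothetical, and positions are given as fractional row/column indices of cell centers inside a grid of at least 2x2 cells.

```python
from typing import List

Raster = List[List[float]]


def sample_nearest(grid: Raster, row: float, col: float) -> float:
    """Take the value of the closest cell center."""
    r = min(max(int(round(row)), 0), len(grid) - 1)
    c = min(max(int(round(col)), 0), len(grid[0]) - 1)
    return grid[r][c]


def sample_bilinear(grid: Raster, row: float, col: float) -> float:
    """Distance-weighted average of the four surrounding cell centers."""
    r0 = min(max(int(row), 0), len(grid) - 2)
    c0 = min(max(int(col), 0), len(grid[0]) - 2)
    fr, fc = row - r0, col - c0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    bottom = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr


# Hypothetical 2x2 source raster.
source = [
    [0.0, 10.0],
    [20.0, 30.0],
]
print(sample_nearest(source, 0.25, 0.75))   # 10.0, an existing source value
print(sample_bilinear(source, 0.25, 0.75))  # 12.5, a blended value
```

Nearest neighbour only ever returns values that exist in the source, which matters for categorical data. Bilinear creates new in-between values, which usually suits continuous data better.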

Resampling: vectors

ogr2ogr

Change the frequency/density of vertices in lines and polygons.

  • segmentize
  • simplify
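The two operations above can be sketched in plain Python. `segmentize` adds vertices so that no segment exceeds a maximum length. `simplify` removes vertices using the Douglas-Peucker algorithm, which is swapped in here as a common textbook method; the tools named above may use a different variant. All function names and coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def segmentize(line: List[Point], max_length: float) -> List[Point]:
    """Insert evenly spaced vertices so no segment is longer than max_length."""
    out = [line[0]]
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        pieces = max(1, math.ceil(math.hypot(x1 - x0, y1 - y0) / max_length))
        for i in range(1, pieces + 1):
            t = i / pieces
            out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def _distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(line: List[Point], tolerance: float) -> List[Point]:
    """Douglas-Peucker: keep only vertices that deviate more than tolerance."""
    if len(line) < 3:
        return list(line)
    distances = [_distance_to_segment(p, line[0], line[-1]) for p in line[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__) + 1
    if distances[worst - 1] <= tolerance:
        return [line[0], line[-1]]
    # Split at the farthest vertex and simplify each half.
    left = simplify(line[: worst + 1], tolerance)
    right = simplify(line[worst:], tolerance)
    return left[:-1] + right


coarse = [(0.0, 0.0), (10.0, 0.0)]
dense = segmentize(coarse, max_length=2.5)
print(dense)  # five vertices, 2.5 units apart

wiggly = [(0.0, 0.0), (1.0, 0.1), (2.0, -0.1), (3.0, 5.0), (4.0, 0.0)]
simplified = simplify(wiggly, tolerance=0.5)
print(simplified)  # the small wiggle at (1.0, 0.1) is dropped
```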

Reprojection

Reprojection (via PyGIS)

Raster Reprojection

gdalwarp

When raster data is reprojected, data points are usually not mapped 1:1.
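A rough illustration of why cells do not map 1:1: rows that are evenly spaced in latitude become unevenly spaced after projection, so a regular output grid has to be filled by resampling. The sketch uses the spherical Mercator formula as a simple stand-in projection; the radius and latitudes are assumptions for illustration only.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # spherical approximation, for illustration only


def mercator_y(lat_deg: float) -> float:
    """Northing of a latitude in spherical Mercator."""
    lat = math.radians(lat_deg)
    return EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + lat / 2))


# Row centers of a geographic raster with a constant 1-degree spacing.
row_latitudes = [60.0, 61.0, 62.0, 63.0, 64.0]
row_northings = [mercator_y(lat) for lat in row_latitudes]

# Spacing between consecutive rows after projection, in metres.
spacings = [b - a for a, b in zip(row_northings, row_northings[1:])]
for lat, gap in zip(row_latitudes, spacings):
    print(f"{lat:.0f} -> {lat + 1:.0f} deg: {gap / 1000:.1f} km")
```

The printed gaps grow with latitude, while the pixels of the output raster all have the same height. Each output pixel therefore draws on a varying number of source pixels.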

A “warped” reprojection (via PyGIS)

Vector reprojection

ogr2ogr

Vector reprojection only affects vertices / points.
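A sketch of the consequence: the path between two vertices is not reprojected, so a long segment can end up far from where the original line ran. The example uses a spherical north polar stereographic formula as a simplified stand-in (not an exact EPSG definition), and the line coordinates are hypothetical.

```python
import math
from typing import Tuple

EARTH_RADIUS_M = 6_371_000.0  # spherical approximation, for illustration only


def polar_stereo(lon_deg: float, lat_deg: float, lon0_deg: float = -45.0) -> Tuple[float, float]:
    """Spherical north polar stereographic projection (true scale at the pole)."""
    rho = 2 * EARTH_RADIUS_M * math.tan(math.pi / 4 - math.radians(lat_deg) / 2)
    theta = math.radians(lon_deg - lon0_deg)
    return rho * math.sin(theta), -rho * math.cos(theta)


# A two-vertex line along the 70 N parallel, defined in longitude/latitude.
start, end = (-75.0, 70.0), (-15.0, 70.0)

# Reprojecting only the vertices gives a straight chord between them.
x0, y0 = polar_stereo(*start)
x1, y1 = polar_stereo(*end)
chord_mid = ((x0 + x1) / 2, (y0 + y1) / 2)

# The point halfway along the original line lands somewhere else.
true_mid = polar_stereo(-45.0, 70.0)

offset_km = math.hypot(chord_mid[0] - true_mid[0], chord_mid[1] - true_mid[1]) / 1000
print(f"midpoint offset: {offset_km:.0f} km")
```

Adding vertices (segmentize) before reprojecting keeps the reprojected line closer to the original path.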

QGreenland’s “Greenland-focused boundary” polygon

Reprojected to EPSG:4326

Reprojection pitfalls

Weird things happen at the edges!

  • Vector geometry can become invalid.
  • Raster data can show a “seam” where edges are warped together.

A “seam” along 180 degrees longitude in QGreenland’s Natural Earth basemap
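One simple heuristic for spotting trouble at the 180 degree edge in longitude/latitude vector data is to look for segments whose longitude jumps by more than 180 degrees. This is a sketch under that assumption, not a full validity check; the function name and ring coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in degrees


def antimeridian_crossings(ring: List[Point]) -> List[int]:
    """Indices of segments whose longitude jumps by more than 180 degrees."""
    return [
        i
        for i, ((lon_a, _), (lon_b, _)) in enumerate(zip(ring, ring[1:]))
        if abs(lon_b - lon_a) > 180.0
    ]


# A small closed ring straddling 180 degrees longitude (hypothetical coordinates).
ring = [(170.0, 60.0), (-170.0, 60.0), (-170.0, 70.0), (170.0, 70.0), (170.0, 60.0)]

print(antimeridian_crossings(ring))  # [0, 2]

# Read naively, the ring spans almost the whole globe instead of 20 degrees.
lons = [lon for lon, _ in ring]
print(max(lons) - min(lons))  # 340.0
```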

On-the-fly reprojection

Subsetting (clipping)

gdalwarp, ogr2ogr

Consider doing a subset first. Don’t waste time and computing power doing analysis on areas you don’t care about!
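A sketch of how the two tools named above might be called for a bounding-box subset, using gdalwarp's `-te` option and ogr2ogr's `-clipsrc` option. The helper functions only build the argument lists, so the sketch runs without GDAL installed. File names and the extent are hypothetical.

```python
import shlex
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def raster_subset_command(src: str, dst: str, bbox: BBox) -> List[str]:
    """gdalwarp call that crops a raster to bbox (given in the output CRS)."""
    xmin, ymin, xmax, ymax = bbox
    return ["gdalwarp", "-te", str(xmin), str(ymin), str(xmax), str(ymax), src, dst]


def vector_subset_command(src: str, dst: str, bbox: BBox) -> List[str]:
    """ogr2ogr call that clips features to bbox; the destination comes first."""
    xmin, ymin, xmax, ymax = bbox
    return ["ogr2ogr", "-clipsrc", str(xmin), str(ymin), str(xmax), str(ymax), dst, src]


# Hypothetical file names and extent, for illustration only.
area_of_interest = (-700000.0, -3400000.0, 900000.0, -600000.0)
for command in (
    raster_subset_command("basemap.tif", "basemap_subset.tif", area_of_interest),
    vector_subset_command("coastlines.gpkg", "coastlines_subset.gpkg", area_of_interest),
):
    print(shlex.join(command))
```

To run a command for real, pass the list to `subprocess.run(command, check=True)` on a machine with GDAL installed.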

QGreenland’s Natural Earth 2 basemap without subsetting. QGreenland’s area of interest is the tiny circle in the center!

Conversion

gdal_rasterize, gdal_polygonize.py, gdal_contour

Does your data provide a suitable representation for your analysis?

Johannes Rössel, CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0, via Wikimedia Commons

Conversion: Vector to raster
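A minimal sketch of the idea behind rasterizing a polygon: lay a grid over it and mark each cell whose center falls inside. The cell-center rule is an assumption chosen for simplicity, and the function names and triangle coordinates are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, ring: List[Point]) -> bool:
    """Even-odd ray casting test against a closed ring."""
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        if (y0 > y) != (y1 > y):
            # x coordinate where this edge crosses the horizontal line through y.
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(ring: List[Point], x_min: float, y_max: float,
              cell_size: float, n_rows: int, n_cols: int) -> List[List[int]]:
    """Burn 1 into every cell whose center falls inside the ring, else 0."""
    grid = []
    for row in range(n_rows):
        y = y_max - (row + 0.5) * cell_size  # row 0 is the top row
        grid.append([
            1 if point_in_polygon(x_min + (col + 0.5) * cell_size, y, ring) else 0
            for col in range(n_cols)
        ])
    return grid


# Hypothetical right triangle with its corner at the origin.
triangle = [(0.0, 0.0), (4.5, 0.0), (0.0, 4.5), (0.0, 0.0)]
burned = rasterize(triangle, x_min=0.0, y_max=4.0, cell_size=1.0, n_rows=4, n_cols=4)
for row in burned:
    print(row)
```

The smooth diagonal edge of the triangle becomes a staircase. Detail finer than the cell size is lost in the conversion.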

Spatial interpolation

(a) Ice thickness map from kriging the Lamont-Doherty (black lines) and CReSIS data (grey lines). (b) Bed topography calculated from subtracting the interpolated ice thickness from the GIMP surface elevation DEM. A in Figure 2b highlights the subglacial overdeepening where major water rerouting occurs.
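The figure above uses kriging. As a simpler stand-in that shows the same general idea (estimating a surface from scattered measurements), here is a sketch of inverse distance weighting (IDW). It is not kriging, and the sample values are hypothetical.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, float, float]  # (x, y, value)


def idw(x: float, y: float, samples: List[Sample], power: float = 2.0) -> float:
    """Inverse distance weighted estimate at (x, y)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a sample: use it directly
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical point measurements at the corners of a 10 x 10 square.
samples = [(0.0, 0.0, 100.0), (10.0, 0.0, 200.0), (0.0, 10.0, 300.0), (10.0, 10.0, 400.0)]

print(idw(5.0, 5.0, samples))  # equidistant from all samples: their mean
print(idw(0.0, 0.0, samples))  # coincides with a sample: its value
```

Evaluating `idw` at every cell center of a grid turns the point measurements into a raster.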

Conversion: Raster to vector

QGreenland’s Heat Flux layer (raster) with contours overlaid

General transformation pitfalls

  • Some metadata in the source data may be carried over to the output. Use tools like gdalinfo or ogrinfo to inspect your metadata before publishing.
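A sketch of a pre-publication metadata check built on `gdalinfo -json`. It assumes the report has a top-level "metadata" section keyed by domain, each holding key/value pairs; sections with a different layout are skipped. The example report is a hand-written stand-in so that the sketch runs without GDAL, and its values are hypothetical.

```python
import json
import shutil
import subprocess
from typing import Dict, List


def list_metadata(info: Dict) -> List[str]:
    """Flatten the 'metadata' section of a gdalinfo JSON report into lines."""
    lines = []
    for domain, items in info.get("metadata", {}).items():
        if not isinstance(items, dict):
            continue  # skip domains that are not plain key/value pairs
        for key, value in items.items():
            lines.append(f"[{domain or 'default'}] {key} = {value}")
    return lines


def inspect_raster(path: str) -> List[str]:
    """Run `gdalinfo -json` on path and list dataset-level metadata items."""
    if shutil.which("gdalinfo") is None:
        raise RuntimeError("gdalinfo was not found on PATH")
    result = subprocess.run(
        ["gdalinfo", "-json", path], check=True, capture_output=True, text=True
    )
    return list_metadata(json.loads(result.stdout))


# Hand-written stand-in for a gdalinfo report (hypothetical values).
example_report = {
    "metadata": {"": {"AREA_OR_POINT": "Area", "TIFFTAG_SOFTWARE": "hypothetical-tool 1.0"}}
}
for line in list_metadata(example_report):
    print(line)
```

Reading through a listing like this makes it easier to notice stale or unintended items inherited from the source data.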