OpenGeo

Introduction to PostGIS

Table Of Contents

Previous topic

Section 16: Advanced Performance

Next topic

Section 18: Advanced Exercises

Section 17: Advanced Functions

This section will provide a quick overview of some of the more advanced functions available in PostGIS.

Aggregate Functions

Aggregate functions are those that combine information from several different records as a single resulting record. The quintessential non-spatial example is the SUM function. For spatial data, PostGIS provides one absolutely critical function: ST_Union. In addition, there are a number of less common aggregates that can be useful.

ST_Union(geometry)
Merges all aggregated geometries into a single geometry. ST_Union(geometry) is the aggregate form, but it also supports ST_Union(geometryA, geometryB) to union two specified geometries together and ST_Union(geometry[ ]) to union an array of geometries.
SELECT ST_AsText(ST_Union(st_geomfromtext))
  FROM (SELECT ST_GeomFromText(
    'MULTIPOLYGON(((-77 56,-52 18,-88 -27,-10 -13,-11 38,-77 56)))')
  UNION ALL SELECT ST_GeomFromText(
    'MULTIPOLYGON(((-49 63,-32 24,-39 -7,-66 -19,-72 -9,-74 31,-49 63)))')
  ) as a;
_images/unionagg.png

Note

PostGIS also offers a more memory-friendly version of the union operator called ST_MemUnion. It is only available in the aggregate form and trades memory consumption for CPU time.

ST_Collect(geometry)
Similar to ST_Union(geometry), but the return value is a Multi-geometry or geometry collection, instead of dissolving the boundaries between geometries as ST_Union does. This makes it orders of magnitude faster. ST_Collect also has three forms, ST_Collect(geometry) which aggregates an arbitrary number of geometries, ST_Collect(geometryA, geometryB) that works on two geometries and ST_Collect(geometry[ ]) that collects an array of geometries.
SELECT ST_AsText(ST_Collect(the_geom))
  FROM (SELECT 'LINESTRING(0 0, 0 1)'::geometry the_geom
  UNION ALL
  SELECT 'LINESTRING(1 0, 1 1)'::geometry the_geom
  UNION ALL
  SELECT 'LINESTRING(0 0,1 0)'::geometry the_geom
  UNION ALL
  SELECT 'LINESTRING(1 1, 0 1)'::geometry the_geom) as a;
_images/collect.png
ST_Polygonize(geometry)
Generates a geometry collection containing all possible polygons from the input lines.
SELECT ST_AsText(ST_Polygonize(the_geom))
  FROM (SELECT 'LINESTRING(0 0, 0 1)'::geometry the_geom
  UNION ALL
  SELECT 'LINESTRING(1 0, 1 1)'::geometry the_geom
  UNION ALL
  SELECT 'LINESTRING(0 0,1 0)'::geometry the_geom
  UNION ALL
  SELECT 'LINESTRING(1 1, 0 1)'::geometry) as a;
_images/polygonize.png

De-aggregate Functions

PostGIS also supports the de-aggregation of data from one record into many.

ST_Dump(geometry)
Takes a multi-geometry or geometry collection and returns a set. The return value is not just a geometry object, rather it is a set of geometry_dump objects which consist of a geometry and an array of integers representing order of the resulting geometry within the input geometry, starting at 1. For most purposes, it is sufficient to select out only the geometry, like this:
SELECT ST_AsText((ST_Dump(the_geom)).geom)
  FROM jacksonco_taxlots
  WHERE gid = 90917;
_images/dump.png
ST_DumpRings(geometry)
Similar to ST_Dump, except that it breaks polygons into constituent rings, the closed linestrings that form the outer boundary and delineate the holes of a polygon. Rings are returned as polygons with no holes. The path field identifies exterior rings as 0, with all interior rings identified with their index within the geometry.

Set Operations

The queries in this section will illustrate their results using the following polygons.

_images/pointset01.png

These polygons are generated by the following geometries:

MULTIPOLYGON(((-77 56,-52 18,-88 -27,-10 -13,-11 38,-77 56)))
MULTIPOLYGON(((-49 63,-32 24,-39 -7,-66 -19,-72 -9,-74 31,-49 63)))

First, here is a query example using ST_Union(geometryA, geometryB)

SELECT ST_AsText(ST_Union(
  ST_GeomFromText(
    'MULTIPOLYGON(((-77 56,-52 18,-88 -27,-10 -13,-11 38,-77 56)))'),
  ST_GeomFromText(
    'MULTIPOLYGON(((-49 63,-32 24,-39 -7,-66 -19,-72 -9,-74 31,-49 63)))')));
_images/pointset02a.png _images/pointset02.png

In addition to ST_Union, there are two more point set operators available to PostGIS.

ST_Difference(geometryA, geometryB)
Returns the part of geometry A that does not intersect with geometry B. Note that the order of arguments will affect the results; no portion of the second geometry will be returned.
SELECT ST_AsText(ST_Difference(
  ST_GeomFromText(
    'MULTIPOLYGON(((-77 56,-52 18,-88 -27,-10 -13,-11 38,-77 56)))'),
  ST_GeomFromText(
    'MULTIPOLYGON(((-49 63,-32 24,-39 -7,-66 -19,-72 -9,-74 31,-49 63)))')));
_images/pointset03a.png _images/pointset03.png
ST_SymDifference(geometryA, geometryB)
Returns a geometry that is comprised of the portions of each input geometry that does not intersect with the other. Unlike ST_Difference the order of arguments does not matter. This is akin to a Boolean XOR operation.
SELECT ST_AsText(ST_SymDifference(
  ST_GeomFromText(
    'MULTIPOLYGON(((-77 56,-52 18,-88 -27,-10 -13,-11 38,-77 56)))'),
  ST_GeomFromText(
    'MULTIPOLYGON(((-49 63,-32 24,-39 -7,-66 -19,-72 -9,-74 31,-49 63)))')));
_images/pointset04a.png _images/pointset04.png
ST_Intersection(geometryA, geometryB)
Returns a geometry representing the region of intersection of the two input geometries. Not to be confused with the relational operator ST_Intersects. This function will return the exact opposite of ST_SymDifference.
SELECT ST_AsText(ST_Intersection(
  ST_GeomFromText(
    'MULTIPOLYGON(((-77 56,-52 18,-88 -27,-10 -13,-11 38,-77 56)))'),
  ST_GeomFromText(
    'MULTIPOLYGON(((-49 63,-32 24,-39 -7,-66 -19,-72 -9,-74 31,-49 63)))')));
_images/pointset05a.png _images/pointset05.png

Processing Functions

ST_Buffer(geometry, float distance, [segments])
Returns a geometry that contains all area within the given geometry as well as all area within a given distance of the geometry. The optional segments value defines the number of line segments that will be used to approximate a quarter of a circle. (Curved lines/polygons are approximated with line segments.)
SELECT ST_AsText(ST_Buffer(ST_GeomFromText(
    'LINESTRING(-2 -2,-2 2,2 2,2 4)'), 1));
_images/buffer2.png _images/buffer.png
ST_ConvexHull(geometry)
A convex hull is the minimum convex geometry that encloses all geometries within the input geometry. This is usually used with multi-geometries or geometry collections (such as those generated by ST_Collect), to generate a single enclosing geometry representing an area. The example below draws a line, and the result is a polygon surrounding all the points of that line.
SELECT ST_AsText(ST_ConvexHull(
    'LINESTRING(-2 -2,-2 2,2 2,2 4)'));
_images/convexhull2.png _images/convexhull.png
ST_SnapToGrid
Returns a geometry where each point in the input geometry each “snapped” to a grid, this grid defined by the other arguments in the function. When multiple consecutive points snap to the same grid location they are reduced to a single point, with geometries being reduced to null if there are not enough unique points remaining. This is useful for reducing the precision and consequently size of a geometry, or for forcing geometries to the same precision for reliable comparison. The four variants of ST_SnapToGrid are described below.
  • ST_SnapToGrid(geometry, size)

    Takes arguments for input geometry and grid size (can be an integer or float), and returns a new geometry with all points snapped to this grid size. The units of the geometry and grid size must match. Data for dimensions higher than 2 are not snapped. The origin of the grid is assumed to be (0,0).

  • ST_SnapToGrid(geometry, sizeX, sizeY)

    Allows different grid sizes for both the X and Y dimensions.

  • ST_SnapToGrid(geometry, originX, originY, sizeX, sizeY)

    Allows the use of a different origin other than (0,0), in addition to differing grid sizes in the X and Y dimensions.

  • ST_SnapToGrid(geometry, geometry origin, sizeX, sizeY, sizeZ, sizeM)

    Allows the origin to be defined by a point geometry, in addition to customizing grid size. This is the only form that supports snapping of dimensions higher than 2.

ST_Simplify(geometry, tolerance)
Generates a simpler geometry. This is done by selecting the endpoints of the line and the point farthest from the arc joining the end points, known as the “worst point”. If the worst point is farther from the joining arc than the tolerance value, it is kept and the substrings between the start and worst points, and worst and end points are considered recursively. In this way, a linestring comprised of enough of the original points can be produced that is “similar” to the original line. The units of tolerance are the same as the projection of the input geometry.
SELECT ST_AsText(geom) AS original,
  ST_AsText(ST_Simplify(geom, 3)) AS "3",
  ST_AsText(ST_Simplify(geom, 2.9)) AS "2.9"
FROM (SELECT ST_GeomFromText('LINESTRING(0 0,3 2.5,0  5)') AS geom) AS a;
_images/simplify.png

Notice how when the tolerance is set to 3, the middle point is removed, but when the tolerance is set to 2.9, that point is preserved.

ST_SimplifyPreserveTopology(geometry, float)
Performs essentially the same algorithm as ST_Simplify, but checks to ensure that the basic topology of the resulting geometry does not change. This prevents the creation of invalid geometries (such as self-intersections), which can sometimes occur when using ST_Simplify.
SELECT ST_AsText(geom) original,
  ST_AsText(ST_Simplify(geom, 2)) Simplify,
  ST_AsText(ST_SimplifyPreserveTopology(geom, 2)) PreserveTopology
FROM (
    SELECT ST_GeomFromText('POLYGON((0 0,0 1,1 1,1 0,0 0))') geom) AS a;
_images/simplifypreserve.png

Notice, in the above example, how each function returns a different result, but only ST_SimplifyPreserveTopology is a valid geometry.