OpenGeo

Introduction to PostGIS

Table Of Contents

Previous topic

Section 14: Visualisation

Next topic

Section 16: Advanced Performance

Section 15: Validity and Equality

Validity

PostGIS conforms to the Simple Features for SQL (SFSQL) definition for valid features and uses these details in order to optimize a variety of functions. Despite this restriction, PostGIS will attempt to load any feature it is given whether it is valid or not. This sometimes results in undefined behavior when using the analytical functionality of PostGIS. If your query has gone wrong, the first step to understanding why is to determine if your geometries are valid.

The following query uses the ST_IsValid function to check the geometries of a given table (jacksonco_taxlots).

SELECT count(*), ST_IsValid(the_geom)
  FROM jacksonco_taxlots
  GROUP BY ST_IsValid;
_images/start10-11.png

In this case there is only one invalid geometry; the Notice provides a short summary of which topological requirement has been violated and will allow you to investigate and correct the problem. In cases with hundreds of violations, it is not so easy to track them all down.

While PostGIS provides a variety of functions for the manipulation of geometries, there are powerful tools to correct invalid geometries efficiently. One trick that is useful for many topological problems is to perform a buffer operation using a buffer distance of 0, as shown below.

UPDATE jacksonco_taxlots
  SET the_geom = ST_Multi(ST_Buffer(the_geom, 0));

SELECT count(*), ST_IsValid(the_geom)
  FROM jacksonco_taxlots
  GROUP BY ST_IsValid;
_images/start12.png

In the Update query above, we are doing more than simply buffering. Looking at the definition of the geometry we can see that it’s described as a “Multipolygon”. The ST_Buffer operation will generate the most topologically simple geometry it can. In this case one of the Multipolygons contains only one Polygon and as such the ST_Buffer operation returns a Polygon object. Since this violates the geometry type constraint, we wrap the buffer function in ST_Multi. This ensures that any geometry is returned as its multi-type; “Point”, “Linestring”, and “Polygon” become “Multipoint”, “Multilinestring”, and “Multipolygon” respectively.

Equality

Determining equality when dealing with geometries can be tricky. PostGIS supports three different functions that can be used to determine different levels of equality, though for clarity we will use the definitions below. To illustrate these functions, we will use the following polygons.

_images/polygon-table.png

These polygons are loaded using the following commands.

CREATE TABLE polygons (name varchar, poly geometry);

INSERT INTO polygons VALUES
  ('Polygon 1', 'POLYGON((-1 1.732,1 1.732,2 0,1 -1.732,
      -1 -1.732,-2 0,-1 1.732))'),
  ('Polygon 2', 'POLYGON((-1 1.732,-2 0,-1 -1.732,1 -1.732,
      2 0,1 1.732,-1 1.732))'),
  ('Polygon 3', 'POLYGON((1 -1.732,2 0,1 1.732,-1 1.732,
      -2 0,-1 -1.732,1 -1.732))'),
  ('Polygon 4', 'POLYGON((-1 1.732,0 1.732, 1 1.732,1.5 0.866,
      2 0,1.5 -0.866,1 -1.732,0 -1.732,-1 -1.732,-1.5 -0.866,
      -2 0,-1.5 0.866,-1 1.732))'),
  ('Polygon 5', 'POLYGON((-2 -1.732,2 -1.732,2 1.732,
      -2 1.732,-2 -1.732))');
_images/start13.png

Exactly Equal

Exact equality is determined by comparing two geometries, vertex by vertex, in order, to ensure they are identical in position. The following examples show how this method can be limited in its effectiveness.

SELECT a.name, b.name, CASE WHEN a.poly ~= b.poly
    THEN 'Exactly Equal' ELSE 'Not Exactly Equal' end
  FROM polygons as a, polygons as b;
_images/start14.png

In this example, the polygons are only equal to themselves, not to other seemingly equivalent polygons (as in the case of Polygons 1 through 3). In the case of Polygons 1, 2, and 3, the vertices are in identical positions but are defined in differing orders. Polygon 4 has colinear (and thus redundant) vertices on the hexagon edges causing inequality with Polygon 1.

Spatially Equal

As we saw above, exact equality does not take into account the spatial nature of the geometries. There is an function, aptly named ST_Equals, available to test the spatial equality or equivalence of geometries.

SELECT a.name, b.name, CASE WHEN ST_Equals(a.poly, b.poly)
    THEN 'Spatially Equal' ELSE 'Not Equal' end
  FROM polygons as a, polygons as b;
_images/start15.png

These results are more in line with our intuitive understanding of equality. Polygons 1 through 4 are considered equal, since they enclose the same area. Note that neither the direction of the polygon is drawn, the starting point for defining the polygon, nor the number of points used are important here. What is important is that the polygons contain the same space.

Equal Bounds

Exact equality requires, in the worst case, comparison of each and every vertex in the geometry to determine equality. This can be slow, and may not be appropriate for comparing huge numbers of geometries. To allow for speedier comparison, the equal bounds operator, =, is provided. This operates only on the bounding box (rectangle), ensuring that the geometries occupy the same two dimensional extent, but not necessarily the same space.

SELECT a.name, b.name, CASE WHEN a.poly = b.poly
    THEN 'Equal Bounds' ELSE 'Non-equal Bounds' end
  FROM polygons as a, polygons as b;
_images/start17.png

As you can see, all of our spatially equal geometries also have equal bounds. Unfortunately, Polygon 5 is also returned as equal under this test, because it shares the smae bounding box as the other geometries. Why is this useful, then? Although this will be covered in detail later, the shot answer is that this enables the use of spatial indexing that can quickly reduce huge comparison sets into more manageable blocks when joining or filtering data.