Spatial joins are the bread-and-butter of spatial databases. This section will provide an initial introduction to the relation operators available and some common usage examples.
A spatial join is an operation used to combine two or more datasets with respect to a spatial relationship.
To retrieve a list of schools (jacksonco_schools) within the Medford city limits (medford_citylimits), you would use the following query:
SELECT name
FROM jacksonco_schools, medford_citylimits
WHERE ST_Within(jacksonco_schools.the_geom, medford_citylimits.the_geom);
This query checks each school to determine if it is within the city limits polygon. The ST_Within function has a fairly strict definition that requires each school to be completely within the city limits polygon. This means that schools that fall directly on the boundary are not technically within the polygon. This gets more interesting when comparing lines with polygons, as the end of the line landing on the boundary, or a section within the line laying tangent to the boundary are sufficient for the comparison to return false. Note that the ST_Within function includes the bounding box overlap operator (&&)
Next, let’s try to calculate the total length of road in the city of Medford (using ST_Length). We can use the ST_Within operator again, but with the requirement to be completely contained this will under-report the length of road. Let’s over-report instead by using ST_Intersects instead.
SELECT SUM(ST_Length(jacksonco_streets.the_geom))
FROM jacksonco_streets, medford_citylimits
WHERE ST_Intersects(jacksonco_streets.the_geom,medford_citylimits.the_geom);
Now lets bring some spatial data into the mix. The race table has a census-derived breakdown of population by race for census tracts stored in the tracts table. Let’s determine the expected ratio of races for each school in Jackson County.
SELECT jacksonco_schools.name, white_pop_1race * 1.0 / total_pop AS white_pop,
black_pop_1race * 1.0 / total_pop AS black_pop,
aindian_1race * 1.0 / total_pop AS indian_pop,
asian_1race * 1.0 / total_pop AS asian_popp,
hawaiian_1race * 1.0 / total_pop AS hawaiian_pop
FROM tracts, race, jacksonco_schools
WHERE tracts.ctidfp00 = race.geography_id2
AND ST_Intersects(tracts.the_geom, jacksonco_schools.the_geom);
Now we’re beginning to see the power and flexibility relational operators provide.
Note
Because the population numbers are stored as integers we need to force them to floating point numerics in order to calculate a decimal result. This can be done in a number of ways, including casting the columns using CAST white_pop_1race AS DOUBLE PRECISION). I’ve taken a shortcut here and simply included a decimal value (1.0) in the calculations.
Below are short descriptions of a number of relational operators. Each of these functions will return true or false. Later sections will make greater use of these functions. For a more complete description of each please refer to the PostGIS documentation [1].
Footnotes
| [1] | http://postgis.refractions.net/documentation/manual-1.4/ |