Friday, June 3, 2011

Persisted Spatial Calculation Columns

This blog post discusses how I computed the area values of the features in my spatial tables to improve query/reporting performance during my investigation of SQL Server spatial indexes.

As part of my recent spatial index investigation, I performed some analysis on a number of data sets to get a feeling for their spatial density to improve decision making about settings for my spatial indexes.


This analysis involved investigation of relative area of features compared to the entire data set, and data extents, which required summarisation of area values of varied selections of features.


Relatively speaking, the act of performing a calculation of the area of a polygon feature using the STArea method is computationally expensive, so aggregating the values returned from STArea across all data in a table, or performing a selection using the STArea function as a criteria in a filter (e.g. select * from cadastre where geom.STArea() < 2000) can impact performance dramatically.  Considering that some of my data sets contained millions of rows of data, this wasn't really desirable.


Persisted Computed Columns


One of the features introduced in SQL Server 2005 was Persisted Computed Columns.  These columns allow you to define a computed column, but rather than calculating the value of the column when queries are executed, the computed column value is saved in the database.  The computed column value is updated when the columns that the value is derived from are updated.


The computed column will then be much faster to perform reporting and queries over.


Here is an example of a table that uses a persisted computed column to store the area (area column) of each polygon in the table, and updates whenever the geometry (geom column) is updated.


create table cadastre(
    geom geometry not null,
    area as (geom.STArea()) persisted not null
)

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.