Monday, April 9, 2012

Faster Voronoi Diagrams in PostGIS


A couple years back, I posted a quick and dirty function for generating voronoi diagrams in PostGIS. As one of the commenters pointed out, this is not the world's most efficient algorithm. So I started looking into implementing a better one myself, but once again, being a virtuous programmer, I borrowed someone else's code, and modified it to work with PostGIS.

This one's in Python, not PL/pgSQL, so plan appropriately.  It is called similarly to my previous version:

select * from voronoi('table_name', 'geom_column') as (id integer, the_geom geometry);

You no longer need to specify an id column, but the downside is that there's no longer a relationship between the id of the input point, and the id of the output polygon. On the other hand, it's much, much, much more efficient. At least a couple orders of magnitude. In testing, my 40,000 points of input only took 20 seconds to calculate, whereas previously it took... well, I don't know, because I didn't want to wait that long.

Code is here: http://db.tt/xen7XHwC  No guarantees it actually does anything correctly, but my limited testing seems to work. Enjoy.

UPDATE: the code linked above only works with Python 2.x. If you look through the comments, you'll see there's a link to a version of my code for 3.x, but it still has a couple of bugs which I have not had time to investigate.

UPDATE 2: See the comments below about the QGIS code also producing incorrect output with this function. I'll try to investigate in my copious free time.

UPDATE 3: Matthias has published a version that corrects the problems in my initial code. His dropbox link is here: http://db.tt/gwEGpu0Y.  I've also published it as GIST, here: https://gist.github.com/darrell/6056046


39 comments:

  1. ok can I directly use this in a normal postgres / postgis server or do I need to install something like python..?

    ReplyDelete
    Replies
    1. Yes, you will need to load PL/Python as a language in your database. See: http://www.postgresql.org/docs/8.4/static/plpython.html

      Delete
  2. ok. this is a small howto for installing plpython in a postgres 9.1 database on a postgres 9.1 server installed on ubuntu natty from the package by enterprisedb:

    Get the plphyton2.so file from

    http://packages.debian.org/de/squeeze-backports/i386/postgresql-plpython-9.1/download

    copy it to your Postgresql/9.1/lib/postgresql dir

    chown it to daemon:root
    chmod it to 755

    run "create procedural language 'plphytonu'" on your db

    ReplyDelete
    Replies
    1. On Ubuntu 12.10 (quantal) for postrgres 9.1 (the last version with support for postGIS 2.0 as of the time of this writing) you can get it installing the ubuntugis repository[1] and the via apt:
      sudo apt-get install postgresql-plpython-9.1
      and then[2] in your database you can:

      CREATE EXTENSION plpythonu;

      HTH,

      Cristian
      [1] https://wiki.ubuntu.com/UbuntuGIS
      [2] http://www.postgresql.org/docs/9.1/static/plpython.html

      Delete
  3. Adding the voronoi function and plpython to my database both seems to work, but i keep on getting these messages whatever I'm trying:

    PL/Python functions cannot return type record

    and

    a column definition list is required for functions returning "record"

    Any suggestions?

    Thanks in advance,
    Raymond

    ReplyDelete
    Replies
    1. Make sure you're calling like I have defined above. You need to give it a column definition list for the record type, thence the "as (id integer, the_geom geometry)" section above.

      Delete
    2. This comment has been removed by the author.

      Delete
  4. In glancing at your code, I'm glad to see I'm not the only person who accidentally typos "pythong" :) (er, unless there's a legit meaning to that which I'm not aware of)

    ReplyDelete
    Replies
    1. Hmm.. no, maybe it's just an indication of what I'd rather be thinking about instead of coding..

      Delete
  5. I believe I have everything installed right, maybe I can't use the table that I'm running this against? IT's just a table of id and geom (points)

    When I run this code I get the following error:

    select * from voronoi('s_test', 'geom') as (id integer, geom geometry);

    error fetching next item from iterator
    DETAIL: spiexceptions.InternalError: Argument to X() must be a point
    CONTEXT: Traceback (most recent call last):
    PL/Python function "voronoi"

    ReplyDelete
    Replies
    1. Hmm... do you have fields with a NULL geometry, perhaps? Or maybe something's actually a MULTIPOINT? I'm not sure I tested with multipoint data...

      Delete
  6. Hi I am trying to run the voronoi function create script with plypython3u installed in Postgres9.1. on Windows 7 64bit (create language plpython3u language ran sucessfully after having installed Activestate Python in the first place).

    But now when I run the create function sql it gives me the error message:

    ERROR: could not compile PL/Python function "voronoi"
    DETAIL: SyntaxError: invalid syntax (, line 70)

    It worked on plpython2u on linux though...

    Any Hints?

    ReplyDelete
    Replies
    1. Ahh, it's because the print command changed in python3. See: http://docs.python.org/release/3.0.1/whatsnew/3.0.html

      It should be pretty easy to fix this with the 2to3 command, which updates the syntax from python2 to python3. I ran it as a test, and the changes look pretty minor, but I'd want to do more testing than I can do right away, since it made some changes to the iterator.

      A quick and probably sufficient fix is to go and replace every instance of 'print X % Y' to 'print(X % Y)'.

      If you want to just try the updated (but untested) version, you can grab it here: http://db.tt/mxanqeEn

      Let me know how it goes!

      Delete
    2. Ok now it compiles fine in postgres
      but when running it with a point geotable, we got this message:

      ERROR: error fetching next item from iterator
      DETAIL: TypeError: unorderable types: Site() < Site()
      KONTEXT: Traceback (most recent call last):
      PL/Python function "voronoi", line 648, in
      self.__sites.sort()
      PL/Python function "voronoi"

      We are trying to debug , but we seem not to be able to get the messages displayed in PGADmin (neither in Dbvisualizer...)


      Could you be so kind and provide some guidance?

      Thanks a lot!

      Delete
  7. ok we have debugged now using the hints from http://python3porting.com/problems.html


    The function now compiles and runs without error message, but always returns an empty result (no rows)

    Code is here:

    https://dl.dropbox.com/u/316947/voronoip3.sql

    ReplyDelete
  8. Further steps towerd solution (not achieved yet, any help welcome):

    The problem is the move in Python3 from __cmp__ method towards __lt__ , __gt___ and so on. This affects also the sorting (look at self.__sites.sort() ).

    I have now replaced occurrences of cmp(a, b) with (a > b) - (a < b) and __cmp__ by __lt__, which compiles but when I run the function it gives me the error message:

    ERROR: error fetching next item from iterator
    DETAIL: spiexceptions.InternalError: lwline_deserialize: attempt to deserialize a line which is really a GeometryCollection
    CONTEXT: Traceback (most recent call last):
    PL/Python function "voronoi_p3cmp"

    CODE here: https://dl.dropbox.com/u/316947/voronoip3cmp.sql


    Any help is very very welcome ;-) !

    ReplyDelete
    Replies
    1. Hmm... that *sounds* like you have multipoints or other non-point data in there, but I think I may have run into this before.

      I need to rebuild postgres with python3, so it'll be a bit before I can do some more testing, but I'll try to take a look at it today.

      Delete
    2. OH, OK, so one big problem is that you can't just rename __cmp__ to __lt__ - you need to define the full set of functions.

      For example, see the replacements below.

      There's something else going on though, and I'm not sure what. I think it might be related to PostGIS 2, but I'll have to do some more digging. Most of time I'm getting good results, but not always.


      ====
      def __lt__(self,other):
      if self.y < other.y:
      return True
      elif self.x < other.x:
      return True
      else:
      return False

      def __gt__(self,other):
      if self.y > other.y:
      return True
      elif self.x > other.x:
      return True
      else:
      return False

      def __eq__(self,other):
      if self.y == other.y and self.x == other.x:
      return True
      else:
      return False

      def __ne__(self,other):
      return not __eq__(self,other)
      ====

      def __lt__(self,other):
      if self.ystar < other.ystar:
      return True
      elif self.vertex.x < other.vertex.x:
      return True
      else:
      return False

      def __gt__(self,other):
      if self.ystar > other.ystar:
      return True
      elif self.vertex.x > other.vertex.x:
      return True
      else:
      return False

      def __eq__(self,other):
      if self.ystar == other.ystar and self.vertex.x == other.vertex.x:
      return True
      else:
      return False

      def __ne__(self,other):
      return not __eq__(self,other)

      Delete
  9. just tried, no multipoints in my table...even set the constraint of geometrytype explicitly to accept only 'POINT'....no success.

    But thanks for helping us anyway...together we'll do it ..;-)

    ReplyDelete
  10. Oh yeah, I thought something like this too, but my python skills are very very limited...so I didnt touch that.

    Just to inform you, we are also using PostGIS 2.0 here..

    ReplyDelete
  11. ok I implemented the code from your 2.05 PM post, still not working.

    What did you do about the occurences of

    cmp(something,somethingelse)

    in the code?

    ReplyDelete
    Replies
    1. Here's the version I was working with, so you can see the cmp() changes. It fixes the sorting issues, but there seems to be a different hiccup – probably involving the iterator, that I need to dig into some more. http://db.tt/4nrdorXm

      Delete
  12. This comment has been removed by the author.

    ReplyDelete
  13. hmm I get thiserror message now

    Self-intersection at or near point ...
    CONTEXT: SQL-Anweisung »
    SELECT
    CASE WHEN ST_IsValid(a.the_geom) AND ST_Overlaps(ST_Multi('...'::geometry), a.the_geom)
    THEN
    ST_Multi(ST_Intersection('...'::geometry, a.the_geom))
    ELSE
    a.the_geom
    END as the_geom
    FROM (SELECT st_multi(ST_Polygon($1,4326)) as the_geom) a
    «
    PL/Python function "voronoi_p3cmp"
    HINWEIS: Self-intersection at or near point ...
    KONTEXT: SQL-Anweisung »
    SELECT
    CASE WHEN ST_IsValid(a.the_geom) AND ST_Overlaps(ST_Multi('...'::geometry), a.the_geom)
    THEN
    ST_Multi(ST_Intersection('...'::geometry, a.the_geom))
    ELSE
    a.the_geom
    END as the_geom
    FROM (SELECT st_multi(ST_Polygon($1,4326)) as the_geom) a
    «
    PL/Python function "voronoi_p3cmp"

    or also this

    spiexceptions.InternalError: Shell is not a line

    with a smaller point set

    ReplyDelete
  14. Hi, I was looking for a possibility to write you a PM, but didnt find it.. As we are very dependent on running the faster python voronoi function on python3u, we would like to politely ask you if you think the transfer to python3 can be realized and if you are still working on it (Unfortenately we are kind of tuck due to limited Python knowledge here). Kind greetings and many thanks for your work until now...

    ReplyDelete
  15. hey can you contact me at raliski@gmail.com...its urgent here to get voronoi working on Postgis 2.0 / Python3 combination..thanks!

    ReplyDelete
  16. hi also under plpython2u, I get some weird notices, and also some very weird voronois geometries far outside the extents of the poitn cloud used for the voronois:

    NOTICE: Self-intersection at or near point -x y
    CONTEXT: SQL statement "
    SELECT
    CASE WHEN ST_IsValid(a.the_geom) AND ST_Overlaps(ST_Multi('xxx'::geometry), a.the_geom)
    THEN
    ST_Multi(ST_Intersection('xxx'::geometry, a.the_geom))
    ELSE
    a.the_geom
    END as the_geom
    FROM (SELECT st_multi(ST_Polygon($1,4326)) as the_geom) a
    "

    could this have to do with the voronoi polygons being created as multipolygons inspite of the fact thata voronoi can only be a single polygon at least when it is created for a set single points?

    ReplyDelete
  17. This comment has been removed by the author.

    ReplyDelete
  18. ok on my above comment I can say that changing to single polygons doesnt make a difference.

    As the voronoi function is borrowed from QGIS code I want to add the following: When I take a point set and run the voronoi creation on QGIS (1.8) , the extent of the resulting shapefile is always rectangular, while the extent of the geometries created by the plpythonu function is not rectangular, in spite of the function doing a clipping to the extent (almost at the end of the code...)

    Isnt there a Python developer with some spare time out there who could tackle the problem? I would be very veryvery thankful!

    Greetings!

    ReplyDelete
  19. After trying the plpythonu Voronoi function and producing artefacts and overlapping and self-overlapping voronoi polygons all the time (same happens if you use the ftools voronoi tool in QGIS, as the plpython code is derived from this), I tried out the PL/R voronoi function from http://punkish.org/Voronoi-Diagrams-In-PostGIS and found out that after some small adaptions to postgis 2 (make srid()-->st_srid() creating a intersects() function (otherwise you get the error "function st_intersects(geometry, text) is not unique"), the function produced a flawless set of voronoi polygons for 1000 points in about 13 seconds on a Core2duo CPU. I can only recommend to directly go for the PL/R function for voronois on large point sets. Only tested on Linux though ;-)

    ReplyDelete
    Replies
    1. I didn't know that the QGIS code would produce errors, too. That's good to know, as I haven't had time to dive into a couple problems people have reported. (I did say it was quick and dirty, after all.) I'll have to run some comparisons between the output from QGIS and that from the PostGIS function to satisfy myself that it's not my fault. :) Then maybe I can dig into it and find out what's going on.

      Thanks for the heads up.

      Delete
    2. Hi, I also changed intersects() to st_intersects() and got the "function is not unique" error. I don't understand your comment where you said that you created an intersects() function [to get around this problem]. Do you mean your own version of intersects()? Using what? Could you explain what you are doing here. Thank you.

      Delete
    3. hey you have to create another function called intersects(). i.e. you need two functions with the same content: st_intersects() and intersects().Then it is working.Not very logical I know, but it does the trick ;-)

      Delete
  20. This comment has been removed by the author.

    ReplyDelete
  21. Hi,

    thanks for your work, it is very helpful for what I'm trying to achieve!

    However, I cannot confirm the finding that the QGIS fTools plugin produces the same incorrect output as this version. In my use case, I run the voronoi algorithm on a rather dense point data set (approx. 1000 points closely located to each other).
    While the QGIS plugin yields flawless and clean results, this plpython method lacks about 40% of the voronoi polygons after execution. The log shows the already mentioned Self-intersection errors.

    Did you meanwhile had the time to look on the details why this erroneous output could be produced?
    Any help is appreciated!

    Thanks a lot and best regards,
    Matthias

    ReplyDelete
    Replies
    1. Hi again,

      I managed to have a closer look at the code and found that somehow all the PostGIS based clipping and polygon creation seems to be too unstable. I wasn't able to produce 100% clean output using the PostGIS voronoi post-processing.

      So what I did is just additionally incorporated the "clip_voronoi" code which is also used in the QGIS fTools plugin for correctly clipping the lines and voila - it worked :-)

      Here is the modified code: http://db.tt/gwEGpu0Y
      I tested it with various input data and it produced correct and clean output in all cases.

      Hope it helps! Not much of this is my work though so credits go to both GeoGeek and the fTools authors, whose work I only was able to put together correctly.

      Cheers,
      Matthias

      Delete
    2. Thanks, Matthias, that's fabulous work. Thanks a ton. I'll post an update to the blog.

      Delete
  22. I'm a bit late into this, but I'm running Matthias's improved version and it looks like the identity of rows gets lost?

    It does a:
    SELECT DISTINCT st_x(%s) as x, st_y(%s) as y FROM %s

    and then numbers everything sequentially when generating a result.

    But that SQL query has no ORDER BY, so the results just come back at the whim of the database (per the Postgresql manual - there is probably an implementation dependent pattern).

    I think it really needs to take an id column name that comes back in the result?

    ReplyDelete
  23. Hi, I'm a little bit late also...

    I try to generate voronoi polygons with plpython3 and it's not working...

    I've taken the last code by Matthias and I'm trying to update it for plpython3 but I'm a newbie on python so it's quite hard.

    I tried to use the sorting implementation as in this code :
    http://db.tt/4nrdorXm
    But no success.

    The polygons obtained are not correct and I don't have any idea for the moment.

    Btw, I think the last comment by Rich is important too.

    Since last year, have you solved the problem with plpython3 ?

    Thanks for your attention.
    Deneb.

    ReplyDelete