Spatial Hexagon Binning in PostGIS

Hexagon binning is an important tool for visualizing spatial relationships, but it’s a pain to implement. Current solutions involve partial implementations for PostGIS and workarounds based on generating hexagons in QGIS with the MMQGIS plugin. In this brief post, I provide a SQL function for quickly generating hexagon polygon layers in PostGIS of any [...]

Parsing XML to a Data Frame: Recovering the Worldwide Incidents Tracking System (WITS)

The Worldwide Incidents Tracking System (WITS) was a database of global terrorism events compiled by the National Counterterrorism Center (NCTC) until 2012. At the end it contained 68,939 records, each with a short synopsis of the event, and it is thus still of interest to conflict scholars. Unfortunately, it’s now defunct and getting a copy can be [...]


In a follow-up to my post about generating lots of real-world random data in R, this brief post shows how to generate lots of realistic functions. By sampling from the PDF and CDF of real-world data, you can quickly generate all manner of continuous and step functions for further experimentation.
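As a rough illustration of the idea (in Python rather than R, and not the code from the post itself), the sketch below turns one sample into a step function (its empirical CDF) and a continuous function (a piecewise-linear curve through histogram densities). The helper names `empirical_cdf` and `histogram_pdf` are hypothetical, and the Gaussian draws are only a placeholder for a real-world data set.

```python
import bisect
import random


def empirical_cdf(sample):
    """Return a step function F(x) = share of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        return bisect.bisect_right(ordered, x) / n

    return cdf


def histogram_pdf(sample, bins=10):
    """Return a continuous function: linear interpolation between the
    densities of equal-width histogram bins (assumes max > min)."""
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in sample:
        # The maximum value falls on the right edge, so clamp it into the last bin.
        i = min(int((v - lo) / width), bins - 1)
        counts[i] += 1
    n = len(sample)
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    dens = [c / (n * width) for c in counts]

    def pdf(x):
        # Outside the range of bin centers, hold the nearest density constant.
        if x <= centers[0]:
            return dens[0]
        if x >= centers[-1]:
            return dens[-1]
        j = bisect.bisect_right(centers, x)
        x0, x1 = centers[j - 1], centers[j]
        t = (x - x0) / (x1 - x0)
        return dens[j - 1] + t * (dens[j] - dens[j - 1])

    return pdf


# Placeholder data standing in for a real-world variable.
rng = random.Random(0)
sample = [rng.gauss(50, 10) for _ in range(500)]
step_fn = empirical_cdf(sample)
smooth_fn = histogram_pdf(sample, bins=20)
```

Repeating this over many different variables would give a collection of step and continuous functions with realistic shapes, which is the spirit of the approach described above.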


Quickly Generating Lots of Realistic Random Data in R

In this brief post, I show a trick for quickly assembling arbitrarily large samples of real-world data by sampling from all of the data sets included in R packages.


Parsing XML files to a flat dataframe

Markup languages like XML are really handy for structured data that can have multiple values for the same attribute, or attributes that are nested within other attributes in a hierarchical structure. For simple analysis, however, we just want a rectangular data frame with columns and rows, and we need to flatten all that structure. The following [...]
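To make the flattening step concrete, here is a minimal Python sketch using only the standard library's `xml.etree.ElementTree`. It is not the code from the post: the functions `flatten` and `xml_to_rows`, the dotted column names, the `;` separator for repeated values, and the sample document are all assumptions chosen for illustration. XML attributes are ignored to keep it short.

```python
import xml.etree.ElementTree as ET


def flatten(element, prefix=""):
    """Flatten one record element into a dict of column name -> value.

    Nested tags become dotted column names; repeated tags are joined with ';'.
    """
    row = {}
    for child in element:
        name = f"{prefix}{child.tag}"
        if len(child):  # the child has its own children, so recurse
            for key, value in flatten(child, prefix=name + ".").items():
                row[key] = f"{row[key]};{value}" if key in row else value
        else:
            text = (child.text or "").strip()
            row[name] = f"{row[name]};{text}" if name in row else text
    return row


def xml_to_rows(xml_text, record_tag):
    """Return (columns, rows) as a rectangular table, one row per record."""
    root = ET.fromstring(xml_text)
    records = [flatten(rec) for rec in root.iter(record_tag)]
    columns = sorted({col for rec in records for col in rec})
    rows = [[rec.get(col, "") for col in columns] for rec in records]
    return columns, rows


# Hypothetical sample document with nesting and a repeated tag.
SAMPLE = """
<incidents>
  <incident>
    <id>1</id>
    <location><country>A</country><city>X</city></location>
    <weapon>bomb</weapon>
    <weapon>gun</weapon>
  </incident>
  <incident>
    <id>2</id>
    <location><country>B</country></location>
  </incident>
</incidents>
"""

columns, rows = xml_to_rows(SAMPLE, "incident")
```

Records that lack a field get an empty string in that column, so every row has the same width and the result can be loaded directly into a data frame.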

Fast Spatial Joins in Python with a Spatial Index

I often have to execute spatial joins between points and polygons, say, of bombing events and the boundaries of the district they took place in. Quantum GIS uses fTools to execute these kinds of spatial joins, but it failed on a relatively modest join of 40k points and 9k boundaries. I could do the join [...]
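The general pattern behind an indexed spatial join can be sketched in plain Python. This is an illustration only, not the approach from the post: it swaps in a simple uniform grid as the spatial index instead of a library-backed index such as an R-tree, and the function names and the `cell` size parameter are hypothetical. The index narrows each point down to a few candidate polygons, so the exact point-in-polygon test runs only on those.

```python
from collections import defaultdict


def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is a list of (x, y) vertices (need not be closed)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # the edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_grid_index(polygons, cell):
    """Map each grid cell to the ids of polygons whose bounding box overlaps it."""
    index = defaultdict(list)
    for pid, ring in polygons.items():
        xs = [p[0] for p in ring]
        ys = [p[1] for p in ring]
        for cx in range(int(min(xs) // cell), int(max(xs) // cell) + 1):
            for cy in range(int(min(ys) // cell), int(max(ys) // cell) + 1):
                index[(cx, cy)].append(pid)
    return index


def spatial_join(points, polygons, cell=1.0):
    """Return, for each point, the id of the first polygon containing it (or None)."""
    index = build_grid_index(polygons, cell)
    matches = []
    for x, y in points:
        found = None
        # Only polygons registered in the point's grid cell are tested exactly.
        for pid in index.get((int(x // cell), int(y // cell)), ()):
            if point_in_polygon(x, y, polygons[pid]):
                found = pid
                break
        matches.append(found)
    return matches


# Hypothetical districts (two adjacent squares) and event locations.
districts = {
    "a": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "b": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
events = [(1, 1), (3, 1), (5, 5)]
joined = spatial_join(events, districts, cell=1.0)
```

The cell size trades memory for selectivity: smaller cells mean fewer candidate polygons per point but more index entries per polygon.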

R Speed Gains

For those looking to get more speed out of their R code, check out this post on using C++ directly in R through the Rcpp package, and on compiling R code through the new compiler package, which is coming out in R 2.13.0.

Python 64bit on Windows

Getting 64-bit Python up and running with 64-bit packages on Windows is a bit of a pain.

Install WinPython 64bit.

Register the installation with Windows using “WinPython Control Panel.exe” by going to “Advanced” -> “Register distribution”.

Now find a 64bit compilation of the package you want. Christoph Gohlke provides a major public good [...]

Python 64 Bit on Windows Part 2, Building Packages

Building packages for 64-bit Python is straightforward once you figure out the main steps.

First, download and set up the Microsoft Windows SDK for Windows 7 and .NET Framework 3.5 SP1 as explained here.

You have to execute the SDK’s command shell, which will have a shortcut in Start -> All Programs -> [...]