**diff options**

author | Casey Robinson <kc@rampantmonkey.com> | 2014-01-19 23:09:59 -0500 |
---|---|---|

committer | Casey Robinson <kc@rampantmonkey.com> | 2014-01-19 23:30:14 -0500 |

commit | 686c7da1c11691e5769c91310dda48c4eee5ced8 (patch) | |

tree | 2c1af64ce4414d2dd155c767bea69f448f66bf88 | |

parent | fece3d5c81aea1b45b34b1d4dcc0b7b11e14f3b8 (diff) | |

download | neighborhoods-686c7da1c11691e5769c91310dda48c4eee5ced8.tar.gz neighborhoods-686c7da1c11691e5769c91310dda48c4eee5ced8.tar.bz2 neighborhoods-686c7da1c11691e5769c91310dda48c4eee5ced8.zip |

fill in some thoughts about the project in the intro

-rw-r--r-- | doc/intro.md | 53 | ||||

-rw-r--r-- | doc/ray_casting_example.png | bin | 0 -> 40482 bytes |

2 files changed, 52 insertions, 1 deletions

diff --git a/doc/intro.md b/doc/intro.md index 4be4bc5..fb7e56a 100644 --- a/doc/intro.md +++ b/doc/intro.md @@ -1,3 +1,54 @@ # Introduction to neighborhoods -TODO: write [great documentation](http://jacobian.org/writing/great-documentation/what-to-write/) +The goal of neighborhoods is to determine the neighborhood that includes a given address. +The address is given as a pair of GPS coordinates representing the latitude and longitude of the desired address. +Below is a snapshot of the neighborhoods of Grand Rapids, Michigan to provide context. + +![Neighborhoods of Grand Rapids, MI](neighborhood_map.png) + +## Approach + +The core algorithm used by neighborhoods is the algorithm for determining if the point inside a neighborhood or not. +Start with the point of interest. +Draw a [semi-infinite](http://en.wikipedia.org/wiki/Semi-infinite) line (line extending to infinity in only one direction) from the point of interest to infinity along the x axis. +Count the number of times this line intersects a line segment which is part of the border. +If the total is even, the point is outside. +If odd, the point is inside. +This +Proof of correctness is left as an exercise to the reader. + +This technique is known as [__ray casting__](http://en.wikipedia.org/wiki/Ray_casting). +The diagram below demonstrates the ray casting method. + +![ray casting example](ray_casting_example.png) + +Ray casting for point location requires a check of each line segment in the border. +While we cannot improve the [asymptotic complexity](http://en.wikipedia.org/wiki/Asymptotic_computational_complexity) of the approach, we can reduce the runtime by avoiding unnecessary computation. +Before doing any geometry we can perform a simple check, only consider line segments which have a y range covering the point of interest. +This is only two comparisons, while the geometry requires significantly more work - one comparison, one addition, three subtractions, two multiplications, and one division. + +Once we know how to determine the relative position of points and polygons, we need to decide which neighborhoods to check. +The most obvious case is to perform a linear search through all neighborhoods. +The linear search will on average require checking `n/2` neighborhoods, where `n` is the total number of neighborhoods. +Since it forms the basis of the other approaches, this is the method currently implemented in `neighborhoods`. +In the discussion section, I mention a method for reducing this number with preprocessing. + +## Discussion + +There are many options for improving the performance of the approach implemented in neighborhoods. +Each method requires preprocessing and more storage than the linear search with ray casting approach, but can improve performance when computing results for a large number of queries. + +The next optimization I would try is to divide the search space into a grid. +The goal being to reduce the number of neighborhoods that require searching while still requiring constant time lookup of such a list. + +The idea is to lay a fixed size grid on top of the map. +Next, compute the bounding box for each neighborhood (linear time but can be done while reading in the data eliminating the effective impact of this step). +Record the grid cells which overlap with the bounding box of each neighborhood. +This can be done with 4 comparisons per neighborhood, or `4n` operations. +The storage required by the resulting data structure will vary based on grid size as will the performance - smaller grids mean fewer neighborhoods overlap with each but more cells must be stored. +Thus, it would be necessary to measure the impact of this method against various grid sizes to determine the best value. + +The grid can be computed once and used to reduce the number of neighborhoods necessary to check. +Lookup will require two modulo computations, which requires `O(1)` time. + +Other methods worth exploring (after obtaining a greater understanding of Clojure) include slab decomposition, triangulation, and trapezoidal decomposition. diff --git a/doc/ray_casting_example.png b/doc/ray_casting_example.png Binary files differnew file mode 100644 index 0000000..1491c1c --- /dev/null +++ b/doc/ray_casting_example.png |