|author||Casey Robinson <firstname.lastname@example.org>||2014-01-19 23:09:59 -0500|
|committer||Casey Robinson <email@example.com>||2014-01-19 23:30:14 -0500|
fill in some thoughts about the project in the intro
|-rw-r--r--||doc/ray_casting_example.png||bin||0 -> 40482 bytes|
2 files changed, 52 insertions, 1 deletions
diff --git a/doc/intro.md b/doc/intro.md
index 4be4bc5..fb7e56a 100644
@@ -1,3 +1,54 @@
# Introduction to neighborhoods
-TODO: write [great documentation](http://jacobian.org/writing/great-documentation/what-to-write/)
+The goal of neighborhoods is to determine the neighborhood that includes a given address.
+The address is given as a pair of GPS coordinates representing the latitude and longitude of the desired address.
+Below is a snapshot of the neighborhoods of Grand Rapids, Michigan to provide context.
+![Neighborhoods of Grand Rapids, MI](neighborhood_map.png)
+The core algorithm used by neighborhoods is the algorithm for determining if the point inside a neighborhood or not.
+Start with the point of interest.
+Draw a [semi-infinite](http://en.wikipedia.org/wiki/Semi-infinite) line (line extending to infinity in only one direction) from the point of interest to infinity along the x axis.
+Count the number of times this line intersects a line segment which is part of the border.
+If the total is even, the point is outside.
+If odd, the point is inside.
+Proof of correctness is left as an exercise to the reader.
+This technique is known as [__ray casting__](http://en.wikipedia.org/wiki/Ray_casting).
+The diagram below demonstrates the ray casting method.
+![ray casting example](ray_casting_example.png)
+Ray casting for point location requires a check of each line segment in the border.
+While we cannot improve the [asymptotic complexity](http://en.wikipedia.org/wiki/Asymptotic_computational_complexity) of the approach, we can reduce the runtime by avoiding unnecessary computation.
+Before doing any geometry we can perform a simple check, only consider line segments which have a y range covering the point of interest.
+This is only two comparisons, while the geometry requires significantly more work - one comparison, one addition, three subtractions, two multiplications, and one division.
+Once we know how to determine the relative position of points and polygons, we need to decide which neighborhoods to check.
+The most obvious case is to perform a linear search through all neighborhoods.
+The linear search will on average require checking `n/2` neighborhoods, where `n` is the total number of neighborhoods.
+Since it forms the basis of the other approaches, this is the method currently implemented in `neighborhoods`.
+In the discussion section, I mention a method for reducing this number with preprocessing.
+There are many options for improving the performance of the approach implemented in neighborhoods.
+Each method requires preprocessing and more storage than the linear search with ray casting approach, but can improve performance when computing results for a large number of queries.
+The next optimization I would try is to divide the search space into a grid.
+The goal being to reduce the number of neighborhoods that require searching while still requiring constant time lookup of such a list.
+The idea is to lay a fixed size grid on top of the map.
+Next, compute the bounding box for each neighborhood (linear time but can be done while reading in the data eliminating the effective impact of this step).
+Record the grid cells which overlap with the bounding box of each neighborhood.
+This can be done with 4 comparisons per neighborhood, or `4n` operations.
+The storage required by the resulting data structure will vary based on grid size as will the performance - smaller grids mean fewer neighborhoods overlap with each but more cells must be stored.
+Thus, it would be necessary to measure the impact of this method against various grid sizes to determine the best value.
+The grid can be computed once and used to reduce the number of neighborhoods necessary to check.
+Lookup will require two modulo computations, which requires `O(1)` time.
+Other methods worth exploring (after obtaining a greater understanding of Clojure) include slab decomposition, triangulation, and trapezoidal decomposition.
diff --git a/doc/ray_casting_example.png b/doc/ray_casting_example.pngBinary files differ
new file mode 100644