A Point about Polygons

[An article by Bob Stein in the March '97 issue of Linux Journal. Approx. 2500 words, 2 tables, 2 listings, 4 figures.]

[See also the text version with no tables, listings or figures. Listings 1 and 2 are available separately.]

Several algorithms exist in the public domain for web servers to determine whether a point is inside a polygon. They are used in the implementation of "image maps," both of the traditional server-side variety as well as those of the more modern client-side. So who needs one more? Well, the bone this author wishes respectfully to pick is that most of the point-in-polygon code he could find is woefully overcomplicated. Being a lover of simplicity and simplification, he just could not leave well enough alone.

The resulting C-language routine has just three if-statements and no divides. Contrast that with three divides and ten if-statements in the corresponding routine that's part of the popular Apache web server. What a warthog. You gotta see it to believe it. Actually, you can see part of it on the 7/96 cover of WEBsmith magazine, underneath the peanuts. Get the Apache distribution and search for pointinpoly to see the whole sausage-works. The routine from CERN/W3C's httpd is even worse, weighting in at 19 if-statements! Search for inside_poly in their HTImage.c. (URL's are shown in table 2.)

The following table contrasts five different routines in the public domain for finding out if a point is in a polygon. In all cases, the polygon is specified as an array of X,Y coordinates of the corner points.

Table 1: Comparison of Point-in-Polygon Algorithms

Galacticomm's
inpoly()
Apache's
pointinpoly()
CERN/W3C's
inside_poly()
GE CRD MTL's
inPolygon{}
Woods Hole's
inside()
Authors: * Bob Stein
Craig Yap
Eric Haines
Rob McCool
Ari Luotonen Kevin Kenny Rich Pawlowicz
Language: C C C Tcl Matlab
Data type:

unsigned int

double

int

(native)

(native)

Computation type:

long

double

float, double

(native)

(native)

Method:

crossings

crossings

crossings

crossings

angle sum

Lines of code:

44

72

91

29

89

Loops:

1

4

1

1

0?

If-statements:

3

10

19

2

5

Else-statements:

1

4

11

1

1

Multiplies:

2

3

1

1

20

Divides:

0

3

1

1

2

Arctangents:

0

0

0

0

1

* My apologies if I've gotten any of the authorship wrong. It wasn't always clear who did what.

This is a pretty casual analysis of the algorithms. I certainly didn't shrink from showing my inpoly() in a good light. For example, && and || operators in C are often if-statements in disguise. I used one of these operators (as did most of the other folks) but that doesn't show up in the table at all. Also, some folks' line counts are inflated slightly by comments and blank lines. But you get a rough idea.

Here are some URL's for the source code on the web:

Table 2: Sources of public domain point-in-polygon algorithms

Galacticomm's inpoly() http://www.visibone.com/inpoly/inpoly.c.txt
Apache's pointinpoly() http://www.apache.org/dist/
(once you un-gzip the monster source file, search for pointinpoly)
CERN/W3C's inside_poly() http://www.w3.org/pub/WWW/Daemon/Implementation/HTImage.c
(about 70% of the way down you'll find inside_poly())
GE CRD MTL's inPolygon{} ftp://ce-toolkit.crd.ge.com/pub/tcl/tcl-www.tar.gz
(once you un-gzip, search for inPolygon.tcl)
Woods Hole Oceanographic's
inside()
ftp://acoustics.whoi.edu/public/Matlab/oceans/inside.m

A Note of Reason

All judgments have a context, and I should explain mine. The prime prerogative in this article is algorithmic simplicity. This, I confess, has very little to do with the practical needs of the web. In case I've gotten ahead of myself, a web image map is how you carve up an image so clicking in one particular region does one thing, clicking somewhere else does something else. Web image maps are such a tiny fraction of the work of web servers and browsers that all of the above routines are just fine as they are. Changing from one to another is not going to make any noticeable difference in web performance. And once we're sure it works, who's going to look at the code again for 100 years? Thus I don't have any practical considerations of performance or readability to stiffen my cause. I'm simply championing the aesthetics of simplicity.

The point this article wishes to make is that problems are not always what they seem. Sometimes a simple solution exists, but you've got to take a hard look to find it. My buddy Craig had started out by porting inside_poly() from W3C, I think it was, for use in our web server. When I saw all the floating point math and if-statement-special-cases I thought there just has to be a better way. So Craig and I started from scratch, wrestled the problem to the ground, and came up with a solution containing no floating point math, which is silly for screen pixels, and no math more tedious than a multiply. Oh, and we got rid of all the pesky special cases too, except for one: polygons with fewer than 3 sides are excluded. What could be inside a 2-sided polygon? (Apache's pointinpoly() doesn't even check, probably making a big mess with a 1-point polygon, but hey, it's not like us to gloat or anything.)

Now the stated goal is simplicity, not performance, but I did stray from that course on one issue: avoiding divides. Again, performance hardly matters a hoot for image map applications, but maybe one day someone might use this algorithm for some kind of 3D hidden surface algorithm, or something. Getting rid of the divide may have, in effect, required me to use an additional if-statement. (Remember how a greater-than expression flips if you multiply both sides by a negative number? As in a < b means the same as -a > -b? I think this might be relevant somehow but I have no idea for sure.) Anyhow, what all this is leading up to is that Kevin Kenny's algorithm at 29 lines and 2 if-statements is by far the shortest and simplest. But mine is still better in some tortured sense because it doesn't need a divide and his does. Sorry, Kevin, write your own article.

Now let's discuss the more popular algorithm for determining whether a point is inside a polygon.

The "Crossing Count" Algorithm

Imagine you could detect whether a point was in a polygon or not by placing a friendly trained snail at the point and telling him to head for the north pole. (We're only concerned with image maps, so we exclude polygons that extend to the north pole, and we ignore Coriolis forces.) You'd equip our intrepid friend with a snail-sized clipboard and instruct him to tick off each time he crossed an edge of the polygon. He'd call you from the north pole and report the number of crossings. An even number (including zero) means he started outside the polygon, odd means inside.

This reduces the problem to detecting whether or not line segments intersect. It's even a little better than that, because one of the line segments is simply the positive Y axis. To make that leap, just declare the snail's staring point to be the origin, (0,0), and translate all of the polygon corners so they're relative to that point.

We'll go into the algorithm a little later, but take a look at the finished code. The very picture of simplicity, right? If you haven't checked out the other versions, you really oughta (except Kevin's).

Listing of INPOLY.C

int                                /*   1=inside, 0=outside                */
inpoly(                            /* is target point inside a 2D polygon? */
unsigned int poly[][2],            /*   polygon points, [0]=x, [1]=y       */
int npoints,                       /*   number of points in polygon        */
unsigned int xt,                   /*   x (horizontal) of target point     */
unsigned int yt)                   /*   y (vertical) of target point       */
{
     unsigned int xnew,ynew;
     unsigned int xold,yold;
     unsigned int x1,y1;
     unsigned int x2,y2;
     int i;
     int inside=0;

     if (npoints < 3) {
          return(0);
     }
     xold=poly[npoints-1][0];
     yold=poly[npoints-1][1];
     for (i=0 ; i < npoints ; i++) {
          xnew=poly[i][0];
          ynew=poly[i][1];
          if (xnew > xold) {
               x1=xold;
               x2=xnew;
               y1=yold;
               y2=ynew;
          }
          else {
               x1=xnew;
               x2=xold;
               y1=ynew;
               y2=yold;
          }
          if ((xnew < xt) == (xt <= xold)          /* edge "open" at one end */
           && ((long)yt-(long)y1)*(long)(x2-x1)
            < ((long)y2-(long)y1)*(long)(xt-x1)) {
               inside=!inside;
          }
          xold=xnew;
          yold=ynew;
     }
     return(inside);
}

Testing the Algorithm

The test program draws a random 40-sided polygon and then picks random points to throw at the inpoly() routine. Points the routine says are inside the polygon it draws red, points outside are blue.

See http://www.visibone.com/inpoly/testpoly.c.txt for the source code to this test program.

All tripped up

Our first rendition of inpoly() had a subtle flaw that the test program made evident. (The full story contains an embarrassing lesson. "It'll work," we sneered, "We don't need to waste time on a full graphical test. Besides, it'd be too much fun." After we found out that our image maps had leaks, we wrote the test program.) Here's a close-up of the flaw:

Along a vertical line, all the colors are wrong. The flaw turned out to be that when our mindless mollusk crosses the bottom corner, the little bugger was counting the crossing of both edges! After that, he was always exactly wrong -- he thought he was in when he was out and he thought he was out when he was in. The solution has to have the property that whenever our esteemed escargot crosses into the polygon at a corner, we make sure he counts exactly one crossing. Two's no good, and in fact, zero is just as bad, one is what we need. (And in case you're really paying attention, the reason why the flaw in the close-up extends up from the corner is that the positive Y axis extends downward in screen coordinates.)

I suspect this is a problem unique to the fixed-point world. I'm sure my fellow point-in-polygon-smiths have either lucked out or dealt with it somehow, at least I'd like to think so. (A lie-detector would peg on that one. This article would be insufferably smug if I had found leaky corners in any of the other algorithms.) In my case, I realized that I could not blindly count all crossings of the end point of each of the edges as a crossing. My first thought was to associate each end point with one and only one edge. This sounds fair and equitable, but like many things fitting that description it just plain won't work. A problem turns up when Agent Snail just lightly nicks the corner of a polygon he's not inside at all. That's counted as one crossing, hence the snail report is bunk.

Since your humble author abhors special cases, he sought something that would work in all cases. A digression about special cases. In one of my first programming assignments, we were instructed to convert a dollar and cent amount into a quantity of U.S. coins: 27 cents meant 1 quarter and 2 pennies, etc. A classmate had started diligently coding a special case for every possible amount between 0 and $1.00. If 37 then 1Q, 1D, 2P. If 38 then 1Q, 1D, 3P, ad nauseam. If you're programming and think you have a special case, look again. Most so-called special cases don't really merit special treatment at all. My wise friend Chris Robert has a corollary for life: many apparently extenuating circumstances are not. We humans are skilled at rationalizing the same dumb mistakes, over and over again, from cradle to grave. Moral: pay attention to what you're doing and simplify your life. This's only slightly diluted by the fact that inventing a new point-in-polygon algorithm has no practical benefit whatsoever.

Ok, I've gotten a bit off the subject. Turns out there is a way to handle the corners without special cases.

Cutting Corners

The scheme for getting our faithful friend to count corner crossings correctly is always to count crossing the right end of each edge, but never the left end (right meaning positive X). In the top drawing on the left, the black circles represent points our snail will count if he crosses, the white circles he won't. When you put the polygon together, everything ends up the way we want. Nicking the corner means he counts either 0 crossings or 2 crossings. We don't care which, both are even and our snail knows he's outside. The circles with 1's in them represent points counted once if the snail crosses them. This is fine, just like crossing the nearby sides.

It's time to analyze the guts of the inpoly() routine:

      if ((xnew < xt) == (xt <= xold)              /* straddle test */
       && ((long)yt-(long)y1)*(long)(x2-x1)        /* north         */
        < ((long)y2-(long)y1)*(long)(xt-x1)) {     /*       test    */
           inside=!inside;
      }

This represents a slight modification of the snail's instructions. He plays a bit flipping "she loves me, she loves me not" kind of game rather than counting up the crossings and then reporting whether the total is even or odd. He starts out assuming he's outside, and complements that assumption with each crossing. So much for the inside=!inside statement.

This if-test happens inside a for-loop that considers all of the edges of the polygon, one at a time. Each edge is a line segment that stretches between the corners (xold,yold) and (xnew,ynew). We've arranged it so that (x1,y1) and (x2,y2) also represent the same edge, but the points are swapped if necessary to make it so that x1 <= x2.

Now two things have to be true for our ever-meticulous snail to count that he crossed this edge. First, the segment must straddle the Y axis (where the right end is counted but the left one is not). Second, that straddling has to happen to the north of the snail's starting point. These are exactly the questions determined by the if-statement's two pieces, on either side of the &&.

Now that first expression is a sneaky one, and I confess I might have preferred the less opaque code (x1 < xt && xt <= x2). You can see that it does the same thing if you look carefully (very carefully, I was fooled for a while there). But I hate to fix something unless I've already broken it, if you know what I mean.

That north computation is the one I'm all proud about because none of my esteemed fellow polygonsmiths made one that doesn't need a divide. It does depend on the knowledge that (x2-x1) is positive. Other than that, it's just a transmogrification of that famous y=mx+b equation from high school algebra.

By the way, we've left out the case where an edge line segment stands straight up and down above the snail touchdown point. Such an edge would never be counted by Mr. Snail at all! That's because the == test would always be false (because xnew, xt, and xold are all the same value). What's really wild is that's just what we want. In a sense, he's crossing three edges when we only want to count one. It turns out that the adjacent line segments are all we're interested in knowing if our snail crossed, and the rules already discussed work perfectly for them.

Life on the edge

By the way, who cares in an image map whether the points along the edge of a polygon are technically inside or outside? As you can see in the close-up, some of the originally white pixels (representing the polygon edge) turned to red, others to blue. If a browsing user clicks on the edge of a region, he may get in, he may not. But being one pixel off is usually not an issue if your screen resolution is greater than 100 x 100. In the inpoly() routine, some edges are in, some are out. (I don't mind admitting to a crime after convincing everyone it deserves no punishment.)

Angling for Adders

I haven't discussed the angle-sum method used by Woods Hole Oceanographic Institution for their algorithm, written in "Matlab," best I can tell. The algorithm needs to compute arctangents, so it's mostly just a laboratory curiosity (beggin' your pardon, Rich). The idea is that you add up the angles subtended by lines drawn from the target point to each of the corners of the polygon. If the sum is an even multiple of 360 degrees, your out, odd you're in. Vaguely familiar? Here's the analogy: You're in a pitch-black room with a very very long snake all over the floor. This is a particularly rare variety of deep-sea snake (Woods Hole knows all about them) with glow-in-the-dark dots every foot or so. Oh and he reacts to light by instantly constricting in an iron grip of death. Your question is whether you're standing inside the maze of coils at your feet or outside. You'd like to know before you turn on the light because he gets very annoyed if you step on him.

Face the head of the snake and visually trace his entire body, noting as you do how your feet turn (somehow, it's a stretch I know). When you're done, face the head again. Now, if you didn't have to turn around at all, you're safely outside the snake. If you turned around twice in either direction things are fine too. Four times, and you're still OK. If you turned around an odd number of times in either direction, you're meat.

No wonder folks tend to use the crossing-count algorithm.


Bob Stein was a writer/developer at Galacticomm for nine years. He developed the web server that's part of Galacticomm's Worldgroup software which, of course, uses his inpoly.c for image map polygons. He is now an independent consultant specializing in hyper-useful user interfaces.

Bob Stein, stein@visibone.com.