According to the U.S. Constitution, congressional districts are supposed to provide proportional representation. Practically speaking, that means that districts should be contiguous, contain roughly the same number of inhabitants, and (based on the Voting Rights Act of 1965) roughly the same racial makeup. State legislative districts are also drawn to similar standards.

State legislatures redraw district boundaries every 10 years (after the census) to reflect population changes. But because legislatures are partisan, redistricting often leads to gerrymandering - the drawing of convoluted boundaries to consolidate control for one party.

Gerrymandering is unduly complex and can be unfair, which is why most people dislike it.

I wondered if there was a way to detect and possibly quantify gerrymandered district boundaries by considering boundary regularity. So I built a queryable PostGIS database of districts and boundary information from Census Bureau data.

District Irregularity Index

I wanted to find a mathematical way to compare how gerrymandered two districts might be. Note that irregular district boundaries do not necessarily imply gerrymandering. But on the other hand, gerrymandered boundaries usually appear more irregular. So let's try to define a metric that can be calculated for any geographic boundary - we'll call it the District Irregularity Index.

One way to detect irregular districts might be to compare the perimeter of a district to its area. The idea here is that a "simple" district boundary, such as a square, would have a much lower perimeter-to-area ratio than a complex shape like a 5-point star.

The figure above shows a 5-point star that fits inside a square. The star's perimeter is the same as that of the square, but its area is less than one third.

However, simply dividing the perimeter by the area has a problem: two districts that are identical except in size get different irregularity indexes, because perimeter grows linearly with size while area grows quadratically. A square with sides of length 10 would score ${40 \over 100} = 0.4$, while a square with sides of length 100 would score ${400 \over 10000} = 0.04$. We can solve this by dividing the perimeter by the square root of the area instead: ${40 \over \sqrt{100}} = 4$ and ${400 \over \sqrt{10000}} = 4$. Finally, if we divide by 4, we get a nice round 1 for the irregularity index of a square.
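The scale-invariance argument can be checked with a few lines of Python. This is only a sketch on flat (x, y) coordinates, not the PostGIS query used for the real districts; the helper names (`polygon_perimeter`, `polygon_area`, `irregularity_index`, `square`) are made up for illustration.

```python
import math

def polygon_perimeter(points):
    """Sum of edge lengths of a closed polygon given as (x, y) vertices."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def polygon_area(points):
    """Unsigned area of a simple polygon via the shoelace formula."""
    n = len(points)
    twice_area = sum(
        points[i][0] * points[(i + 1) % n][1]
        - points[(i + 1) % n][0] * points[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def irregularity_index(points):
    """Perimeter divided by 4 * sqrt(area): 1 for any square."""
    return 0.25 * polygon_perimeter(points) / math.sqrt(polygon_area(points))

def square(side):
    """Hypothetical helper: axis-aligned square with one corner at the origin."""
    return [(0, 0), (side, 0), (side, side), (0, side)]

small = irregularity_index(square(10))    # perimeter 40, area 100
large = irregularity_index(square(100))   # perimeter 400, area 10000
strip = irregularity_index([(0, 0), (100, 0), (100, 1), (0, 1)])  # 100 x 1 strip

print(small, large, strip)
```

Both squares score 1 regardless of size, while the long thin strip, which has the same area as the small square, scores about 5.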

Next, I computed the District Irregularity Index for a bunch of state legislative districts. I queried for 0.25*ST_Perimeter(boundary)/SQRT(ST_Area(boundary)). (District boundaries were stored in the PostGIS database as multipolygons in a geography column called boundary.) Unsurprisingly, the simplest (most "square") districts - for example, Michigan House of Representatives District 37 and Ohio House of Representatives District 1 - scored around 1.

Michigan House of Representatives District 37

Ohio House of Representatives District 1

But on the other end of the spectrum, districts like Alaska State Senate District R and Maryland House of Delegates District 37B scored very high - not because they are gerrymandered, but because they are coastal districts that include many islands. When each coastal island gets its own polygon, the irregularity index can be high for non-gerrymandered districts.

Alaska State Senate District R

Maryland House of Delegates District 37B

To address this, I added one more term to the formula: divide by the total number of polygons in the boundary. For most land-locked districts, this will be 1, but for districts with complex coastlines, it will help reduce the irregularity index value.

So my final formula is

DistrictIrregularityIndex(boundary) = 0.25 * ST_Perimeter(boundary) / SQRT(ST_Area(boundary)) / ST_NumGeometries(boundary::geometry)
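To see what the extra division does, here is a rough Python sketch of the same formula on flat coordinates. It assumes each polygon is a simple ring with no holes, and it ignores the fact that PostGIS measures a geography column on the spheroid; the helper names and the toy "mainland plus islands" shapes are hypothetical.

```python
import math

def ring_perimeter(ring):
    """Sum of edge lengths of a closed ring of (x, y) vertices."""
    n = len(ring)
    return sum(math.dist(ring[i], ring[(i + 1) % n]) for i in range(n))

def ring_area(ring):
    """Unsigned area of a simple ring via the shoelace formula."""
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0

def district_irregularity_index(polygons):
    """0.25 * total perimeter / sqrt(total area) / number of polygons."""
    perimeter = sum(ring_perimeter(ring) for ring in polygons)
    area = sum(ring_area(ring) for ring in polygons)
    return 0.25 * perimeter / math.sqrt(area) / len(polygons)

def square_at(x, y, side):
    """Hypothetical helper: axis-aligned square with lower-left corner (x, y)."""
    return [(x, y), (x + side, y), (x + side, y + side), (x, y + side)]

# A 10 x 10 "mainland" on its own, then with four 1 x 1 "islands" offshore.
mainland = [square_at(0, 0, 10)]
with_islands = mainland + [square_at(20 + 2 * i, 0, 1) for i in range(4)]

mainland_index = district_irregularity_index(mainland)
islands_index = district_irregularity_index(with_islands)
islands_index_undivided = islands_index * len(with_islands)

print(mainland_index, islands_index_undivided, islands_index)
```

In this toy case the islands push the undivided score above 1, and dividing by the five polygons pulls it well below 1, so the correction can overshoot for districts with many small polygons.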

Running this query turned up some very interesting districts, including:

Maine House of Representatives District 122 (quite a gem)

Maryland Congressional District 3

Texas Congressional District 33

Louisiana State Senate District 29

Tennessee State Senate District 21

Maryland House of Delegates District 37A

Again, just because these district shapes are irregular doesn't mean they are gerrymandered... but there is a higher likelihood they might be.

Alabama Redistricting

In 2012, Alabama redrew its state legislative district boundaries (following the 2010 census). The Alabama Legislative Black Caucus filed suit against the state, alleging racial gerrymandering in the new district plan. After a lengthy court battle, a federal court upheld their claim and ordered the legislature to redraw 3 Senate districts and 9 House of Representatives districts. The districts were redrawn and took effect in 2017.

I applied the District Irregularity Index to Alabama's state legislative districts from before and after the 2017 redistricting to see whether the pre-2017 districts were, in fact, more irregular than their replacements. Here are the findings:

After the 2017 redrawing, the average district irregularity decreased by almost 10% for both the House and the Senate. Additionally, the most irregular Alabama districts became less irregular by over 10%. So the metric seems reasonable.

District Irregularity Analysis

Next, I computed the irregularity index for all congressional and state legislative districts. Full results are available here, but I will present some findings below.

The most irregular Senate districts are in Virginia and Louisiana; the least irregular are in Puerto Rico and Wisconsin.

The most irregular House districts are in Alabama and Kentucky; the least irregular are in Alaska and Wisconsin.

For Congressional districts, the graph above is interesting because it has a tail made up of single-district coastal states and territories. Their average irregularity is low because of the division by number of polygons. Among the other states, Wyoming and South Dakota (both nearly-square states with only one congressional district) have low average irregularity, and Ohio and Arkansas have high average irregularity.

Ohio's congressional districts indeed appear very irregular, and are widely recognized as gerrymandered. In November 2016, 51.69% of Ohio residents voted for the Republican presidential candidate, compared to 43.56% for the Democratic candidate... but Republicans carry 12 of the 16 Congressional districts (75%).
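The gap between votes and seats can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above and treats the presidential vote as a rough stand-in for partisan preference, which is an assumption; the variable names are illustrative.

```python
# Figures quoted above for Ohio, November 2016.
republican_vote_pct = 51.69
democratic_vote_pct = 43.56
republican_seats = 12
total_seats = 16

# Share of congressional seats held by Republicans.
seat_share = 100.0 * republican_seats / total_seats

# Republican share of the two-party presidential vote (third parties excluded).
two_party_vote_share = (
    100.0 * republican_vote_pct / (republican_vote_pct + democratic_vote_pct)
)

# Percentage-point gap between seats won and two-party votes received.
gap = seat_share - two_party_vote_share

print(f"seat share: {seat_share:.2f}%")
print(f"two-party vote share: {two_party_vote_share:.2f}%")
print(f"gap: {gap:.2f} points")
```

Even measured against the two-party vote only, the seat share is roughly 20 percentage points higher than the vote share.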

Arkansas, on the other hand, doesn't appear terribly gerrymandered. Its high irregularity index is due to the complex boundary along the Mississippi River on the eastern side of the state.

Interestingly, my home state of Wisconsin, where Democrats have accused Republicans of gerrymandering, has one of the lowest average irregularities for state legislative districts, and is around the middle of the pack for congressional districts. That doesn't necessarily mean Wisconsin's legislative districts aren't gerrymandered; factors other than boundary shape could be at play.

Conclusion

The District Irregularity Index alone is not sufficient to accurately identify gerrymandered districts, but it can help focus the search. State legislatures might use something like the irregularity index, along with other metrics, to compare new redistricting plans to past boundaries, and ensure they are not making new districts more gerrymandered than before.

Drawing fair districts is a complex problem which is easy to criticize but hard to get right. Earlier this year, the Supreme Court declined to hear several cases involving partisan gerrymandering - due partly to the difficulty in quantifying the impact of various district boundaries. With those decisions, and no clear way to quantify the extent of partisan gerrymandering, we will likely continue to see and hear more on this controversial topic in years to come.

Future work

There are a number of possible improvements to the District Irregularity Index. One might be to identify boundary segments near bodies of water and reduce how much those segments factor into the perimeter calculation. For example, a district boundary must necessarily be somewhat irregular if it borders a river.
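One way this idea might look in code is a perimeter where water-adjacent edges count for less. This is purely a sketch: the per-edge `on_water` flags, the `water_weight` of 0.5, and the example shape are all hypothetical, and actually deciding which segments follow a river or coastline is the hard part left out here.

```python
import math

def weighted_perimeter(points, on_water, water_weight=0.5):
    """Perimeter in which edges flagged as bordering water are down-weighted.

    on_water[i] describes the edge from points[i] to the next vertex
    (wrapping around to points[0] for the last edge).
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        length = math.dist(points[i], points[(i + 1) % n])
        total += length * (water_weight if on_water[i] else 1.0)
    return total

# Roughly a 10 x 10 district whose eastern side zigzags along a river.
district = [(0, 0), (10, 0), (12, 2.5), (8, 5), (12, 7.5), (10, 10), (0, 10)]
river_edges = [False, True, True, True, True, False, False]

plain = weighted_perimeter(district, [False] * len(district))
adjusted = weighted_perimeter(district, river_edges)

print(plain, adjusted)
```

The adjusted perimeter could then replace the plain perimeter in the index, so that a winding river boundary inflates the score less than an equally winding boundary drawn across dry land.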

With a relatively good irregularity metric, we could combine other data (like population and demographic data) to create a neutral heuristic for ranking the degree to which potential district boundaries are gerrymandered. Such a heuristic could then be used with an optimization method (such as a genetic algorithm) to automate the creation of redistricting plans, and to evaluate them.

Further reading