Building concave hulls (alpha shapes) with PyQt, shapely, and arcpy

In the previous post, I have shared some information about how to set up PyQt development environment for ArcGIS Pro Python developers. If you are interested in building interactive desktop applications using Python and PyQt and running them within Pro, by all means, do check the post as it will guide you through the process.

In this post, I’d like to share a sample application that I’ve put together to illustrate the process of building concave hulls (also known as alpha shapes). I’ve built a fairly simple PyQt5 application that can be run from Pro or as a standalone application (without the functionality of adding the data to the map of course). With this application, I wanted to show how easy it is to integrate a PyQt window into ArcGIS Pro, but also to show how to build the concave hulls and what kind of results you get when adjusting the alpha value.

This application can be helpful for anyone who would like to create polygons around a set of points but creating convex hulls does not produce polygons that are good enough in terms of the boundary bounds. This work is inspired by two blog posts:

Let’s do some work with the points that form an outline that looks like an “H”.

This is what you would get generating the convex hull:

convex_hull

This is what you would get generating the concave hull with a specific alpha value:

concave_hull_better

And one more concave hull with another alpha value:

concave_hull_good

The basic workflow is that you interactively remove the triangulation edges for the points that are too far from each other. Adjusting the alpha value to tune the boundaries is a tricky process because it may improve the boundaries in one place (areas 1 and 2) but degrade them in another place (areas 3, 4, and 5). Look at the figure below; we are trying to increase the alpha value to improve the figure above:

concave_hull_best

So, as you can see, it is all about finding a sweet spot where the produced results are good enough and trying to improve the boundaries will only make them worse.

The animated figure to demonstrate the process of increasing the alpha value. Removing too many edges will result in having separate disconnected polygons:

alphas

Building concave hulls using multiple alpha values can be very helpful when you have clusters of points you would like to generate separate polygons for. Because the points in these two clusters are located fairly far from each other, they don’t have any connecting triangulation edges. As the end result, you get two polygons – one polygon for each cluster of points.

two_clusters

The source code behind this application in available here:

 

Advertisements

Removing redundant polyline vertices using arcpy

If you maintain a polyline feature class of some entities such as city roads, you may have seen that some of your features have redundant vertices. These vertices are not necessary to preserve the feature’s shape. Not only they increase the size of your dataset, but they also may slow down processing time as it takes longer time to process geometries with a larger number of vertices.

You can see the redundant vertices (small green points) highlighted on a source polyline (thin green line). The cleaned shape (thick blue line) has only the essential vertices (large blue points).

collinear_image

To automate the process of removing the redundant vertices, I’ve written a Python script. The tricky part was to find out whether the points one iterates over are collinear.

Take a backup of your data before running this script as it modifies the features in place!

  1. Script works for single part as well as multi-part lines.
  2. You can choose what to do with the curves: they are either preserved without removing any redundant vertices this particular feature might have OR densified with the specified distance value. Look up the .densify() method on arcpy.Geometry object. You could alternatively use the Densify GP tool to fix your curves before running this script if you would like to experiment with the deviation values.
  3. Only polyline feature classes are supported (no points/polygons). Could be extended to support polygons though.

Static code analysis and linting for Python developers

If you have spent some time writing Python code, you may have met some of the most common error types that can occur when building a program: we may try to use an undefined variable, call a function that was not imported from another module, or access a non-existing property of a class instance. Today, I would like to explore how these types of errors could be found when programming with a very dynamic language such as Python.

Regardless of what Python IDE you use – Wing IDE, PyCharm, Visual Studio or Visual Studio Code – you may use PyLint which seems to be the most commonly used Python linter. This tool will report errors it founds in your code. Some IDEs such as PyCharm and VSCode (with Python extension) have built-in support for PyLint which means you will see the errors highlighted in your code as you type.

There are quite a few resources online (such as this, this, and this) that showcase the linting and type hinting for Python developers. In this post, I’d like to mention some tricks for arcpy developers because there are things we need to tweak. I use Wing IDE, but you need to find out whether your Python IDE supports static code analysis.

Working with arcpy objects

import arcpy
from arcpy.arcobjects import Result
fc = r’TemplateData\TemplateData.gdb\USA\states’
result = arcpy.GetCount_management(fc) # type: Result

By defining the type of the variable result, Wing IDE will be aware of the output variable type before the actual execution of the program. The type of variables can be indicated by a comment that follows it. When typing result.get, Wing gives you properties that Result object has, such as getOutput() and getMessages().

Note that if would not add the inline comment, the type of result variable would not be known to the IDE.

import arcpy
fc = r’TemplateData\TemplateData.gdb\USA\states’
result = arcpy.GetCount_management(fc) # not defining the output type

When you would type result. to inspect the variable, Wing tries to guess the type of the object, and you can see the properties of int, dict, list, and ​tuple types. In other words, Wing is sure that what it gets back as a result variable is not a set and not a ​str object, but it can be of some other data type.

This means that you might annotate your variables helping Wing analyzing your code. This will save you time because you wouldn’t need to jump to the ArcGIS Desktop Help to see what properties an object has.

Taking care of missing members in arcpy.da module

If you will run PyLint for this Python code,

import arcpy
fc = r’TemplateData\TemplateData.gdb\USA\states’
print [f[0] for f in arcpy.da.SearchCursor(fc, ‘OID@’)]

you will get this error message:

Line 3 Column 21 E1101: Module 'arcpy.da' has no 'SearchCursor' member

Not all members are exposed in arcpy​; this is applicable even to UpdateCursor​ and InsertCursor. Obviously, you would like to filter this error out. This can be done by using a PyLint configuration file. I prefer to have a pylintrc file in the folder with my Python modules with this content:

[TYPECHECK]
generated-members = arcpy.da.SearchCursor, arcpy.da.UpdateCursor

You can read more about other PyLint options here.

Type hints for Python 2.7

As most of the Python development still targets Python 2.7 (ArcGIS Desktop rather than ArcGIS Pro), you might look into how type hints can be used with Python 2.7 code. Below is a .gif animation to demonstrate how type hints can be used for a function definition or for a variable definition.

static_typing1

Dividing a polygon into a given number of equal areas with arcpy

I was recently searching the Internet trying to find any tool that would let me split an arbitrary polygon inside a geodatabase feature class into multiple polygons of equal area. ArcGIS does provide this functionality as a part of the parcel fabric functionality. Unfortunately, there is a lot of work involved in setting up the parcel fabric and there is a lot to learn before you will be able to divide your parcels. So I was looking for a simpler solution that would work directly with the geometry of a polygon.

However, I was not able to find any solution that would work and most helpful posts on the forums were pointing either at parcel fabrics or providing some ideas on the implementation of the workflow using multiple geoprocessing tools and some custom code. I found a nice ArcGIS custom script tool called Polygon Bisector which

computes a line that bisects, or divides in half, a polygon area along a line of constant latitude or longitude.

So, this would work great if you need to split a polygon creating a number that follows the exponent of two (2, 4, 8, 16, 32 and so forth). This is because after dividing a polygon into two polygons of equal area (now you have 2 polygons), you could divide each of them into two parts again (now you have 4 polygons), and so on. Since I want to be able to divide a polygon into an arbitrary number of areas, I had to write my own tool.

I have solved this problem this way. Say I want to have the polygon of area 1000 sq. m. divided into 5 equal areas:

  1. Get an extent of a polygon.
  2. Construct a polyline using the vertices of the polygon’s extent with a tiny shift of coordinates.
  3. Cut the polygon into two halves using this line.
  4. Find what is the area of the smallest polygon.
  5. If the area is smaller than the 200 sq. m. (that is, fifth part of the polygon), the shift the line again and re-run steps 2-4.
  6. If the area is 200 sq. m. or larger, than leave this part and keep working with the polygon that is left essentially running through the steps 2-5.
  7. When the original polygon has been successfully divided into equal areas, they are inserted into a new feature class along with the source polygon attributes.

The illustration of the cutting lines with the extent polygon is below.

extent_figure

This approach has several disadvantages, though.
First, if your polygon is very large and you want the parts of the polygon to have the same area with the minimal difference, the tool execution will take a lot of time as you will need to shift the cutting lines a few centimeters, cut the polygon into halves, and evaluate the result.
Second, you can only choose between the North-South or West-East direction of the cutting lines. You won’t be able to specify the angle yourself.

However, this tool works great for the target use I kept in mind when writing it. Using 0.5 meters as the step for moving the cutting line, the difference between the largest and the smallest sub-polygons was just around 1%. Running the same code for the same polygon with the shift value set to 0.05 meters (5 cm), I have observed the difference value to be around 0.1%.

The illustration of the polygon subdivision is below (West-East to the left, North-South to the right).

divide_polys_figure

The code is available at the GitHub Gist as usual. Run this code inside the Python window in ArcMap while having a single polygon selected.

 

 

Generate HTML report about Esri geodatabase using Python

Many GIS administrators and analysts maintain an inventory of all the data and metadata associated with a corporate Esri geodatabase. Most of the users have written short Python scripts that reads properties of geodatabase items and prints them to a text file or an Excel sheet.

A non-supported product called ArcGIS Diagrammer was able to generate HTML reports about your geodatabase reporting information about its domains, feature classes, tables and so much more. However, since version ArcGIS 10.3, the support for the tools was dropped, so if you want to use it now you need either to have ArcGIS Desktop 10.2 installed or have a bunch of .dll files from the ArcObjects .NET SDK 10.2 stored on your machine. This makes using ArcGIS Diagrammer cumbersome.

Anyways, ArcGIS Diagrammer wasn’t good enough for my needs because I have always wanted to work with an interactive representation (where I could sort columns, filter irrelevant information, and create printable views). Being frustrated with the need to hack own scripts for this, I’ve decided to write an own Python package that would report information about Esri geodatabase in an interactive HTML page.

Spending some evenings now and then, and that’s it – the package is in beta now, ready to install and use! Make sure to get it from its GitHub repository: registrant. This is how a sample report could look like; below is an animated example just to give you a feeling of what it would be like to work with the report.

report_sample

I would love to hear from users whether it’s useful and what other information and features such a report should have. You can submit an issue on the registrant‘s GitHub page or just leave a comment to this post.

 

Convert an Esri geodatabase feature class to GeoJSON

I bet many of you using Esri geodatabase datasets needed at some point to convert your data to some other formats for interoperability. GeoJSON is one of those formats that can be understood by many other systems and APIs.

If you have ArcGIS Desktop 10.5+, you can use Features To JSON geoprocessing tool as in version 10.5, Esri has added optional geoJSON parameter. Please mind that this tool didn’t have this parameter in previous versions.

If you are still on ArcGIS Desktop 10.4 or an earlier version, you could use the private interface of arcpy.Geometry objects that can expose some of the information about the geometries in the form of geoJSON and then do the rest on your own using json module:

g = arcpy.PointGeometry(arcpy.Point(45, 45), arcpy.SpatialReference(4326))
g.__geo_interface__
{'coordinates': (45.0, 45.0), 'type': 'Point'}

If you have ArcGIS Pro installed, then you can use the Features To JSON geoprocessing tool which has always had this optional geoJSON parameter which will do the trick.

If you need to do a conversion on a machine without ArcGIS Desktop, you could use ogr2ogr command (which will convert the Counties shapefile into a .json file in geoJSON format):

.\ogr2ogr.exe -f “GeoJSON” “C:\GIS\output.json” “C:\GIS\Temp\Counties.shp” -select NAME,POP2000

You could also use some other conversion tools such as ogr Python module:

What I really like about geoJSON is that it’s possible to draw its features online very easily such as using http://geojson.io (you can even edit, style and share your geoJSON using this service!) or on GitHub (they have pretty good support for geoJSON). On GutHub, you can even show the geoJSON features on a basemap directly in the source code or in a gist. Look at this example!

 

 

 

Network Analyst and ArcPy: finding k-alternate path

Using ArcGIS Network Analyst extension, it is possible to find out the best route between a pair of points (best is defined in terms of the impedance — it can the shortest or the fastest route, for instance). However, you might like having multiple alternatives to the best route which one usually refers to as the K shortest paths.

The K shortest path routing algorithm is

an extension algorithm of the shortest path routing algorithm in a given network.

When solving a route between a pair of points, Network Analyst finds only a single route that is considered to be the best. That is, one cannot get a number of routes while solving a Route layer. However, this is something many of you have seen when routing using Google Maps or HERE, for instance.

I got curious and have started searching for this algorithm implementation using ArcGIS Network Analyst. I’ve found a very useful Esri forums post by a former Esri Network Analyst product engineer Michael Rice who wrote that

At the ArcGIS 10 release, you can approximate a k-shortest paths solver using the enhanced barrier features. Additionally, for line and polygon barriers, instead of just restricting the parts of the network that are covered by their geometry, you can opt to simply scale the costs of the parts of the network covered by the geometry.

Therefore, to approximate k-shortest paths, you can do the following:

1. Solve the route to get the best path
2. Take the current best path and load it as a polyline barrier with a scale factor of x
3. Repeat

Note that by scaling the polyline barrier instead of restricting it, this will allow
the next path to reuse some of the same edges along the previous path if necessary, but at a higher cost than in previous paths. Also note that when multiple line barriers apply to the same edge, the final scale factor for the cost of that edge becomes the product of their individual scale factors. For example, if there are two line barriers applied to some edge with scale factors x and y, respectively, the scaled cost of traversing that edge will be x*y*originalEdgeCost.

This means that the more an edge is reused across different paths, the less likely it will be to be used in any subsequent paths as the process discussed above continues to iterate.

This approach will technically not guarantee a true k-shortest path solve, so we must
be careful not to incorrectly label it as such. As I said previously, it is merely an
approximation for this concept. A more appropriate name might be to call this a
k-alternate path approach (as opposed to shortest). The alternate paths you get will
ultimately depend on the scale factors you use.

I’ve written a Python add-in that given a number of alternative routes to find, will generate a new feature layer with a number of routes solved. I have used the workflow Michael described loading a solved route as a polyline barrier for all subsequent solve operations. You can use this code when you want to generate multiple alternate routes. This is how it looks in ArcMap.

2017-04-27 15_12_00-Routing.mxd - ArcMap

The Python code to be used to make the add-in: