SQL Server spatial functions for GIS users: part 2

I was curious to see what posts of my blog have been visited by most people over the time. The most popular post is SQL Server spatial functions for GIS users which I have published 3 years ago. It seems as there are many GIS users who need to use SQL Server spatial functions as similar requests such as “sql server spatial functions” are appearing fairly often in the search engines. Today, I will share more examples of using SQL Server spatial functions if you are a GIS user.

Below you find just a few sample SQL queries that have been executed against a SQL Server database containing datasets accessible via PostGIS workshop materials by Boundless which can be downloaded from this link.

This post is an extraction of the Jupyter notebook published on GitHub. Open the notebook to see the result tables or jump to the end of the post where I have it published.

Find points that have the same coordinates

To find coincident points (often referred to as duplicates), you could use various SQL Server spatial functions.

Table1.Shape.STDistance(Table2.Shape) < 1000 --distance value
Table1.Shape.STEquals(Table2.Shape) = 1 --whether shapes are identical
Table1.SHAPE.STX and Table1.SHAPE.STY --compare points XY coordinates

All of these methods would give the same result while having a different execution performance.

Find using STDistance

--Find duplicate points within a certain borough
SELECT CAST(T1.ID AS INT) AS FirstPoint,
CAST(T2.ID AS INT) SecondPoint,
T1.Shape.STDistance(T2.Shape) Distance
FROM
dbo.Homicides T1
JOIN
dbo.Homicides T2
ON
T1.ID < T2.ID
and
T1.Shape.STDistance(T2.Shape) = 0
and
T1.BORONAME = 'Queens'
ORDER BY
Distance, FirstPoint

Find using STDistance

SELECT CAST(T1.ID AS INT) AS FirstPointId,
CAST(T2.ID AS INT) SecondPointId,
T1.Shape.STDistance(T2.Shape) Distance
FROM
dbo.Homicides T1
JOIN
dbo.Homicides T2
ON
T1.ID < T2.ID
and
T1.Shape.STEquals(T2.Shape) = 1
and
T1.BORONAME = 'Queens'
ORDER BY
FirstPointId, SecondPointId

Find using STEquals

SELECT CAST(T1.ID AS INT) AS FirstPointId,
CAST(T2.ID AS INT) SecondPointId,
T1.Shape.STDistance(T2.Shape) Distance
FROM
dbo.Homicides T1
JOIN
dbo.Homicides T2
ON
T1.ID < T2.ID
and
T1.Shape.STEquals(T2.Shape) = 1
and
T1.BORONAME = 'Queens'
ORDER BY
FirstPointId, SecondPointId

Find using STX and STY

SELECT T1.SHAPE.STX, T1.SHAPE.STY, COUNT(*) AS COUNT
FROM
dbo.Homicides T1
WHERE
T1.BORONAME = 'Queens'
GROUP BY
T1.SHAPE.STX, T1.SHAPE.STY
HAVING
COUNT(*) > 1

Calculating distances between points stored in tables

The STDistance function could be used to find the plain distance between the points stored within the same table.

SELECT CAST(T1.ID AS INT) AS FirstPointId
,CAST(T2.ID AS INT) SecondPointId
,T1.Shape.STDistance(T2.Shape) Distance
FROM
dbo.Homicides T1
JOIN dbo.Homicides T2
ON T1.ID < T2.ID
and
T1.Shape.STDistance(T2.Shape) BETWEEN 10 AND 20
and
T1.BORONAME = 'Queens'
ORDER BY
Distance

Calculating distances between points stored in the same table

SELECT CAST(T1.ID AS INT) AS FirstPointId
,CAST(T2.ID AS INT) SecondPointId
,T1.Shape.STDistance(T2.Shape) Distance
FROM
dbo.Homicides T1
JOIN dbo.Homicides T2
ON T1.ID < T2.ID
and
T1.Shape.STDistance(T2.Shape) BETWEEN 10 AND 20
and
T1.BORONAME = 'Queens'
ORDER BY
Distance

Calculating distances between points stored in two tables

Find homicide points that are located within the specified number of meters to the subway stations points. ArcGIS tool: Point Distance (Analysis)

SELECT * FROM
(SELECT CAST(Homi.ID AS int) AS HomicideId
,Homi.WEAPON AS Weapon
,CAST(Subway.ID AS int) AS SubwayId
,Subway.NAME AS Name
,Homi.Shape.STDistance(Subway.Shape) AS Distance
FROM dbo.HOMICIDES Homi
cross join dbo.Subway_stations Subway
where Homi.BORONAME = 'Manhattan' AND Subway.BOROUGH = 'Manhattan')
AS
data
WHERE
Distance < 20
ORDER BY
Distance

Counting points in polygons

To count points in polygons, you could use the STContains function. Table1.Shape.STContains(Table2.Shape) would return 0 or 1.

Find neighborhoods with the largest number of crimes committed (count number of homicides in each neighborhood)

SELECT TOP 10
Polys.Name AS NeighborhoodName, Count(*) AS CrimeCount
FROM
dbo.Homicides AS Points
JOIN
dbo.Neighborhoods AS Polys
ON
Polys.Shape.STContains(Points.Shape) = 1
GROUP BY
Polys.Name
ORDER BY
CrimeCount DESC

We can also calculate a column in the Neighborhoods table to contain the number of points within each neighborhood. For that, we will first need to add a new column to the table, then populate it, and then drop to leave the data clean for the further queries.

This query adding a new fields and calculating the number of points located within each polygon is what is done by the ArcGIS GP tool Spatial Join.

ALTER TABLE dbo.Neighborhoods
ADD PointCount int;

UPDATE
Polys
SET
[PointCount] = COUNTS.CrimeCount
FROM
dbo.Neighborhoods AS Polys
JOIN
(
SELECT
Polys.Name AS NeighborhoodName, Count(*) AS CrimeCount
FROM
dbo.Homicides AS Points
JOIN
dbo.Neighborhoods AS Polys
ON
Polys.Shape.STContains(Points.Shape) = 1
GROUP BY
Polys.Name
) AS COUNTS
ON
Polys.Name = COUNTS.NeighborhoodName

Add polygon name to points located within the polygon

To enrich the points layer with the information what polygon each point is located within you would need to use the STWithin function. In this example, we will add a new column to the homicides table so we know what neighborhood the crime has been committed.

Again, this query adding a new field and calculating the neighborhood name for the points located within each polygon is what is done by the ArcGIS GP tool Spatial Join.

SELECT TOP 10
Points.OBJECTID
,Points.INCIDENT_D
,Points.BORONAME
,Points.NUM_VICTIM
,Points.PRIMARY_MO
,Points.ID
,Points.WEAPON
,Points.LIGHT_DARK
,Points.YEAR
,Polys.Name AS NeighborhoodName
FROM
dbo.Neighborhoods AS Polys
JOIN
dbo.Homicides AS Points
ON
Points.Shape.STWithin(Polys.Shape) = 1
ORDER BY
OBJECTID
ALTER TABLE dbo.Homicides
ADD NeighborhoodName varchar(50);

UPDATE
Points
SET
[NeighborhoodName] = PointsInPolys.NeighborhoodName
FROM
dbo.Homicides AS Points
JOIN
(
SELECT
Points.OBJECTID
,Points.INCIDENT_D
,Points.BORONAME
,Points.NUM_VICTIM
,Points.PRIMARY_MO
,Points.ID
,Points.WEAPON
,Points.LIGHT_DARK
,Points.YEAR
,Polys.Name AS NeighborhoodName
FROM
dbo.Neighborhoods AS Polys
JOIN
dbo.Homicides AS Points
ON
Points.Shape.STWithin(Polys.Shape) = 1
) AS PointsInPolys
ON
PointsInPolys.ID = Points.ID

Summary statistics and frequency

This is a simplistic implementation of the Frequency GP tool in ArcGIS using a T-SQL stored procedure.

ALTER PROCEDURE dbo.FrequencyTable
@Columns varchar(500)
AS
BEGIN
EXEC ('SELECT COUNT(*) AS ' + @Columns +
' FROM dbo.HOMICIDES
WHERE WEAPON <> '''' AND LIGHT_DARK <> ''''
GROUP BY WEAPON,
LIGHT_DARK ORDER BY FREQUENCY DESC;');
END
GO
EXEC dbo.FrequencyTable 'dbo.HOMICIDES', 'FREQUENCY, WEAPON, LIGHT_DARK';

The Jupyter notebook with the result sets:

 

Advertisements

Developing Python GUI in ArcGIS Pro with PyQt

Many ArcGIS analysts and developers working with Python tools in ArcGIS Desktop or ArcGIS Pro would like to be able to build custom Python script tools extending the built-in framework of the GP tools dialog. I have answered a question Recommendations about graphic interface for ArcPy/Python script providing useful links on how to build an own GUI inside ArcGIS desktop application. I have also blogged about this topic in another post – Building custom UI tools for ArcGIS with Python.

Other users have also shown some progress. I have seen that Erik Pimpler has a couple of videos (Building GUI Applications for ArcGIS Desktop with Python – Part 1, Building GUI Applications for ArcGIS Desktop with Python – Part 2) that showcase how one can build custom GUI inside ArcMap using wxPython – a Python framework for GUI development. I have also seen Andrew Chapkowski integrating Tkinter into a Python add-in, but this is basically it. However, I do see that many users would like to be able to build own GUI inside ArcGIS using Python and/or using a designer or form builder to facilitate the process of the layout preparation. There is even an idea Form Builder for Python Tools on the ArcGIS Ideas website with quite a few votes.

When building a GUI app with Python, my personal choice is PyQt. It’s extremely mature and robust framework for GUI development which is a pleasure to work with. I have built some GUI utilities for text editing (think some of the Linux grep functionality in a Windows app) as well as SQL editor to work with file geodatabases. I have been incredibly productive building applications really quickly. There are so many useful tutorials on PyQt, so I have never really have stuck at any problem. You can build a helpful utility application in one evening.

Resources on PyQt development:

QGIS developers are familiar with this framework because they build QGIS plugins using PyQt, however if you are an ArcGIS developer, this is pretty new to you unless you have used PyQt for other projects. So, let’s see how you can set up an PyQt application with ArcGIS Pro.

  1. Install ArcGIS Pro (further just Pro). You will get a default conda environment called arcgispro-py3 which you can copy first into a new environment so you could restore it should something bad happen with the default one which we will be playing with.
  2. With the help of Python Package Manager in Pro, you need to install pyqt 5.6.0 into the default environment. This will quite take some time (5-10 mins).
  3. Now you are ready to write a PyQt application that can be called from a Python script tool.
  4. Run Pro, create and then save a new empty project.
  5. Create a new custom script tool in a toolbox and point it to a Python file with the following content.

Now you can run the script tool just like you would any other custom script tool and you should get this tiny app window open:

2017-11-25 16_45_14-Basic App

Note: if you are getting an error message trying to run the tool:

This application failed to start because it could not find or load the Qt platform plugin “windows”.

then you need to add a system environment variable QT_QPA_PLATFORM_PLUGIN_PATH with the path to the plugins folder used by PyQt5. If you have installed ArcGIS Pro into a default location for all users, you would need to provide this value: C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3\Library\plugins\platforms. Restart ArcGIS Pro to let it grab the environment variable on the next start.

If you did get this window – congratulations, you have just accessed arcpy from a Python program that was run as a PyQt5 application. You have accessed the currently open project in Pro and accessed its property. So, what does this imply?

This means you can write code within a PyQt5 application that will be accessing your current map and its layers as well as basically everything you can reach with arcpy. Now it is up to you to choose what GUI to build as Qt framework provides excellent set of widgets to choose from.

Note: you may have noticed that after you have run the script tool and the PyQt app window was opened, Pro thought that the script tool is still running, so the script tool will complete only upon closing the application. This behavior can be changed if you run your PyQt application in own thread, so as your application starts, the script tool will complete its execution. This can be done using the threading module.

Create a new Python Python module to use as the source code for the script tool:

This is what you would see.

2017-11-25 17_02_54-Basic App

Note: since the application runs in its own thread, it won’t be able to access the current project: it will run as if you were running it as a standalone script. You can of course still use arcpy and other packages you have in your Pro conda environment.

To give you a feeling of how powerful PyQt is and what kind of amazing tools you can build to let Pro users to work with, I’ve created a sample app that can showcase some of the PyQt power. In this application, you can choose an input point feature class and it will generate something called alpha shape (or concave hull). I’ve adopted some of the code you can find in the blog posts Drawing Boundaries In Python by HumanGeo and The fading shape of alpha by Sean Gillies.

In the application, you can choose what edges of the triangulation you would like to keep based on the distance between the edge vertices by using an interactive slider. To visualize the produced data, embedded version of Matplotlib plot is drawn. When you are happy with the result, you can add the produced data to ArcGIS Pro current map as layers. I will post the source code of this application in another post later on.

Below is the screen recording of running the Pro application and starting a custom script tool that in turn starts a PyQt application window.

alphas.gif

Removing redundant polyline vertices using arcpy

If you maintain a polyline feature class of some entities such as city roads, you may have seen that some of your features have redundant vertices. These vertices are not necessary to preserve the feature’s shape. Not only they increase the size of your dataset, but they also may slow down processing time as it takes longer time to process geometries with a larger number of vertices.

You can see the redundant vertices (small green points) highlighted on a source polyline (thin green line). The cleaned shape (thick blue line) has only the essential vertices (large blue points).

collinear_image

To automate the process of removing the redundant vertices, I’ve written a Python script. The tricky part was to find out whether the points one iterates over are collinear.

Take a backup of your data before running this script as it modifies the features in place!

  1. Script works for single part as well as multi-part lines.
  2. You can choose what to do with the curves: they are either preserved without removing any redundant vertices this particular feature might have OR densified with the specified distance value. Look up the .densify() method on arcpy.Geometry object. You could alternatively use the Densify GP tool to fix your curves before running this script if you would like to experiment with the deviation values.
  3. Only polyline feature classes are supported (no points/polygons). Could be extended to support polygons though.

Static code analysis and linting for Python developers

If you have spent some time writing Python code, you may have met some of the most common error types that can occur when building a program: we may try to use an undefined variable, call a function that was not imported from another module, or access a non-existing property of a class instance. Today, I would like to explore how these types of errors could be found when programming with a very dynamic language such as Python.

Regardless of what Python IDE you use – Wing IDE, PyCharm, Visual Studio or Visual Studio Code – you may use PyLint which seems to be the most commonly used Python linter. This tool will report errors it founds in your code. Some IDEs such as PyCharm and VSCode (with Python extension) have built-in support for PyLint which means you will see the errors highlighted in your code as you type.

There are quite a few resources online (such as this, this, and this) that showcase the linting and type hinting for Python developers. In this post, I’d like to mention some tricks for arcpy developers because there are things we need to tweak. I use Wing IDE, but you need to find out whether your Python IDE supports static code analysis.

Working with arcpy objects

import arcpy
from arcpy.arcobjects import Result
fc = r’TemplateData\TemplateData.gdb\USA\states’
result = arcpy.GetCount_management(fc) # type: Result

By defining the type of the variable result, Wing IDE will be aware of the output variable type before the actual execution of the program. The type of variables can be indicated by a comment that follows it. When typing result.get, Wing gives you properties that Result object has, such as getOutput() and getMessages().

Note that if would not add the inline comment, the type of result variable would not be known to the IDE.

import arcpy
fc = r’TemplateData\TemplateData.gdb\USA\states’
result = arcpy.GetCount_management(fc) # not defining the output type

When you would type result. to inspect the variable, Wing tries to guess the type of the object, and you can see the properties of int, dict, list, and ​tuple types. In other words, Wing is sure that what it gets back as a result variable is not a set and not a ​str object, but it can be of some other data type.

This means that you might annotate your variables helping Wing analyzing your code. This will save you time because you wouldn’t need to jump to the ArcGIS Desktop Help to see what properties an object has.

Taking care of missing members in arcpy.da module

If you will run PyLint for this Python code,

import arcpy
fc = r’TemplateData\TemplateData.gdb\USA\states’
print [f[0] for f in arcpy.da.SearchCursor(fc, ‘OID@’)]

you will get this error message:

Line 3 Column 21 E1101: Module 'arcpy.da' has no 'SearchCursor' member

Not all members are exposed in arcpy​; this is applicable even to UpdateCursor​ and InsertCursor. Obviously, you would like to filter this error out. This can be done by using a PyLint configuration file. I prefer to have a pylintrc file in the folder with my Python modules with this content:

[TYPECHECK]
generated-members = arcpy.da.SearchCursor, arcpy.da.UpdateCursor

You can read more about other PyLint options here.

Type hints for Python 2.7

As most of the Python development still targets Python 2.7 (ArcGIS Desktop rather than ArcGIS Pro), you might look into how type hints can be used with Python 2.7 code. Below is a .gif animation to demonstrate how type hints can be used for a function definition or for a variable definition.

static_typing1

Access ArcMap UI from Python comtypes

I have blogged earlier about the possibility to submit calls to methods defined in classes stored within a .NET library either using clr (with the help of pythonnet) or using native ctypes depending on how your libraries have been compiled. I have also shared some of the examples on how one can use comtypes to access COM-objects of ArcGIS Desktop to be able interact with ArcObjects via Python without writing any C#, Java, or C++ code.

This time, I need combine these approaches. I have to execute a tool on a toolbar in ArcMap (this can be a custom .NET add-in or a built-in tool such as Full Extent) from Python code while working in an open ArcMap session. This is different from what I’ve done before because I have always worked with comtypes in a stand-alone mode accessing existing map documents stored on the disk. This time, I have to use comtypes within an ArcMap session hooking into the currently open map document.

Fortunately, this wasn’t as hard as I expected. I just had to use the standard interface to get into the application, and then into the map document, and then into the command bars available. The last thing you would need to do is to find the tool you need and then execute it. I’ve published a snippet that would get you started.

The first one is for calling the Full Extent button on the Tools toolbar in ArcMap. The second one is for calling a button on a custom toolbar that is part of a custom .NET add-in in ArcMap. You can both of these samples as a part of a custom Python script tool or a Python add-in.

 

Dividing a polygon into a given number of equal areas with arcpy

I was recently searching the Internet trying to find any tool that would let me split an arbitrary polygon inside a geodatabase feature class into multiple polygons of equal area. ArcGIS does provide this functionality as a part of the parcel fabric functionality. Unfortunately, there is a lot of work involved in setting up the parcel fabric and there is a lot to learn before you will be able to divide your parcels. So I was looking for a simpler solution that would work directly with the geometry of a polygon.

However, I was not able to find any solution that would work and most helpful posts on the forums were pointing either at parcel fabrics or providing some ideas on the implementation of the workflow using multiple geoprocessing tools and some custom code. I found a nice ArcGIS custom script tool called Polygon Bisector which

computes a line that bisects, or divides in half, a polygon area along a line of constant latitude or longitude.

So, this would work great if you need to split a polygon creating a number that follows the exponent of two (2, 4, 8, 16, 32 and so forth). This is because after dividing a polygon into two polygons of equal area (now you have 2 polygons), you could divide each of them into two parts again (now you have 4 polygons), and so on. Since I want to be able to divide a polygon into an arbitrary number of areas, I had to write my own tool.

I have solved this problem this way. Say I want to have the polygon of area 1000 sq. m. divided into 5 equal areas:

  1. Get an extent of a polygon.
  2. Construct a polyline using the vertices of the polygon’s extent with a tiny shift of coordinates.
  3. Cut the polygon into two halves using this line.
  4. Find what is the area of the smallest polygon.
  5. If the area is smaller than the 200 sq. m. (that is, fifth part of the polygon), the shift the line again and re-run steps 2-4.
  6. If the area is 200 sq. m. or larger, than leave this part and keep working with the polygon that is left essentially running through the steps 2-5.
  7. When the original polygon has been successfully divided into equal areas, they are inserted into a new feature class along with the source polygon attributes.

The illustration of the cutting lines with the extent polygon is below.

extent_figure

This approach has several disadvantages, though.
First, if your polygon is very large and you want the parts of the polygon to have the same area with the minimal difference, the tool execution will take a lot of time as you will need to shift the cutting lines a few centimeters, cut the polygon into halves, and evaluate the result.
Second, you can only choose between the North-South or West-East direction of the cutting lines. You won’t be able to specify the angle yourself.

However, this tool works great for the target use I kept in mind when writing it. Using 0.5 meters as the step for moving the cutting line, the difference between the largest and the smallest sub-polygons was just around 1%. Running the same code for the same polygon with the shift value set to 0.05 meters (5 cm), I have observed the difference value to be around 0.1%.

The illustration of the polygon subdivision is below (West-East to the left, North-South to the right).

divide_polys_figure

The code is available at the GitHub Gist as usual. Run this code inside the Python window in ArcMap while having a single polygon selected.

 

 

Implementing Copy Parallel with ArcObjects

In ArcMap, in the Editor toolbar, you can find a command Copy Parallel which  will create a copy of a selected line feature at an offset distance you specify. This command is unique to the Editor toolbar and is not available as a geoprocessing tool. This makes it impossible to create parallel lines in batch (say, create lines to the both sides of the selected line feature starting with 10 meters up to 100 meters with the step of 20 meters).

A neat thing that exists in ArcMap is that you can assign a shortcut key to any command you can find in the Customize menu > Commands tab window. So, your workflow could look like this: you start the editing session, select a feature, use your shortcut key (e.g., Ctrl-Alt-P) and enter the offset value in the Copy Parallel window that appears. However, this implies that you would need to type an offset value for every copy to be created and you might also need to change other options in the window.

It’s important to understand that creating a parallel copy of a feature in the context of ArcGIS is not the same as creating a feature with a specified offset (because I thought it was). Compare the features produced using the Copy Parallel command (to the left) and using Python code (that just moves each vertex of a feature using an offset value) (to the right):

parallel

As you can see, the length of line segments can change when using the Copy Parallel command where as just moving vertices with offset preserves length of the segments.

You can call the Copy Parallel command using ArcObjects (just like any other command) but you won’t be able to supply any input parameters such as the offset value. So, it’s no different from assigning a shortcut key. The only solution left is to implement an own version of Copy Parallel that will mimic the built-in command. I have written an ArcGIS .NET add-in that can be called from a custom toolbar.

One can specify the intervals in a text box and then click the button.

parallelpara1

This is the code for the combo box item and for the button item: