Useful resources in computer science/math for GIS Analysts

A lot of people who are studying GIS at school, or who are already working as GIS analysts or consultants, wonder what competence will make them attractive to employers and what domains of expertise will be in demand in the foreseeable future.

The question GIS professionals usually ask is how much a GIS analyst should learn from other domains: how much math, statistics, programming, and computer science? Naturally, knowing what kind of GIS-specific expertise is in demand is also very helpful. I have several posts on how to get better at GIS here, here, and here.

Knowing which GIS tools can do which jobs is definitely helpful, much like a woodworker should know what tools he has in his toolbox and what else is available in the woodworking shop. Finding an appropriate tool for a certain job is not that hard nowadays with search engines and Q&A sites. However, the ability to understand how data processing tools work and what happens behind the scenes, so that you can interpret the analysis results, is indispensable.

What is often true for many GIS analysts is that during their studies the main focus was on GIS techniques and tools, while math and CS courses were supplementary. This makes sense, and the graduates are indeed most often competent GIS professionals capable of operating various GIS software suites, providing user support, and performing all kinds of spatial analysis. It also happens that a career changer who has never studied GIS ends up working as a GIS analyst and needs to catch up a bit. For those who feel they lack the background they should have had a chance to learn during their studies, or for those who just want a broader view and a deeper understanding of GIS, I have compiled a list of useful links and books. Please enjoy!

There are lots of great questions answered on the GIS.SE web site; here are just a few:

Great books:

Spatial Mathematics: Theory and Practice through Mapping (2013)
This book provides a gentle introduction to some mathematical concepts with a focus on mapping and might be a good place to start learning math for GIS. No advanced background in math is required; high-school math will be sufficient.

Table of contents

  • Geometry of the Sphere
  • Location, Trigonometry, and Measurement of the Sphere
  • Transformations: Analysis and Raster/Vector Formats
  • Replication of Results: Color and Number
  • Scale
  • Partitioning of Data: Classification and Analysis
  • Visualizing Hierarchies
  • Distribution of Data: Selected Concepts
  • Map Projections
  • Integrating Past, Present, and Future Approaches

Mathematical Techniques in GIS, Second Edition (2014)
This book gives you a fairly deep understanding of the math concepts that are applicable in GIS. To follow the first five chapters, you don’t need anything beyond high school math. Later on, the book assumes good knowledge of math at the level of a college Algebra II course. If you feel it is getting hard to read, take an Algebra II course online at Khan Academy or watch some videos from MIT to catch up first, and then get back to the book. What I really liked about this book is that there are plenty of applicable examples of how to implement certain mathematical algorithms to solve basic GIS problems, such as the point-in-polygon problem, finding whether lines intersect, and calculating the area of overlap between two polygons. This could be particularly useful for GIS analysts who are trying to develop their own GIS tools and are looking for some background on where to get started with the theory behind the spatial algorithms.
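To give a taste of the kind of algorithm the book covers: a classic point-in-polygon test casts a horizontal ray from the point and counts how many polygon edges it crosses. Here is a minimal sketch of that idea in Python (my own illustration, not code from the book):

def point_in_polygon(x, y, polygon):
    # Ray casting: an odd number of edge crossings means the point is inside.
    # polygon is a list of (x, y) vertex tuples.
    inside = False
    j = len(polygon) - 1
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # does the horizontal ray from (x, y) cross the edge (i, j)?
        if (yi > y) != (yj > y) and x < (xj - xi) * (y - yi) / float(yj - yi) + xi:
            inside = not inside
        j = i
    return inside

print(point_in_polygon(1, 1, [(0, 0), (4, 0), (4, 4), (0, 4)]))  # True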

Table of contents

  • Characteristics of Geographic Information
  • Numbers and Numerical Analysis
  • Algebra: Treating Numbers as Symbols
  • The Geometry of Common Shapes
  • Plane and Spherical Trigonometry
  • Differential and Integral Calculus
  • Matrices and Determinants
  • Vectors
  • Curves and Surfaces
  • 2D/3D Transformations
  • Map Projections
  • Basic Statistics
  • Correlation and Regression
  • Best-Fit Solutions

GIS: A Computing Perspective, Second Edition (2004)
The book is a bit dated, but it is probably the best computer science book for a GIS professional. It provides a very deep understanding of the computational concepts GIS is built upon.

Table of contents

  • Introduction
  • Fundamental database concepts
  • Fundamental spatial concepts
  • Models of geospatial information
  • Representation and algorithms
  • Structures and access methods
  • Architectures
  • Interfaces
  • Spatial reasoning and uncertainty
  • Time

Practical GIS Analysis (2002)
This book is a unique example of a book for GIS professionals who want to see how basic GIS algorithms and tools work. The exercises give readers a chance to execute many common GIS algorithms by hand, which lets them truly understand even some complex operations such as generating a TIN or finding the shortest path on a street network. The software used as a reference is ArcView GIS 3, but the book is still relevant since the GIS concepts haven’t changed much since then.

Table of contents

  • GIS Data Models
  • GIS Tabular Analysis
  • Point Analysis
  • Line Analysis
  • Network Analysis
  • Dynamic Segmentation
  • Polygon Analysis
  • Grid Analysis
  • Image Analysis Basics
  • Vector Exercises
  • Grid Exercises
  • Saving Time in GIS Analysis

Maths for Map Makers (2004)
I haven’t read this book, so I don’t have anything to say about it. Sorry!

Table of contents

  • Plane Geometry
  • Trigonometry
  • Plane Coordinates
  • Problems in Three Dimensions
  • Areas and Volumes
  • Matrices
  • Vectors
  • Conic Sections
  • Spherical Trigonometry
  • Solution of Equations
  • Least Squares Estimation
  • References
  • Least Squares models for the general case
  • Notation for Least Squares

Exploring Spatial Analysis in GIS (1996)
I haven’t read this book either. It might be hard to find, but I have listed it here just in case.

Good luck with the readings!

Publishing Python scripts as geoprocessing services: best practices

Why Python instead of models?

If you have been publishing your ModelBuilder models as geoprocessing (further GP) services, you have probably realized that it can be quite cumbersome. If you haven’t moved to Python, I think you really should. Authoring Python scripts has serious advantages over authoring models when it comes to publishing GP services. During the publishing process, ArcGIS Server turns data paths and anything else that may need to change into variables, and this can mess up the model if you haven’t followed the guidelines on authoring GP services. My rule of thumb was that once there are more than 10 objects in a model, it is a good time to switch to Python. Another thing is that you can easily make modifications in the Python code without republishing; in contrast, you need to republish the model each time you want to release an updated version of the GP service. Finally, since you don’t need to restart the GP service when updating the Python file (unlike republishing a model, which requires restarting the service), there is no downtime and users won’t notice anything.

What happens after publishing?

Let’s take a look at what is going on under the hood. You have run your script tool in ArcMap and got the result published as a service. Now you can find your service and all the accompanying data inside the arcgisserver folder somewhere on your disk drive. The path would be: C:\arcgisserver\directories\arcgissystem\arcgisinput\%GPServiceName%.GPServer

You will find a bunch of files within the folder. Let’s inspect some of them:

  • serviceconfiguration.json – provides an overview of all the properties of the service, including its execution type, enabled capabilities, output directory and many others. Here you will see all the settings you usually see in the Service Editor window.
  • manifest.xml and manifest.json – provide an overview of the system settings that were used while publishing the service. These are not files you would usually want to inspect.

Inside the folder esriinfo/metadata there is a file named metadata.xml which is really helpful because there you can see what date a service was published. Two tags you should look at are:

  • <CreaDate>20141204</CreaDate>
  • <CreaTime>15443700</CreaTime>

Since this information is not exposed from the GUI in ArcGIS Desktop or ArcGIS Server Manager, this is the only way to find out what time the service was created. This information may be very handy when you are unsure about the release versions.
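If you need to check this for many services, it is easy to script; a minimal sketch (the service path below is a placeholder following the pattern above):

import xml.etree.ElementTree as ET

# placeholder path to one service's metadata file
meta = r"C:\arcgisserver\directories\arcgissystem\arcgisinput\MyService.GPServer\esriinfo\metadata\metadata.xml"
root = ET.parse(meta).getroot()
print("Published: {0} {1}".format(root.findtext(".//CreaDate"),
                                  root.findtext(".//CreaTime")))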

Inside the extracted/v101 folder, you will find the result file and the toolbox you worked with when publishing the GP service. Here you will also find a folder, named after the folder where your source Python file was stored, containing the source Python file.

What are the best practices for organizing the Python code and files?

Let’s look inside the Python file. You might have noticed that when publishing, some of the variables you declared were renamed to g_ESRI_variable_%id%. The rule of thumb is that you shouldn’t hard-code literal strings; turn paths to datasets and names into variables instead. You don’t strictly have to, since Esri will extract those inline variables for you, but it is much harder to refactor code full of generated variable names, so you are better off organizing your code correctly from the beginning.
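For example, instead of scattering literal paths through the code, collect them into variables at the top of the script; a minimal sketch (all paths below are placeholders):

import os
import arcpy

# keep every path and name in one place at the top of the script
gdb_path = r"C:\GIS\data\city.gdb"
parcels_fc = os.path.join(gdb_path, "Parcels")
buffers_fc = os.path.join(arcpy.env.scratchGDB, "ParcelBuffers")

arcpy.Buffer_analysis(parcels_fc, buffers_fc, "25 Meters")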

When you run the script tool in ArcGIS Desktop, the scratch geodatabase is located at C:\Users\%user%\AppData\Local\Temp\scratch.gdb. However, after publishing the tool, the service gets a new scratch geodatabase. If you need to inspect the intermediate data created, go to the scratch geodatabase (the path can be retrieved with arcpy.env.scratchGDB), which will be a new file geodatabase for each run of the GP service, following this notation: c:\arcgisserver\directories\arcgisjobs\%service%_gpserver\%jobid%\scratch\scratch.gdb.

Keep in mind that a GP service will always use its local server jobs folder for writing intermediate data, and this behavior cannot be changed. But having the service write to the scratch workspace is actually a lot safer than writing to a designated location on disk, because there is no chance of multiple GP service instances trying to write to the same location at the same time, which can result in deadlocking and concurrency issues. Remember that each submitted GP job is assigned its own unique scratch folder and geodatabase in the arcgisjobs folder.

Make sure you don’t use arcpy.env.workspace in your code; always declare a path variable and assign it to the folder or geodatabase connection. To build dataset paths, use os.path.join() instead of concatenating strings. For performance reasons, use the in_memory workspace for intermediate data with the following notation:

var = os.path.join("in_memory", "FeatureClassName")

You can take advantage of the in_memory workspace, but for troubleshooting it may be better to write intermediate data to disk so you can inspect it later. In this case, it is handy to create a variable called something like gpTempPlace which you can switch between “in_memory” and a local file geodatabase, depending on whether you are running clean code in production or troubleshooting the service in a staging environment, as sketched below.
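A hedged sketch of this toggle (the variable names are my own):

import os
import arcpy

# flip this flag when troubleshooting in staging so intermediate data
# lands in the scratch geodatabase on disk instead of in_memory
debug = False
gpTempPlace = arcpy.env.scratchGDB if debug else "in_memory"
tmp_fc = os.path.join(gpTempPlace, "tmp_result")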

Make sure you don’t use the same name for a variable and a feature class or field name; this sometimes leads to unexpected results when running the service. It might be helpful to append “_fc” to feature class variables and “_field” to field variables. This way you will also be able to tell them apart much more easily. The same applies to feature class and feature layer (created with the Make Feature Layer GP tool) names.

Remember that you can adjust the logging level of ArcGIS Server (in Manager) or of the GP service only (in the Service Editor window) for troubleshooting purposes. It is often useful to set the Message Level setting to Info before publishing the GP service into production, because this will give you detailed information about what exactly went wrong when running the GP service. You can access this information either from the Results window in ArcMap or from the ArcGIS Server logs in Manager.

How do I update the published service?

A special word should go to those users who need to publish GP services often while making changes in the code. It is important to understand that after publishing the GP service, the copied Python file and toolbox don’t maintain any connection to the source Python file and toolbox you authored. This implies that after making edits to the script tool, you need to push those changes to the published service on the server.

There are two types of changes you can make to your source project: the tool parameters in the script tool properties and the Python code. Keep in mind that you cannot edit the published toolbox, so if you add a new parameter or modify an existing parameter’s data type, you need to republish the service. However, if you have only modified the Python source code, there is no need to republish the whole service; you only need to replace the contents of the Python file.

Even though you can automate the service publishing workflow with Python, it still takes time to move the toolbox and the Python code files. You can therefore save a lot of time by finding a way to replace the Python code in the published Python file. To update it, you really have just two options: either copy and replace the published file with the updated source file, or copy/paste the source code. This may work if you have a tiny Python script and all the paths to the data on your machine and on the server are the same, which is plausible when Desktop and Server are on the same machine. If the Python source gets all its paths and dataset names from a configuration file, you could also safely replace the published Python file. However, this is still an extra thing to do.

The best practice I came to while working for two years on the GP services is to split the code files and the tool itself. Let me explain.

How do I separate the toolbox logic and Python code?

Create a Python file (a caller script) which will contain all the import statements for the Python modules and your own files. By appending the path to your Python files, you will be able to import the files you are working on; this is very useful when your GP service consists not just of one file but of multiple modules.

import sys
import socket
import arcpy

# make your business-logic modules importable from a shared location
sys.path.append(r"\\" + socket.gethostname() + r"\path to code files")
import codefile1  # your Python file with business logic
import codefile2  # your Python file with business logic

This file should also include all the parameters which are exposed in the script tool.

Param1 = arcpy.GetParameterAsText(0)
Param2 = arcpy.GetParameterAsText(1)
Param3 = arcpy.GetParameterAsText(2)

Then you define a main function which will be executed when running the GP service. You call the functions defined within the files you imported.

def mainworkflow(Param1, Param2):
    Result = codefile1.functionName(Param1, Param2)
    return Result

It is also handy to add some parameter handling logic. When you run the script tool in ArcMap before publishing, the values you supply become the default values users will see when executing the GP service from ArcMap or any other custom interface. To avoid that, you can leave those parameters empty and return an empty output, for publishing purposes only.

if Param1 == "" and Param2 == "":
    Result = ""
else:
    Result = mainworkflow(Param1, Param2)
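If your tool also returns a result to the client, push it into an output parameter at the end of the caller script; a one-line sketch (the index 3 is hypothetical, match it to your own tool's parameter order):

arcpy.SetParameterAsText(3, Result)  # index 3 is hypothetical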

Create a script tool from this caller Python file, defining the parameters and their data types. After the tool is published as a GP service, you can keep working with the Python files that contain only the code that actually does the job.

After making and saving changes in the code, feel free to run the GP service directly: the caller Python file (published as a GP service) will import codefile1 from the folder you specified and run the code. There is no need to restart the GP service or re-import your updated module.

Which IDE should I choose for Python development for ArcGIS?

A bit of history…

I started writing Python scripts in 2011 and my first IDE was PyScripter. I quite liked it at first because it is so much better than Notepad… but after some time I got frustrated over things such as the inability to keep working on my code while a debug process was running, and crashes that happened now and then. Some other useful features I was used to from Visual Studio were not available either.

Choosing an IDE for Python might be hard, especially if you haven’t used one before. However, I heard that Wing IDE is used at Esri by quite a few Python developers. So, I switched to Wing IDE in 2014 and really liked it from the very beginning; it has a very clean UI and it is very easy to customize its appearance.

The intellisense (code autocompletion) works fine and I get most of the arcpy module objects in the suggestions. Some classes, such as those in arcpy.da, are implemented in C with no Python source wrapping them, so IDEs cannot provide autocompletion for those; but overall, while working with arcpy I get all the autocompletion I really need. It is also very easy to switch which Python interpreter should be used for a certain file or project. This matters because 32-bit Python can fail to process large datasets; with 64-bit Python as your executable (if you have it installed), you can handle large GIS data with no problem, provided you have enough RAM on your machine. I’ve done some processing of routing data (many GBs) and the Python process was eating up around 12GB of RAM.

Organizing your projects is also much easier because you can add files of many other formats, such as SQL or HTML. Wing provides a great way to organize your datasets, documentation, and code. It is also capable of finding differences between two Python files interactively, which is something you will definitely want when comparing several versions of the same script you were working on.

As you see, it has many useful features which make coding a lot more efficient and, in fact, pleasant. Here are some of them I use all the time:

Wing Source browser

PyScripter also has a similar window called Code Explorer, but it is not as robust as Wing’s Source Browser. I was able to build a very nice function call tree with Wing while working with a large Python module I inherited. You can jump to the place in the code where an object is used, which can be really helpful when refactoring legacy code or when building reference materials for yourself or peer developers.

Wing Debug probe

I use this window when writing new code and when hunting for a bug in existing code. I usually run the program almost to the very end, leaving a couple of clean-up rows, then fire up the Debug Probe and start working forward. Because your code is evaluated on the fly, there is no need to restart the whole debugging process by re-running the Python file.

Because the workflow you are developing may be executed on some large datasets, it is much more efficient not to re-run the program too often. After I’ve verified that the code I wrote is correct, I copy-paste it into the Python file. You can also run just a portion of your code in the Debug Probe, which is very useful for evaluating just a few rows.

You bring up the Debug Probe and press the + icon in the top right, near the Options menu, to lock a range of lines into being the Active Range, which you can then execute by pressing the cog icon that appears in the shell.

Wing Source assistant

Using Wing, I found that I don’t go to the ArcGIS Help as often as I did when using PyScripter. Wing provides an interactive way to get the syntax and usage tips for any function or class in an imported module. As you type the name of a tool, the Source Assistant window updates and shows all the information about that tool; all the GP tool help, with input parameters and valid options, is available right in the IDE window.

Some other key points
  • It is not expensive. This is a reasonable price for a very good piece of software.
  • Wingware provides excellent and fast support; always helpful and prompt.
  • It is very easy to authorize the software offline without any hassle; it is just a matter of copying/pasting the license code. You are allowed to install the Wing license you purchase on multiple machines; read more about the licensing terms here.
  • The company trusts its users. You can just install it on a new laptop or on a virtual machine you do your coding on, using the same license you have. When your license expires, Wing will run for 10 minutes at a time without any license or activation at all, or a trial license can be used until any license problem is resolved.
  • Wing has a UI that just works. It starts fast and has easy-to-navigate panels and popups. The software has always been responsive and has never crashed since I started using it last year.

Come on, go and get yourself the Wings to fly and code like a pro!

How to be efficient as a GIS professional (part 3)

6. Automate, automate, automate

Whatever you are doing, take a second to think whether you will need to run the sequence of steps you’ve just completed again. It may seem at first that you are very unlikely to run the same sequence again, but in fact you may find yourself performing it over and over later on.

Automating is not only about saving time; it is also about quality assurance. When you do something manually, there is always a chance to forget a certain step or detail, which can potentially lead to an error. When the workflow is automated, you can always see exactly what steps are performed. An automated workflow is also a piece of documentation which you can share with others or use yourself as a reference.

Don’t trust your memory: you think you know what columns you’ve added to the table and why, but get back in two weeks and you will be surprised by how little of it you remember. If you leave the job and hand the work over to a new person, she will be happy to inherit well-maintained documentation and a concise description of the workflow she will be responsible for.

For desktop GIS automation, think about using Python for geospatial operations (think truncating tables + appending new data + performing data checks), as sketched below. For database automation, use SQL (adding new columns + altering column data types). Feel free to build SQL scripts with commands for adding/deleting/calculating columns and copying data, too. By preserving those scripts, you will always be able to re-run them on another table or in another database, or modify them to match your needs. This also gives you a record of the changes performed in your database; it is just like adding a field manually and then writing down that you added a field of type X to table Y at time Z, except that building a SQL script saves you the bookkeeping.
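As a simple illustration, a truncate-and-append update in Python might look like this (the paths are placeholders for your own environment):

import arcpy

# placeholder paths to a production table and its staging source
target = r"C:\connections\prod.sde\dbo.PARCELS"
source = r"C:\GIS\staging.gdb\PARCELS"

arcpy.TruncateTable_management(target)               # wipe old rows
arcpy.Append_management(source, target, "NO_TEST")   # load fresh data

# a basic data check: refuse to release an empty table
if int(arcpy.GetCount_management(target).getOutput(0)) == 0:
    raise ValueError("Append produced an empty table")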

7. SQL, SQL, SQL

Another advantage of SQL for data processing is that it is quite vendor-neutral and can be executed as is, or with really minor adjustments, on most DBMS platforms. This also applies to the SQL spatial functions, which provide ISO- and OGC-compliant access to the geodatabase and database alike. Being able to execute SQL queries and perform data management operations is a real advantage when you work in a large IT environment: you won’t always have a network connection to the production environment for data updates, and using ArcGIS might not be possible. Running a Python script would require a Python installation on some machine and, if you use arcpy, ArcGIS Desktop. Running SQL code, which has no dependencies, might be your only alternative.

Many folks don’t know that one can use pure SQL with an enterprise geodatabase stored in any supported DBMS; there is a lot you can do this way.

8. Python, Python, Python

I have blogged about using the spatial functions of SQL Server earlier. Remember that you can also execute SQL from Python code using the arcpy.ArcSDESQLExecute class (see the sketch after this list). Here is the SQL reference for query expressions used in ArcGIS, some of which you can use in the where clauses of arcpy.da cursors. Learn some of the useful Python libraries which could save you time. Look at:

  • Selenium for automating FTP data downloads if this happens often and you have to browse through a set of pages;
  • the scipy.spatial module for spatial analysis such as building Voronoi diagrams, finding distances between arrays, constructing convex hulls in N dimensions and many other things;
  • NumPy, a fundamental package for scientific computing with Python, for handling huge GIS datasets (both vector and raster) with arcpy.
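For instance, here is a minimal sketch of the arcpy.ArcSDESQLExecute class mentioned above (the connection file path and query are placeholders; for multi-row results, execute returns a list of row lists):

import arcpy

# placeholder .sde connection file and query
conn = arcpy.ArcSDESQLExecute(r"C:\connections\gdb.sde")
rows = conn.execute("SELECT PARCEL_ID, SHAPE.STArea() FROM dbo.PARCELS")
for row in rows:
    print(row)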

Read more in What are the Python tools/modules/add-ins crucial in GIS and watch the Esri video Python: Useful Libraries for the GIS Professional.

Get a chance to learn more about the SQL and Python and how you could take advantage of them in your work!

Build ArcGIS network dataset from OpenStreetMap

I have blogged previously on how you can get street data for use in ArcGIS Network Analyst. If you have obtained TomTom or Nokia (Navstreets) data, you can easily build a network dataset (further ND) using the Esri SDP toolbox, which I have blogged about earlier.

If you don’t have any other sources for the data, consider using OpenStreetMap (OSM) data if it is applicable to your business case. I have blogged earlier on how to get OSM data into an ArcGIS network dataset, but that approach is outdated and I now recommend another way to build the network. The overall workflow is fairly straightforward:

Download OSM data

Go to the Export tab on the OSM home page and choose an area to download. You can either draw a rectangle or specify the bounding box coordinates. If the area you choose is too large, you will have to use one of the sources listed in the left panel for bulk data downloads. Clicking the Overpass API link will download a map file with no extension; rename it by adding the .osm extension, or fetch it programmatically as sketched below.
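If you prefer scripting the download, the Overpass API also exposes a simple map call; a minimal sketch (Python 2; the bounding box extent is a placeholder, and the endpoint's size limits still apply):

import urllib

# bbox order: min_lon,min_lat,max_lon,max_lat (placeholder extent)
bbox = "24.90,60.15,24.95,60.18"
url = "http://overpass-api.de/api/map?bbox=" + bbox
urllib.urlretrieve(url, r"C:\GIS\data\map.osm")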

Install ArcGIS Editor for OSM

Now you have to download ArcGIS Editor for OSM, choosing the 10.0, 10.1 or 10.2.x Desktop version. The installation file will install the required libraries as well as a geoprocessing (further GP) toolbox whose tools you will access later on. Read through the documentation on how to build an ND from OSM data on the ArcGIS OSM Editor home page. After installing, you should find the OpenStreetMap Toolbox in your ArcToolbox folder in ArcGIS.

Load OSM file into geodatabase

Start by running the Load OSM File GP tool. Activate the Conserve Memory option if you have a large OSM file (larger than the amount of RAM), because during this process all nodes are fetched; if you fail to do so, the process might crash. I had a hard time processing some large files on an 8GB virtual machine, partly because of the Windows 2GB paging limit. Running the processing under 64-bit Python might help, but this is something I have not tested yet. I do remember that some network data processing algorithms I developed failed to build an adjacency matrix for a network with 15 million edges when running under 32-bit Python, but completed with no problems under 64-bit Python, taking almost 10GB of RAM on my machine.
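Before loading a big file, it is worth checking which interpreter you are actually running under; a tiny sketch:

import struct
# 8-byte pointers mean a 64-bit interpreter; large OSM extracts are much
# safer to process there, provided you have enough RAM
print("{0}-bit Python".format(struct.calcsize("P") * 8))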

Build a network dataset

When the data has been loaded into a feature dataset, you are ready to build a network dataset with the Create OSM Network Dataset GP tool. You will need to provide a Network Configuration File, which you can find in the provided C:\Program Files (x86)\ArcGIS\Desktop10.1\ArcToolbox\Toolboxes\ND_ConfigFiles folder. This is an XML file which provides parameters for interpreting your road types into edge cost evaluators. DriveGeneric.xml is for a generic motorcar routing network, and there is another one which can be used for cycling networks. There is one more file there, DriveMeters.xml: this configuration offers faster runtime performance (fewer Script evaluators) but will only work with coordinate systems that have a linear unit of meters. Let the tool run; it might take a lot of time if you have a large dataset. After the ND is built, feel free to modify its properties and test how it works.

I suggest starting by downloading a small area to verify the tools are working as expected; the map.osm file you download should not be larger than 20MB. After you have verified the workflow, feel free to try larger datasets. There are some other useful tools in the ArcGIS OSM Editor toolbox you might want to explore, such as tools for designing maps based on OSM data and for loading data into a PostgreSQL database.

Building custom UI tools for ArcGIS with Python

I often see people looking for a way to extend ArcGIS: some need an extra tool that is missing from the core product; for others it is about integration with an existing system or application. A good portion of users want custom dialogs and UI elements embedded as part of a geoprocessing tool dialog window. In this post, I have tried to summarize the options you have for customizing ArcGIS, including developing new features on top of the core product.

ArcGIS-based solutions (script tools + Python add-ins)

If you develop a geoprocessing tool as a Python script, you can make a custom script tool which will have the same GUI as any core geoprocessing tool: panels and boxes with Browse buttons, drop-down lists, check boxes, multi-value tables and many others. Read through all the parameter types you have available (you can let users click on the map, draw features and use those features in the analysis, among other advanced features). I am sure quite a few of you have not known about this rich functionality.

You can embed your script as a script tool in a custom geoprocessing toolbox or in a Python toolbox. There are two great posts to review, Comparing custom and Python toolboxes and Why learn/use Python Toolboxes over Python Script Tools?, to learn when to use which. If you are just starting with ArcGIS, consider trying script tools first before playing with Python toolboxes; setting up a script tool without a Python toolbox is much easier for a beginner. A minimal Python toolbox looks roughly like the sketch below.
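For reference, here is a minimal Python toolbox (.pyt) skeleton (the tool name and buffer logic are placeholders of my own):

import arcpy

class Toolbox(object):
    def __init__(self):
        self.label = "Sample toolbox"
        self.alias = "sample"
        self.tools = [BufferTool]

class BufferTool(object):
    def __init__(self):
        self.label = "Buffer features"
        self.description = "Illustrative tool: buffers the input features."

    def getParameterInfo(self):
        in_fc = arcpy.Parameter(displayName="Input features", name="in_fc",
                                datatype="GPFeatureLayer",
                                parameterType="Required", direction="Input")
        out_fc = arcpy.Parameter(displayName="Output features", name="out_fc",
                                 datatype="DEFeatureClass",
                                 parameterType="Required", direction="Output")
        return [in_fc, out_fc]

    def execute(self, parameters, messages):
        arcpy.Buffer_analysis(parameters[0].valueAsText,
                              parameters[1].valueAsText, "50 Meters")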

As a last resort, if you want your end users to get a custom dialog box when they run your tools, plus some additional parameter handling, consider embedding your Python script tool into a custom C++/.NET tool, which can provide some additional GUI features; you will still be limited to the GP tool GUI scope anyway. I am convinced that it is not a good idea to invest in developing with ArcObjects, since this technology has a very steep learning curve and will eventually become obsolete as ArcGIS Pro and its .NET SDK gain popularity. Moving ArcObjects code into ArcGIS Pro is not supported, and therefore in my opinion it is better to stay with Python unless you really have to develop something special on top of ArcGIS right now.

Keep in mind that you also have Python add-ins, which provide additional functionality with windows, messages and dialogs. They are easy to build and distribute, and if you are familiar with Python and arcpy, you can start developing them in no time at all.

Desktop app / embed external GUI into a toolbox tool

If you want to develop a stand-alone application (such as an .exe file for Windows), you would need to convert your Python script into an .exe file with a utility such as py2exe. For this script to run, you would still need ArcGIS installed on the machine, because it needs the arcpy site-package which is installed along with ArcGIS.

As for the custom GUI, you have various Python libraries such as Tkinter (shipped with the core Python installation), PyQt/PySide (free Qt bindings), wxPython (bindings to the wxWidgets C++ library), and Kivy (a great cross-platform library with a rich UI). I have tried them all and liked PyQt most. Here are a couple of GIS.SE resources to learn from:

Because of the ArcMap architecture, you might have trouble running a custom GUI in the same process as ArcMap. I’ve seen some examples with Tkinter, but in general there are many issues to tackle.

From what I’ve experienced, I can say that it is probably better either to stay with the core GUI interface, which most of the time provides everything you’d ever need, or to develop a custom application that imports arcpy (with some extra tuning in configuration) and works with a custom GUI (such as one developed with PyQt) without starting any ArcGIS application at all; a minimal example follows below. There is an ArcGIS Idea, Form Builder for Python Tools, but it is hard to say whether it is going to be implemented any time soon, so you’d better look for other alternatives.
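A minimal sketch of such a stand-alone app, using Tkinter since it ships with Python (Python 2 module names, as installed with ArcGIS Desktop; the tool choice and buffer distance are placeholders):

import Tkinter as tk
import tkFileDialog
import arcpy

# let the user pick a shapefile and buffer it, no ArcMap session needed
def run_buffer():
    fc = tkFileDialog.askopenfilename(title="Pick a shapefile")
    if fc:
        arcpy.Buffer_analysis(fc, fc.replace(".shp", "_buf.shp"), "100 Meters")

root = tk.Tk()
root.title("Tiny arcpy GUI")
tk.Button(root, text="Buffer a shapefile...",
          command=run_buffer).pack(padx=20, pady=20)
root.mainloop()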

I’ve done some tests embedding a custom Python script into a toolbox in ArcGIS Pro 1.0, invoking a PyQt 4 script, and there were no problems setting it up and running it. If using ArcGIS Pro is an option for you, you might consider this: it is much easier to embed custom Python tools with their own GUI into Pro than into ArcMap. One of the gotchas is that Pro uses Python 3.4, and it is 64-bit Python, which has certain implications for compatibility with PyQt or any other GUI platform of your choice.

SQL Server spatial functions for GIS users

If you have been using SQL Server for some time, you’ve probably heard of its spatial data support. This might be particularly interesting for anyone who uses a desktop GIS for data management and analysis. If you are an ArcGIS user with enterprise geodatabases stored in SQL Server databases, you might have wondered whether it is possible to interact with the spatial data directly. This is useful when you don’t have a chance to use ArcMap to access the database due to restrictions (permissions, network connections or software compatibility).

Well, you actually can do a whole lot with your geographic data using just SQL. It is important that the Shape field is defined as the Geometry or Geography data type. For most GIS work, you would probably choose the Geometry type, which represents data in a Euclidean (flat) coordinate system. As soon as you have a geodatabase feature class whose Shape field is of the Geometry type, you can use native SQL Server tools to interact with both the feature class attributes and its geometry.

Beginning with ArcGIS 10.1, feature classes created in geodatabases in SQL Server use the Microsoft Geometry type by default. To move your existing feature classes to the Geometry storage type, use the Migrate Storage geoprocessing tool or a Python script.
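A hedged sketch of the scripted route (the connection file path is a placeholder, and I am assuming the GEOMETRY configuration keyword maps to the Microsoft Geometry storage type in your geodatabase's dbtune settings; check the Migrate Storage tool documentation for your version):

import arcpy

# placeholder path to a feature class in an .sde connection
fc = r"C:\connections\gdb.sde\dbo.PARCELS"
arcpy.MigrateStorage_management(fc, "GEOMETRY")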

Alright, so after you have copied your file geodatabase feature class into a SQL Server geodatabase, you are ready to use native SQL to interact with the spatial data.

Let’s select all the features from the Parcels feature class.

SELECT * FROM dbo.PARCELS

Because we have a SHAPE column that is of the Geometry type, we get an extra tab in the results grid, Spatial results, where you can see your geometries visualized.


Let’s see what coordinate system our feature class was defined in.

DECLARE @srid INT = (SELECT TOP 1 shape.STSrid FROM dbo.PARCELS)
SELECT @srid AS SRID, srtext AS Name
FROM sde.SDE_spatial_references
WHERE auth_srid = @srid

Here we use <GeometryColumnName>.STSrid to get the spatial reference id (SRID) of the coordinate system of the first feature. Because our geographic data is stored in a projected coordinate system (and as the Geometry type), we cannot get its name from the core SQL Server spatial references table, sys.spatial_reference_systems.

Here is why:

The coordinate systems in this table are for the geography type only as it contains information about the ellipsoid that is required to perform calculations. No such information is required to perform calculations for projected coordinate systems on the plane used by the geometry type, so you are free to use any reference system you like. For the calculations done by the geometry type, it is the same no matter what you use.

Next, let us explore what kind of geometry is stored within a table. In SQL Server, it is possible to store different types of geometry (such as polygons and polylines) within one table.

Let us see if it is true:

SELECT Id,GeomData AS Geometry,GeomData.STAsText() AS GeometryData
FROM [testgdb].[dbo].[GeneralizedData]


Yes, indeed: we store features of different geometry types in one table and SQL Server has no problem with that. To visualize this table in ArcMap, though, you would need to use a query layer, which is basically a stand-alone table defined by a SQL query. ArcGIS can handle only one type of geometry per feature class, which is why you will get a choice of which geometry type you want to look at.

After adding this layer into ArcMap via the New Query Layer dialog, you will be able to see the polygons (provided you’ve chosen polygons). The query layer is read-only, so you cannot edit the features in ArcMap. If you have a SQL Server table (not registered with the geodatabase) that stores multiple types of geometries, you can easily switch between them by adding multiple query layers into ArcMap, each defining the kind of geometry you want to work with.

Let us keep working with a geodatabase feature class which has only polygons. Let’s check if it’s true:

SELECT Shape.STGeometryType() AS GeometryType FROM dbo.PARCELS

Alright, so we know already what kind of coordinate system the data is stored in and we know that there are polygons. Let us get the perimeter (length) and the area of those polygons.

SELECT PARCEL_ID, SHAPE.STArea() AS Area,
SHAPE.STLength() AS Perimeter
FROM dbo.PARCELS 

We can also get the coordinates of each polygon within our feature class. Note that the start point and the end point are identical; that is because each polygon is considered to be closed:

SELECT Shape.STAsText() AS GeometryType FROM dbo.PARCELS

This is what we will get:
POLYGON ((507348.9687482774 687848.062502546, 507445.156252367 687886.06251058145, 507444.18750036607 687888.56250258372, 507348.9687482774 687848.062502546))

There are similar functions, such as .STAsBinary(), which returns the Open Geospatial Consortium (OGC) Well-Known Binary (WKB) representation of a geometry instance, and .AsGml(), which returns the Geography Markup Language (GML) representation.

We can also check the number of vertices per polygon:

SELECT PARCEL_id,
Shape.STAsText() AS GeometryDesc,
Shape.STNumPoints() AS NumVertices
FROM dbo.PARCELS
ORDER BY NumVertices DESC

Alright, that was probably enough querying data. Let us check what kind of GIS analysis is available to us with native SQL. The easiest way to get started is probably to process the features of a feature class and write the resultant geometries into a new table.

DROP TABLE [ParcelEnvelope]
CREATE TABLE [dbo].[ParcelEnvelope]([Id] [int] NOT NULL,
[PolyArea] int,[GeomData] [geometry] NOT NULL) ON [PRIMARY]

INSERT INTO ParcelEnvelope (Id,GeomData,PolyArea)
SELECT PARCEL_ID AS Id,
SHAPE.STEnvelope() AS GeomData,
SHAPE.STArea() AS PolyArea
FROM dbo.PARCELS
ORDER BY OBJECTID

This will create a new table where the envelopes of each parcel polygon will be written to.

Let us now create some buffers on a road centerlines geodatabase feature class:

DROP TABLE [RoadBuffer]
CREATE TABLE [dbo].[RoadBuffer]([Id] [int] NOT NULL,
[GeomData] [geometry] NOT NULL) ON [PRIMARY]

INSERT INTO [RoadBuffer] (Id,GeomData)
SELECT OBJECTID AS Id,
SHAPE.STBuffer(50)
FROM dbo.Road_cl
ORDER BY OBJECTID

You can of course write newly generated features into a geodatabase feature class, not just a SQL Server database table. You need to create a new polygon feature class and then run the SQL below. This will create buffer zones for every line found in the Road_cl feature class.

DELETE FROM FC_ROADBUFFERS
INSERT INTO FC_ROADBUFFERS(OBJECTID,SHAPE)
SELECT OBJECTID AS OBJECTID,
SHAPE.STBuffer(50) AS SHAPE
FROM dbo.Road_cl
ORDER BY OBJECTID

Please refer to the Microsoft Geometry Data Type Method Reference to get a full list of available functions and more detailed description.

Try doing some other analysis, such as finding which features intersect or overlap, or how many points are located within a certain polygon. There is so much you can do! To learn more, get the book Beginning Spatial with SQL Server 2008, which has tons of examples and will also help you understand the basics of spatial data structures. I have read this book and really liked it; I think it is a must-read for anyone using spatial SQL.

I hope this short introduction to what you as a GIS user can do with SQL Server will help you take advantage of native SQL functions whenever using a desktop GIS is not an option.