Printing pretty tables with Python in ArcGIS

This post would of interest to ArcGIS users authoring custom Python script tools who need to print out tables in the tool dialog box. You would also benefit from the following information if you need to print out some information in the Python window of ArcMap doing some ad hoc data exploration.

Fairly often your only way to communicate the results of the tool execution is to print out a table that the user could look at. It is possible to create an Excel file using a Python package such as xlsxwriter or by exporting an existing data structure such as a pandas data frame into an Excel or .csv file which user could open. Keep in mind that it is possible to start Excel with the file open using the os.system command:

os.system('start excel.exe {0}'.format(excel_path))

However, if you only need to print out some simple information into a table format within the dialog box of the running tool, you could construct such a table using built-in Python. This is particularly helpful in those cases where you cannot guarantee that the end user will have the 3rd party Python packages installed or where the output table is really small and it is not supposed to be analyzed or processed further.

However, as soon as you would try to build something flexible with the varying column width or when you don’t know beforehand what output columns and what data the table will be printed with, it gets very tedious. You need to manipulate multiple strings and tuples making sure everything draws properly.

In these cases, it is so much nicer to be able to take advantage of the external Python packages where all these concerns have been already taken care of. I have been using the tabulate, but there are a few others such as PrettyTable and texttable both of which will generate a formatted text table using ASCII characters.

To give you a sense of the tabulate package, look at the code necessary to produce a nice table using the ugly formatted strings (the first part) and using the tabulate package (the second part):

The output of the table produced using the built-in modules only:

builtin

The output of the table produced using the tabulate module:

tabulate

 

 

Advertisements

Using Python start up script for all Python interpreters

This post would be helpful for users of desktop GIS software such as ArcMap who need to use Python inside those applications.

There is a not so well known trick to trigger execution of a Python script before any Python interpreter on your system starts.

Note: If you are a QGIS user, there is a special way of achieving this. Please see the question Script that runs automatically from the QGIS Python Console when QGIS starts, using schedule tasks on Windows 10 for details.

The way to do this is to set up an environment variable called PYTHONSTARTUP in your operating system. You need to do two things:

  1. Create an environment variable that would point to a path of a valid Python script file (.py) with the code that you would like to get executed before any Python interactive interpreter starts. Look at the question [Installing pythonstartup file](https://stackoverflow.com/questions/5837259/installing-pythonstartup-file) for details.
  2. Write Python code that you would like to get executed.

A very important thing to consider is that

The file is executed in the same namespace where interactive commands are executed so that objects defined or imported in it can be used without qualification in the interactive session.

This means that you can do a bunch of imports and define multiple variables which would be available to you directly at the start up of your GIS application. This is very handy because I often need to import the os and sys modules as well as import arcpy.mapping module and create mxd variable pointing to the current map document I have open in ArcMap.

Here is the code of my startup Python script which you can modify to suit your needs. If your workflow relies on having some data at hand, then you might need to add more variables exposed. I have ArcMap and ArcGIS Pro users in mind.

I have included in the example above a more specific workflow where you would like to be able to quickly execute SQL queries against an enterprise geodatabase (SDE). So, when ArcMap has started, you only need to create a variable conn pointing to a database connection file (.sde) and then use the sql() function running your query. Thanks to the tabulate package, the list of lists you get back is drawn in a nice table format.

2018-03-15 12_55_03-PythonWindowArcMap

 

 

Changing selection color in exporting ArcMap layout with ArcPy

If you are an ArcMap user and you export your map layouts into images or .pdf files using arcpy, you may have noticed that the selection symbol stays always the same (the default light green color). I was looking recently for a way to change the selection color that would be respected when exporting map layouts programmatically using arcpy.mapping.ExportToPNG().

Changing a selection color in ArcMap (Selection menu > Selection Options) does change the selection color but it is only respected when exporting the layout manually from the File menu or when exporting the layout in the Python window (it respects the selection color set in ArcMap since I am inside a live ArcMap session). Exporting a layout using arcpy though always uses the default light green color for selection.

I have found out that the arcpy.mapping.Layer object does not expose the selection symbol property according to the docs of the Layer:

Not all layer properties are accessible through the Layer object. There are many properties available in the ArcMap Layer Properties dialog box that are not exposed to the arcpy scripting environment (for example, display properties, field aliases, selection symbology, and so on).

Adding duplicate layers pointing to the same data source and setting the definition queries manually sounds really cumbersome, particularly when I don’t know beforehand what layers I will need to run selection on.

I have solved it in another way.

  1. I pre-create three layers for each of the shape types (point, polyline, and polygon) using the necessary selection symbology (done manually).
  2. Before the export of the map document’s layout, I duplicate the layer that I will run selections on and update its symbology using the .lyr file.
  3. Then I set definition query on the dummy layer that will serve as a selection symbol layer only.

All of this is done programmatically, so the only thing I really need to supply is the name of the layer that selections will run on. It works great and gives me flexibility to use a custom color for selections on any layer inside any map document.

The script source code:

Working with SQL in Jupyter notebook and dumping pandas into a SQL database

I have posted previously an example of using the SQL magic inside Jupyter notebooks. Today, I will show you how to execute a SQL query against a PostGIS database, get the results back into a pandas DataFrame object, manipulate it, and then dump the DataFrame into a brand new table inside the very same database. This can be very handy if some of your operations are better done using plain SQL, for instance, with the help of spatial SQL functions, but other ones are easier to perform using pandas​, for instance, adding and calculating new columns when the data you need to access is not stored inside the database.

The steps are as below:

  • Connect to a database using the SQL magic syntax
  • Execute a SELECT SQL query getting back a result set
  • Read the result set into a pandas DataFrame object
  • Do stuff with the DataFrame object
  • Dump it to a new table in the database you are connected to using the PERSISTcommand

Again, this is very handy for prototyping when you need to produce some tables doing a database design or when you need to have a new temporary table created for some application to read or when running some integration tests and needing to have a mock-up table to work with.

The sample notebook is available as a gist:

 

Adding IPython SQL magic to Jupyter notebook

If you do not use the %%sql magic in your Jupyter notebook, the output of your SQL queries will be just a plain list of tuples. A better way to work with the result sets returned is to draw them as a table with the headers. This is where the IPython SQL magic gets very handy. You can install it using pip install ipython-sql. Refer to its GitHub repository for details of the implementation.

You have to connect to a database and then all your subsequent SQL queries will be aware of this connection and the result sets are also drawn nicely in a table. Another neat feature of the %%sql magic is that you will be able to get the result of your SQL query as a pandas data frame object if you would like to proceed working with the result set using Python. The result object of SQL query execution can be accessed from a variable _. This is because IPython’s output caching system defines several global variables; _ (a single underscore) stores previous output just like the IPython interpreter.

Look into this sample Jupyter notebook for illustration.

 

Patching subprocess module for Python 2.7

Some of the scripts I code in Python still target version 2.7 since this is the Python version that is being shipped with ArcGIS Desktop installation. I have recently found a bug in Python 2.7 subprocess module. This is a module that will let you execute your programs from a Python script. A very typical use case would be to run a console application such as an .exe or a .bat file to do some data processing outside of Python script.

The bug has been registered at bugs.python.org with the id 1759845:

subprocess.call fails with unicode strings in command line

So if you would try to execute the following in Python 2.7 (note the file containing the Unicode é):

import subprocess
subprocess.Popen(['python.exe', u"C:\Temp\file_char_é.py"])

You would get the error message:

UnicodeEncodeError: ‘ascii’ codec can’t encode character u’\xe9′ in position 28: ordinal not in range(128)

This means that you would not be able to use the Python 2.7 subprocess module on Windows if any of your call arguments contain Unicode. I have tried to search for alternatives to the subprocess module, but other packages I found such as envoy only provide a nice wrapper around the subprocess module. This means the bug would still make it impossible to call your programs if any of the arguments contains a Unicode character. Python 3.5 does not have this bug, but since I need to target the Python 2.7, I had to find a workaround.

After many hours of search, I finally found a patched version of subprocess – subprocessww which can be obtained from PyPi.

Version of subprocess with patches to allow unicode arguments. This allows Unicode parameters to be passed to POpen on Python 2.7.

Usage:

# Use patched version of subprocess module for Unicode on Windows
import subprocessww

# Load the regular POpen, which is now patched
from subprocess import Popen

I am very thankful to the The SABnzbd Team for providing the patch. When using the patched version, it is possible to supply arguments containing the Unicode characters.

Desktop PyQt application for executing SQL queries against Esri file geodatabase

As I was always interested in building GUIs for some GIS operations, I thought that exploring PyQt deeper would be fun. A project has started as an experimental playground to see what functionality PyQt provides. As I spent more time working with PyQt, I have started wondering what it would take to build a useful desktop application.

Because I often find myself in need of querying a file geodatabase’s datasets, I have decided to build a GUI-based SQL editor that would let me execute SQL queries against a table or a feature class and draw the result set in a table form for further visual inspection. I have thought that other GIS users and developers may find this application useful and I therefore have decided to start a GDB: GitHub repository to let others take advantage of my work. Here it comes, check it out!

GDBee_sample

GDBee is a PyQt5 desktop application which you can use to write and execute SQL queries against tables and feature classes stored inside an Esri file geodatabase. It provides a tabbed interface which lets you connect to multiple geodatabases within a single session. It has a rich code editor featuring auto-completion (with suggestions), syntax highlight, and result set export.

If you are a QGIS Desktop user, you are already able to execute SQL against file geodatabases using QGIS DBManager plugin, but GDBee has some extra features that the DBManager is missing (for instance, you do not need to add your datasets as layers first and you can choose to copy individual cells instead of the whole row) from the result table.

Because Python is so widely used in the GIS community, I thought it would make sense to take advantage of Python bindings of GDAL (via GEOS) to be able to connect to a file geodatabase and execute SQL queries. Working with a file geodatabase via GEOS makes it possible to take advantage of SQL spatial functions that are otherwise inaccessible to an ArcGIS user!

The application provides multiple features:

  • Working with multiple geodatabases using multiple tabs (single geodatabase connection per tab)
  • Exporting result sets into various formats (WKT strings to paste into QGIS using QuickWKT plugin, arcpy code to paste into ArcMap Python window, pandas data frame via .csv file (which can be taken into geopandas), and Markdown table via .md file or plain text)
  • Executing SQL query with respect to the user selection (only selected text is executed)
  • Loading/saving SQL queries from and to text files on disk
  • Convenient keyboard shortcuts for query execution (F5 and Ctrl-Enter) and tab interaction (Ctrl-N and Ctrl-W for opening and closing tabs)
  • Copying data from the result set table (either individual cell values or row(s) with the headers preserved) – ready to paste properly into an Excel sheet
  • Choosing whether you want to have geometry column in the result set as WKT

You can look at its GitHub repository: GDBee: GitHub repo. You may find this PyQt desktop application useful if:

  • You would like to be able to interrogate your file geodatabase datasets using SQL (instead of Python-based interface such as Esri arcpy or open-source ogr)
  • You are an ArcGIS user that does not want to have QGIS Desktop installed just to be able to execute SQL against a file geodatabase
  • You use SQL on a daily basis working with spatial databases (such as PostgreSQL or Microsoft SQL Server) and want to be able to execute ad hoc SQL queries against file geodatabase datasets without loading them into a proper DBMS database
  • You already have a lot SQL code targeting tables stored in a DBMS spatial database and you would like to be able to reuse this code when targeting a file geodatabase

Do you think there is some other functionality that should be added? Please let me know by submitting an issue in the repository.