Stress testing ArcGIS Server with Python

I’ve blogged earlier about using Apache JMeter for stress testing ArcGIS Server and for performance tests. It’s a terrific tool that can do so much more than that though.

If you can’t use JMeter, you can perform a similar stress test on an ArcGIS Server service by using the Python multiprocessing module together with the ArcREST Python package, an open source package developed by Esri and the user community and hosted on GitHub.

In this code, a pool of workers is created, and each worker exports a map image. If no service instance is available, the export request triggers the creation of a new one, which uses a certain amount of RAM and CPU. To learn more about ArcREST, please visit the Wiki page of the project. To learn more about Python multiprocessing, check the Help page and some SO pages here and here.
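A minimal sketch of such a worker pool is shown here, assuming Python 3 and plain HTTP requests against the Export Map operation of a map service instead of ArcREST calls. The function names, the service URL and the default numbers are hypothetical. It uses multiprocessing.pool.ThreadPool, the thread-backed pool from the same multiprocessing module, because the workers mostly wait on the network; multiprocessing.Pool could be swapped in to use separate processes.

```python
import time
import urllib.parse
import urllib.request
from multiprocessing.pool import ThreadPool


def build_export_url(service_url, bbox, size=(800, 600)):
    """Build an Export Map request URL for a map service (hypothetical helper)."""
    params = {
        "bbox": ",".join(str(value) for value in bbox),
        "size": "{0},{1}".format(*size),
        "format": "png",
        "f": "image",
    }
    return service_url.rstrip("/") + "/export?" + urllib.parse.urlencode(params)


def fetch(url):
    """Request the URL once and return (HTTP status, elapsed seconds)."""
    start = time.time()
    with urllib.request.urlopen(url, timeout=60) as response:
        response.read()
        return response.status, time.time() - start


def run_stress_test(url, total_requests=100, workers=10, fetch_func=fetch):
    """Send total_requests requests through a pool of concurrent workers."""
    pool = ThreadPool(workers)
    try:
        return pool.map(fetch_func, [url] * total_requests)
    finally:
        pool.close()
        pool.join()


# Example call (the service URL is an assumption, replace it with your own):
# url = build_export_url(
#     "http://localhost:6080/arcgis/rest/services/SampleWorldCities/MapServer",
#     bbox=(-180, -90, 180, 90))
# results = run_stress_test(url, total_requests=200, workers=20)
# print(max(elapsed for status, elapsed in results))
```

Raising the number of workers above the maximum number of instances configured for the service is what should make the busy instances visible in the monitoring tools.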

For live monitoring of the busy instances of an ArcGIS Server service, you can use a sample from the Esri ArcGIS Server development team. Another application built on top of that sample can be found here on GitHub.

Essential extra toolsets for ArcGIS Desktop professionals

If you are an ArcGIS user, at some point you may find that you lack a tool you need to perform a certain analysis. Or you just need to hack some datasets really quickly and are too lazy to write your own tool for that. If you are not a developer, though, your only option is to see whether a model can be built in ModelBuilder to replicate the required operation (by combining multiple geoprocessing tools).

If ModelBuilder doesn’t help, there is a good chance you will start searching for this kind of tool on the Internet. A great place to start is Google. The GIS Stackexchange web site is indexed very well, and you will find many of the tools you are looking for in answers to questions on that site too.

Another place to search for tools is the site that has replaced arcscripts; if you are on ArcGIS 9.3+, you probably wouldn’t go to arcscripts at all. You don’t need to sign in there to search for tools. Remember to enable the Show ArcGIS Desktop Content option and filter the results to show Tools only. Here is a great reference page on how to do efficient searches:

In my “GIS Analyst toolbox”, I have collected many useful tools over the last few years. Some of them were written for older versions of ArcGIS. In some cases they will just work; in others, you can get an idea of how the tools are implemented and rewrite them to run on your version (if you can program, of course, or use ModelBuilder). Here is a short list that includes some of them:

SampleArcPyMappingScriptTools_10_v1 (ArcGIS toolbox, Python code)
These (~20) tools were created as representative samples for how arcpy.mapping could be used to perform a variety of tasks. A must have for anyone who works with map documents often! It will also let you get started with arcpy.mapping module in no time, awesome!

Cartography Tools

  •    Adjust Layout Text Width (from ArcMap)
  •    Find and Replace a Text String
  •    Page Layout Element Report
  •    Shift Page Layout Elements
  •    Update Symbology

Export and Printing Tools

  •    Append PDF Documents
  •    Export Map Documents to PDF
  •    Print Data Driven Page(s)
  •    Print Map Document(s)

MXD and LYR Management Tools

  •    Add Layer File into MXD (from ArcMap)
  •    Find Broken Data Sources (Report)
  •    Find Data Source (Report)
  •    Find Layers Projected on the Fly (Report)
  •    Multi Layer File Summary (Report)
  •    Multi MXD Summary (Report)
  •    Replace Layer with Layer File
  •    Replace Layer with Layer File (from ArcMap)
  •    Update MXD from pGDB to fGDB
  •    Update MXD tags

Tools that will let you create your own cartographic effects
Lots of useful tools for any cartographer who wants to add some extra mapping features.

Database inspector (ArcGIS toolbox, Python code)
This is a must have for any ArcGIS analyst doing any Python development for geodatabase maintenance and geodata management. This is a suite of tools for analyzing components of a geodatabase and for finding the differences between geodatabase schemas. Can print properties of fields, feature classes, relationship classes, domains, tables. Can compare particular geodatabase components between two workspaces. Can compare two geodatabases and tell you the difference. A great tool I use daily.

ArcGIS Server ServerAdminToolkit 10.1+ (by Kevin Hibma from Esri) (ArcGIS toolbox, Python)
These tools perform some common administrative tasks with an ArcGIS Server machine. All of these tasks can be accomplished through the UI (ArcMap), the Web Manager or the REST Administration page. By using tools you can automate redundant workflows or chain common workflows together. Most of these tasks, turned into tools, have more detailed explanations in the help. This package is composed of three main parts: Tools, Standalone executable, and Code.

ArcREST (ArcGIS toolbox, Python)
A set of Python tools to assist working with the ArcGIS REST API for ArcGIS Server (AGS), ArcGIS Online (AGOL), and ArcGIS WebMap JSON. An amazing package that any ArcGIS Online or ArcGIS Server admin wants to have! Nearly everything you can do with the REST API, you can do with ArcREST. A must have.

Spatial Analyst Supplemental Tools
A collection of script tools to supplement Spatial Analyst Tools.

  • Create Dendrogram
  • Draw Signature
  • Erase Raster Values
  • Filled Contours
  • Maximum Upstream Elevation
  • Peak
  • Tabulate Area 2
  • Viewshed Along Path
  • Zonal Statistics As Table 2

Geomorphometry & Gradient Metrics (ArcGIS toolbox, Python)

Urban Network Analysis Toolbox for ArcGIS (Python)
The tools incorporate three important features that make them particularly suited for spatial analysis on urban street networks.

National Water-Quality Assessment (NAWQA) Area-Characterization (ArcGIS toolbox, Python)
The toolbox is composed of a collection of custom tools that implement geographic information system (GIS) techniques used by the NAWQA Program to characterize aquifer areas, drainage basins, and sampled wells.

X-ray for ArcGIS (ArcMap, ArcCatalog) (add-in)
The X-Ray for ArcCatalog add-in can be used to develop, refine and document your geodatabase designs. The X-Ray add-in for ArcMap can be used to document the properties of your map documents (MXDs).

Marine Geospatial Ecology Tools (MGET) (ArcGIS toolbox, Python)
A free, open-source geoprocessing toolbox that can help you solve a wide variety of marine research, conservation, and spatial planning problems. MGET plugs into ArcGIS and can perform tasks such as:

  •    Accessing oceanographic data from ArcGIS
  •    Identifying ecologically-relevant oceanographic features in remote sensing imagery
  •    Building predictive species distribution models
  •    Modeling habitat connectivity by simulating hydrodynamic dispersal of larvae
  •    Detecting spatiotemporal patterns in fisheries and other time series data

Favorite tools and resources for cartographers
A compilation of some of the most popular tools and sources of information about maps and cartographic design.

Geospatial Modelling Environment (known as HawthsTools)
GME provides you with a suite of analysis and modelling tools, ranging from small ‘building blocks’ that you can use to construct a sophisticated work-flow, to completely self-contained analysis programs. It also uses the extraordinarily powerful open source software R as the statistical engine to drive some of the analysis tools. One of the many strengths of R is that it is open source, completely transparent and well documented: important characteristics for any scientific analytical software.

Design of WebGIS back-end: architecture considerations

I have spent the last two years doing a lot of Python development and designing and implementing Web GIS solutions that included ArcGIS Server, geoprocessing services and an ArcGIS API for JavaScript (hereafter JS) web client. I would like to share an idea that I have come to like.

If you need to do something, try doing it at the back-end

Imagine you have a JS web application where users will work with some feature services via a web map. They can select multiple features and calculate the sum of the values features have in a field (or fields). Let’s go through alternatives you have now.

  1. Pre-calculate the values you think your users will query and store them in the database.
    This works fine when you know that your users are going to generate reports on certain fields often and performance is crucial. It can make sense to calculate certain values beforehand and store them. The disadvantages are the additional storage and the need to keep the values updated: the calculated field depends on other fields, and their values can change. This implies re-calculating the report field regularly, as part of a daily or weekly routine depending on the workflow.
  2. Get the feature’s data from the ArcGIS Server feature service and calculate the requested value on-the-fly in the client.
    Unless you are retrieving complex geometry, this operation wouldn’t cost you much. The problem is that the volume of JS (or TypeScript) code will increase, and every modification of the code implies a new release, which can be a painful process if you need to compress your code and move things around. In addition, if the amount of data you work with is rather large, there is a good chance the web browser will get slow and performance will degrade significantly.
  3. Use the database server to calculate the values.
    This has become my favorite over the last few years. This approach has multiple advantages.
    First, the operation runs on the database server machine, which has enough RAM and CPU resources, so you are not limited by the capacity of the web browser. Database servers are very good at calculating values: this kind of operation is very inexpensive because in most cases it does not involve cursors. You can also work inside a transaction, which provides a higher level of data integrity (it is hard to mess up the database since you can roll back).
    Second, you can use SQL. It might not sound like an advantage at first, but remember that code is written once and read many times. Readability counts. SQL is a clean way of communicating the workflow, and database code (such as stored procedures) is very easy to maintain. Unlike with JS, you work with just one database object and don’t really have any dependencies on the system, provided that you have a database server of a certain version and the privileges required to create and execute stored procedures.
    Finally, by letting the database server do the work for you, you expose a procedure that other clients can use as well. You don’t need to modify the client code, and by updating the SQL code in one place, you automatically make the change available to all the applications that work with it.
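To make the third alternative concrete, here is a minimal sketch of summing a field for a set of selected features on the database side. It uses Python’s built-in sqlite3 module as a stand-in for the enterprise database server, and the table name parcels, the field area_ha and the function sum_area are hypothetical; in a real setup the same query would typically live in a stored procedure.

```python
import sqlite3

# In-memory database standing in for the real database server.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE parcels (objectid INTEGER PRIMARY KEY, area_ha REAL)")
connection.executemany(
    "INSERT INTO parcels VALUES (?, ?)",
    [(1, 10.0), (2, 20.5), (3, 4.5), (4, 100.0)])
connection.commit()


def sum_area(connection, object_ids):
    """Sum area_ha for the selected features; the database does the work."""
    object_ids = list(object_ids)
    if not object_ids:
        return 0
    # One placeholder per selected ID keeps the query parameterized.
    placeholders = ",".join("?" for _ in object_ids)
    sql = ("SELECT COALESCE(SUM(area_ha), 0) FROM parcels "
           "WHERE objectid IN ({0})".format(placeholders))
    return connection.execute(sql, object_ids).fetchone()[0]


total = sum_area(connection, [1, 2, 3])
print(total)  # 35.0
```

The client only sends the IDs of the selected features and receives a single number back, so no attribute data has to travel to the browser.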

ArcREST: Python package for administering ArcGIS Server and ArcGIS Online/Portal

ArcREST is a great toolset I found some time ago. It is for anyone who administers ArcGIS Online, ArcGIS Portal or ArcGIS Server. In short, it is a Python wrapper for the Esri REST API. I used to write many Python scripts to update the properties of ArcGIS Server services in batch, but I don’t need to write anything like that anymore, because everything I did on my own can now be done with ArcREST. If you are an ArcGIS Online / Portal admin, you should definitely take a look at this module: it can save you a lot of time, and you won’t need to author your own scripts for managing ArcGIS Online content and organization settings.

This Python package is authored by the Esri Solutions team and is publicly available on GitHub. You can download the source code, optionally install the package, and then use it on your local machine just like any other Python package. If you don’t want to install the package, you can make the arcrest and arcresthelper folders importable by adding this to your Python file:

import sys
sys.path.append(r"path to arcrest folder")  # e.g. C:\GIS\Tools

Provided that you have a folder named arcrest in the example Tools folder, when you run the Python file, it will be able to import the arcrest package and access its modules.
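A slightly more defensive variant can be sketched as follows. The helper name add_package_parent is hypothetical and not part of ArcREST; it only checks that the package folder really exists before touching sys.path, so that a wrong path fails early with a clear message instead of a later ImportError.

```python
import os
import sys


def add_package_parent(parent_folder, package_name="arcrest"):
    """Append parent_folder to sys.path if it contains package_name."""
    package_dir = os.path.join(parent_folder, package_name)
    if not os.path.isdir(package_dir):
        raise IOError("No '{0}' folder found in {1}".format(
            package_name, parent_folder))
    if parent_folder not in sys.path:
        sys.path.append(parent_folder)
    return package_dir


# Example call (the folder is an assumption):
# add_package_parent(r"C:\GIS\Tools")
# import arcrest
```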

To get an overview of this Python package, take a look at this excellent DevSummit 2015 video where developers of ArcREST talked about it.

Even though this is not a full implementation of the Esri REST API, it covers most of it, and Esri developers update the code to include the latest changes in the REST API. If you are going to use it on a daily basis, it is a good idea to clone the repository and pull the changes now and then to get the latest code.

At first I felt kind of sad that all the Python code I wrote for administering ArcGIS Server won’t be used any longer, but at the same time I am so glad that ArcREST was developed. It is a great piece of software that will let you get started in no time at all and access all your server/online resources with Python.

Caveat: it does have some dependencies on the arcpy package, which is used for converting feature sets into JSON and back, but apart from that you should be able to run the tools on a machine with no ArcGIS software installed whatsoever.

Publishing Python scripts as geoprocessing services: best practices

Why Python instead of models?

If you have been publishing your ModelBuilder models as geoprocessing (hereafter GP) services, you have probably realized that it can be quite cumbersome. If you haven’t moved to Python, I think you really should. Authoring Python scripts has serious advantages over authoring models in the context of publishing GP services. During the publishing process, ArcGIS Server turns data and anything else that may need to change into variables, and this can mess up the model if you haven’t followed the guidelines on authoring GP services. My rule of thumb was that once a model has more than 10 objects, it is a good time to switch to Python. Another advantage is that you can easily modify the Python code without republishing; in contrast, you need to republish the model each time you want to release an updated version of the GP service. Finally, since you don’t need to restart the GP service when updating the Python file (republishing a model does require restarting the service), there is no downtime for the service and users won’t notice anything.

What happens after publishing?

Let’s take a look at what is going on under the hood. You have run your script tool in ArcMap and got the result published as a service. Now you can find your service and all the accompanying data inside the arcgisserver folder somewhere on your disk drive. The path would be: C:\arcgisserver\directories\arcgissystem\arcgisinput\%GPServiceName%.GPServer

You will find a bunch of files within the folder. Let’s inspect some of them:

  • serviceconfiguration.json – provides an overview of all the properties of the service, including its execution type, enabled capabilities, output directory and many others. Here you will see all the settings you usually see in the Service Editor window.
  • manifest.xml and manifest.json – provide an overview of the system settings that were used while publishing the service. These are not files you would usually want to inspect.

Inside the folder esriinfo/metadata there is a file named metadata.xml which is really helpful because there you can see what date a service was published. Two tags you should look at are:

  • <CreaDate>20141204</CreaDate>
  • <CreaTime>15443700</CreaTime>

Since this information is not exposed from the GUI in ArcGIS Desktop or ArcGIS Server Manager, this is the only way to find out what time the service was created. This information may be very handy when you are unsure about the release versions.
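Reading the two tags can be scripted. The sketch below assumes that CreaDate is formatted as YYYYMMDD and that the first six digits of CreaTime are HHMMSS (the trailing digits are ignored); the function name published_on and the sample XML are illustrative only, and the real metadata.xml contains many more elements.

```python
import datetime
import xml.etree.ElementTree as ET


def published_on(metadata_xml):
    """Return the creation date and time stored in the metadata XML text."""
    root = ET.fromstring(metadata_xml)
    # iter() finds the tags wherever they are nested in the document.
    date_text = next(root.iter("CreaDate")).text
    time_text = next(root.iter("CreaTime")).text
    return datetime.datetime.strptime(
        date_text + time_text[:6], "%Y%m%d%H%M%S")


sample = ("<metadata><Esri>"
          "<CreaDate>20141204</CreaDate>"
          "<CreaTime>15443700</CreaTime>"
          "</Esri></metadata>")
print(published_on(sample))  # 2014-12-04 15:44:37
```

For a file on disk, the text can be read first with open(path).read() and passed to the same function.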

Inside the extracted/v101 folder, you will find the result file and the toolbox you worked with when publishing the GP service. Here you will also find a folder that is named after the folder where your source Python file was stored and that contains the source Python file.

Best practices to organize the Python code and files?

Let’s look inside the Python file. You might have noticed that during publishing some of the variables you declared were renamed to g_ESRI_variable_%id%. The rule of thumb is to avoid inline strings: assign paths to datasets and names to variables instead. You don’t have to do this, since Esri will update those inline values for you, but code with those generated variable names is much harder to refactor, so it is better to organize your code properly from the beginning.

When you run the script tool in ArcGIS, the scratch geodatabase is located at C:\Users\%user%\AppData\Local\Temp\scratch.gdb. After publishing, however, the service gets a new scratch geodatabase. If you need to inspect the intermediate data, go to the scratch geodatabase (its path can be retrieved with arcpy.env.scratchGDB); it is a new file geodatabase for each run of the GP service, with the following notation: c:\arcgisserver\directories\arcgisjobs\%service%_gpserver\%jobid%\scratch\scratch.gdb.

Keep in mind that a GP service will always use its local server jobs folder for writing intermediate data, and this behavior cannot be changed. Having the service write to the scratch workspace is actually a lot safer than writing to a designated location on disk, because there is no chance of multiple GP service instances trying to write to the same location at the same time, which can result in deadlocks and concurrency issues. Remember that each submitted GP job is assigned a new unique scratch folder and geodatabase in the arcgisjobs folder.

Make sure you don’t use arcpy.env.workspace in your code; always declare a path variable and assign it the folder or geodatabase connection. For dataset paths, use os.path.join() instead of concatenating strings. For performance reasons, use the in_memory workspace for intermediate data with the following notation:

import os
var = os.path.join("in_memory", "FeatureClassName")

You can take advantage of the in_memory workspace, but for troubleshooting purposes it might be better to write something to disk to inspect later on. In this case, it is handy to create a variable called something like gpTempPlace which you can set to either “in_memory” or a local file geodatabase, depending on whether you run clean code in production or troubleshoot the service in the staging environment.
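A minimal sketch of such a switch follows. Only the name gpTempPlace comes from the text above; the function intermediate_path, the troubleshooting flag and the path to the debug geodatabase are assumptions made for illustration.

```python
import os


def intermediate_path(name, troubleshooting=False,
                      debug_gdb=r"C:\GIS\Temp\debug.gdb"):
    """Build the path for an intermediate dataset.

    Returns an in_memory path for production runs and a path inside a
    file geodatabase on disk (hypothetical location) when troubleshooting.
    """
    gpTempPlace = debug_gdb if troubleshooting else "in_memory"
    return os.path.join(gpTempPlace, name)


# Production: intermediate data stays in memory.
buffer_path = intermediate_path("ParcelsBuffer")

# Staging: the same dataset is written to disk for later inspection.
debug_path = intermediate_path("ParcelsBuffer", troubleshooting=True)
```

Because every intermediate dataset path goes through one function, switching between the two modes is a one-line change.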

Make sure you don’t use the same name for a variable and a feature class or field. This sometimes leads to unexpected results when running the service. It might be helpful to add “_fc” to the end of feature class variables and “_field” to the end of field variables. This way, you will also be able to distinguish them much more easily. The same applies to the names of feature classes and feature layers (created with the Make Feature Layer GP tool).

Remember that for troubleshooting purposes you can adjust the logging level of ArcGIS Server (in Manager) or of the GP service only (in the Service Editor window). It is often useful to set the Message Level setting to Info before publishing the GP service into production, because this will give you detailed information about what exactly went wrong when running the GP service. You can access this information either from the Results window in ArcMap or from the ArcGIS Server logs in Manager.

How do I update the published service?

A special word should go to those users who need to publish GP services often while making changes in the code. It is important to understand that after publishing the GP service, the copied Python code file and toolbox don’t maintain any connection to the source Python file and the source toolbox you authored. This implies that after making edits to the script tool, you need to push those changes to the published service on the server.

There are two types of changes you can make in your source project: changes to the tool parameters in the script tool properties and changes to the Python code. Keep in mind that you cannot edit the published toolbox, so if you have added a new parameter or modified the data type of an existing parameter, you need to republish the service. However, if you have only modified the Python source code, there is no need to republish the whole service, as you only need to replace the contents of the Python file.

Even though you can automate service publishing workflow with Python, it still takes time to move the toolbox and the Python code files. Therefore, you can save a lot of time by finding a way to replace the Python code in the published Python code file. In order to update the Python code file, you have really just two options – you either copy and replace the published file with the updated source file or copy/paste the source code. This approach may work if you have a tiny Python script and all the paths to the data on your machine and on the server are the same. This can be a plausible solution when you have Desktop and Server on the same machine. If you have a configuration file where the Python source code file gets all the paths and dataset names from, you could also safely replace the published Python file. However, this is still an extra thing to do.

The best practice I came to while working for two years on the GP services is to split the code files and the tool itself. Let me explain.

How do I separate the toolbox logic and Python code?

Create a Python file (a caller script) which will contain all the import statements for the Python modules and your files. By appending the path to your Python files to sys.path, you will be able to import the Python files you are working on; this is very useful when your GP service consists not of one file but of multiple modules.

import sys
import socket
import arcpy  # needed for reading the script tool parameters
sys.path.append(r'\\' + socket.gethostname() + "path to code files")
import codefile1  # your Python file with business logic
import codefile2  # your Python file with business logic

This file should also include all the parameters which are exposed in the script tool.

Param1 = arcpy.GetParameterAsText(0)
Param2 = arcpy.GetParameterAsText(1)
Param3 = arcpy.GetParameterAsText(2)

Then you define a main function which will be executed when running the GP service. You call the functions defined within the files you imported.

def mainworkflow(Param1, Param2):
    Result = codefile1.functionName(Param1, Param2)
    return Result

It is also handy to add some parameter handling logic. When you run the script tool in ArcMap before publishing, the values you supply become the default values that users see when they execute the GP service from ArcMap or from any other custom interface. To avoid that, you can leave those parameters empty and return an empty output; this branch exists only so that the script tool can be run for publishing.

if Param1 == '' and Param2 == '':
    Result = ''  # empty run, used only for publishing the script tool
else:
    Result = mainworkflow(Param1, Param2)
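The same guard can be wrapped in a function so that it can be tried outside ArcGIS. In this sketch, run_tool is a hypothetical name and mainworkflow is only a stand-in for the call into codefile1; in the caller script the arguments would come from arcpy.GetParameterAsText.

```python
def mainworkflow(param1, param2):
    """Stand-in for codefile1.functionName(param1, param2)."""
    return "{0}-{1}".format(param1, param2)


def run_tool(param1, param2):
    """Return an empty result for the publishing run, else do the work."""
    if param1 == "" and param2 == "":
        # Empty parameters: the tool is being run only to publish it.
        return ""
    return mainworkflow(param1, param2)


print(run_tool("", ""))    # prints an empty line
print(run_tool("a", "b"))  # a-b
```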

Create a script tool from this caller Python file, defining the parameters and their data types. After the tool is published as a GP service, you can work with the Python files which contain only the code that actually does the job.

After making and saving changes in the code, feel free to run the GP service directly: the caller Python file (published as a GP service) will import codefile1 from the folder you specified and run the code. There is no need to restart the GP service or re-import your updated module.