ESRI ArcGIS 9.3 version has support for one more database management system (DBMS) for storing our spatial datasets. In addition to the ability of using Microsoft SQL Server, Oracle, IBM DB2 and Informix, PostgreSQL can be used now as well. In this post, I would like to give a brief review of this DBMS and outline some results of its integration with ESRI ArcGIS family products.
PostgreSQL is an open source DBMS and it was developed by an online community. This product is licensed under BSD license (which makes it basically free and moves very close to public domain use). This DBMS conforms to SQL 92/99 standards (see details for database language grammar). PostgreSQL can be treated as an adequate alternative for most used commercial DBMSs; it provides support for user defined data types (UDT), indexing, using stored procedures (OK, there are no really stored procedures there, but kind of), views, indices and many other things we used to use in everyday work with databases. Furthermore, the DBMS has client library interface in several programming languages including, but not limited to, C++, Java, and Python.
When I am talking about using a DBMS with ESRI GIS solutions, I mean mostly using a DBMS for managing geographical information and having an interface that could provide an effective means for data storing, indexing, and retrieving. As to ESRI solutions, ArcSDE provides such a gateway to a DBMS and is used for loading and extracting spatial data from DBMS. Thus, multiple questions are raised. Is it easy to install and administer the DBMS software? Does it work relatively fast comparing to other commercial DBMS? How good is the product documentation? Well, let’s try to answer at least some of those questions.
Requirements and installation of PostgreSQL
The full list of requirements is provided at ESRI Wiki.
The process of installation is very simple, straightforward, wizard-based, and well-documented. A detailed step-by-step guide is here, at the ESRI support knowledge base article #35128.
Most of these steps are given in the ArcGIS Server Help as well.
PostgreSQL distribution files are shipped with ArcSDE binary files, thus no any additional steps for downloading any PostgreSQL files are required. Instructions on PostgreSQL and ArcSDE configuration are shipped with binary files at <ArcSDE installation folder>\windows\documentation_server\ARCSDE93POSTGRESQLINSTALLGUIDES\install _gd_postgresql.htm.
Using a spatial database
After performing installation, post-installation, and PostgreSQL configuration, one is ready to use ArcCatalog application or PostgreSQL administrative application (pgAdmin III) in order to start creating or loading data into a spatial database. In general, not many significant differences can be noticed while working with SDE database stored in PostgreSQL as to Oracle or SQL Server. Both three-tier (sde service) and two-tier architecture (direct connection) can be used to establish a connection with a database (however, a direct connection has been the recommended way to go for a while by ESRI).
Using PostGIS extension
PostGIS is an optional spatial database extension for PostgreSQL that is not installed during PostgreSQL installation and should be installed and enabled independently via Application Stack Builder. PostGIS provides support for geographic objects storage into the PostgreSQL object-relational database. Thus, PostGIS, basically, “spatially enables” the PostgreSQL to be used as a spatial database for ESRI ArcSDE platform by using a PG_GEOMETRY format. Similar solutions are provided by Oracle (Oracle Spatial), IBM DB2 (Spatial Extender), Informix (Spatial DataBlade Module), and Microsoft SQL Server 2008 (out-of-the-box GEOMETRY and GEOGRAPHY data types). PostGIS complies with the OpenGIS “Simple Features Specification for SQL” and has been certified as compliant with the “Types and Functions“. PostGIS installation can be found and downloaded from the Internet and is free as well. For details, see ESRI Support knowledge base article 36389.
In a word, what PostGIS gives us, is that it makes it possible to store geometry of features by using another format, PG_GEOMETRY. Nevertheless, we can use spatial SQL functions with ST_GEOMETRY as well as PG_GEOMETRY types to manipulate features from DBMS to any other application, much like how we can do that in Oracle. The list of available SQL functions can be found here: http://webhelp.esri.com/arcgisserver/9.3.1/dotNet/geodatabases/an_overv315896832.htm. That is, users are able to perform SQL queries and operations on spatial data directly within the PostgreSQL.
Inside the PostgreSQL, the interface is very simple and intuitive, looking a bit similar to SQL Server Management Studio. Product documentation is very good and the help system is available both from the desktop installation and from the web (there is a help with user comments as well)
Regarding performance of data retrieving, just a few tests have been done. Undoubtedly, the level of performance will vary greatly among different machines depending upon hardware resources, spatial datasets complexity, and finally, performance assessment tools. In our case, the assessment of the time required for retrieving and drawing features from a PostgreSQL database of PostGIS spatial type and SQL Server 2008 Geometry type has been done. For this purpose, an ESRI mxdperfstat utility has been used.
To my knowledge, this free utility is the most appropriate ArcGIS-compatible solution for map rendering performance evaluation (apart from Geodatabase Toolset (GDBT) for ArcCatalog, which will be available soon). This application from ESRI Testing Service is widely used to help diagnose typical performance problems with ArcGIS map documents such as:
- Inefficient scale dependency options use
- Slow symbology rendering
- Using projection on-the-fly
- Potential database tuning needs.
Two MXD map documents were created and each of them included the same set of feature classes from two different databases, from a PostgreSQL 8.3 of PostGIS data type and SQL Server 2008 (PG_GEOMETRY and Geometry data types, respectively). As to my tests, the speed of retrieving and drawing features from those two databases were different. I have used a direct connection to both PostgreSQL database and SQL Server 2008 database.
SQL Server feature classes were retrieved and drawn faster on smaller scales (from 1:1,000,000 and smaller). However, on large scales, 1:500,000 and larger, feature classes with PostGIS geometry are retrieved significantly faster. As I did not set any scale dependencies, i.e., in smaller scales more features are drawn, since we cover a bigger geographical area. The problem could be with the indices in SQL Server 2008, which are recommended to be changed by ESRI. In order to understand whether the tuning of ArcSDE configuration parameters, PostgreSQL database tuning, and SQL Server 2008 indices tuning can influence the performance of PostgreSQL, more detailed research should be done.
Conclusion and future prospects
With the version ArcGIS Server 9.3, users are now provided with additional support for PostgreSQL and PostGIS framework. This allows users to access multiple types of geographic datasets from virtually any commercial or free software programs (you could check zigGIS if you have PostGIS data and ArcGIS Desktop, but ArcSDE is not accessible). The ability of using PostgreSQL can be even more alluring given that in the ArcGIS 9.3 Data Interoperability extension direct read, data import and data export options for PostgreSQL and PostGIS were added. Finally, as proposed, at ArcGIS 10, ESRI will release an open API for the geodatabase, thus further enabling non-ESRI software to interoperate with the ArcGIS platform smoothly. In the release 10, which seems to be very promising, ArcGIS will introduce a new functionality that is called “query layers”. This option will allow users to access data of different spatial types stored in a database that is not an ESRI geodatabases directly. That will be done in order to address many ESRI users’ needs to support direct access to the spatial types within a relational database (i.e. SQL 2008, Oracle Spatial, etc) that are not geodatabases per se (i.e. does not utilize ESRI ArcSDE technology). As ESRI states, these so called “query layers” “will allow users to use SQL queries to access spatial type data and create a new (read-only) layer in ArcMap. This layer will allow users to map, query, and analyze data from spatially enabled databases.” Users will be able to use PostgreSQL without registering the spatial information in an ESRI geodatabase or installing and configuring ArcSDE. Moreover, geodatabase users who store their spatial information in spatial types can use this functionality to work with their data using complex SQL queries. All of those arguments make it worthwhile to take a closer look at PostgreSQL and its possible applications to ESRI solutions market.
Some blog posts worth checking out