Using conda for geospatial Python development

Over the summer, I got a chance to learn more about various Python packages that are useful to anyone working with geospatial data or GIS. ArcGIS users can benefit from them too, as the GIS data accessed with ArcGIS Desktop (such as file geodatabases) can be converted to open formats (such as GeoJSON or shapefiles) and used outside of the ArcGIS ecosystem.

To make managing Python packages easier, there is a tool called Conda. It is free to download and is a great way to manage not only your Python packages (which you have probably done earlier using pip), but also your Python environments (which you have probably done earlier using virtualenv, or just by re-installing ArcGIS because you broke something in the base Python installation). Let me elaborate on that.

Say you have a fresh installation of ArcGIS Desktop 10.4, which sets up Python 2.7.10 on your system. If you would like to install a third-party Python package, such as networkx, you would install it into the Python installation used by ArcGIS Desktop. Now you can access both arcpy and networkx from your favorite Python IDE. But what if you need a Python package that depends on a version of numpy different from the one arcpy depends on? You cannot have both versions installed side by side in one Python installation. A tool called virtualenv solved this problem by isolating installed packages in separate environments, but you still had to manage your Python installations on your own.

Conda provides similar functionality, making it possible to create isolated environments that include both a Python installation and packages. Because you control the Python installation, you can have multiple versions of Python (such as 2.7 and 3.5) on the same system and switch between them as needed. This helps not only users who want to use existing Python packages that require different environments, but also package developers, who can easily test how their packages behave when installed in various environments.
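After switching environments, it helps to confirm which interpreter is actually running. The following is a minimal sketch using only the standard library; the function name describe_interpreter is made up for this illustration.

```python
from __future__ import print_function

import sys


def describe_interpreter():
    """Return basic facts about the Python interpreter running this code."""
    return {
        # Full path to the python executable of the active environment.
        "executable": sys.executable,
        # Root folder of the active environment (for a conda environment
        # this is normally the environment's own folder).
        "prefix": sys.prefix,
        # Version as a string such as "2.7.10" or "3.5.2".
        "version": "%d.%d.%d" % sys.version_info[:3],
    }


if __name__ == "__main__":
    for key, value in sorted(describe_interpreter().items()):
        print("%s: %s" % (key, value))
```

Running this once in each environment should show a different executable path and, where the environments differ, a different Python version.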

I wanted to test Conda to see how easily GIS-related packages can be installed in a fresh Python 2.7.10 environment. This is the workflow I went through:

1. Download and install Anaconda (which includes the Conda manager plus many Python packages). As I have ArcGIS installed on the same machine, I unchecked the check boxes to (1) make Anaconda the default Python and (2) add Anaconda’s Python to the Windows PATH.
2. To learn how to get started with Anaconda, go through this 30-minute test drive.
3. Keep the conda cheat sheet open.
4. I created a Python 2.7.10 environment with the following packages:

  • fiona
  • geopy
  • descartes
  • shapely
  • cartopy
  • pysal
  • pyproj
  • basemap
  • geopandas
  • bokeh
  • vincent
  • folium
  • scipy
  • networkx

4.1. Create environment:

conda create -n spatialbox python=2.7.10

4.2. Install fiona (with the spatialbox environment activated, otherwise the package goes into the root environment):

conda install fiona

4.3. Install packages listed in a text file:

conda install -c conda-forge --file custom_packages.txt

custom_packages.txt contents:

geopy
descartes
shapely
cartopy
pysal
pyproj
basemap
bokeh
folium
seaborn
networkx
ipykernel

5. Lastly, I ran:

conda install -c conda-forge geopandas
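Steps 4.1 to 5 can also be scripted. The sketch below only builds and prints the command lines; it does not run conda. The function name build_conda_commands is hypothetical, and the --yes (skip confirmation prompts) and -n (target a named environment without activating it) options are additions that do not appear in the commands above. Note that the --file option is written with two plain hyphens.

```python
from __future__ import print_function


def build_conda_commands(env_name, python_version, packages_file):
    """Return the conda commands of steps 4.1 to 5 as argument lists."""
    install = ["conda", "install", "--yes", "-n", env_name]
    return [
        # 4.1 Create the environment with the requested Python version.
        ["conda", "create", "--yes", "-n", env_name,
         "python=" + python_version],
        # 4.2 Install fiona from the default channel.
        install + ["fiona"],
        # 4.3 Install everything listed in the text file from conda-forge.
        install + ["-c", "conda-forge", "--file", packages_file],
        # 5. Install geopandas from conda-forge last.
        install + ["-c", "conda-forge", "geopandas"],
    ]


if __name__ == "__main__":
    commands = build_conda_commands(
        "spatialbox", "2.7.10", "custom_packages.txt")
    for command in commands:
        # Printing only; pass each list to subprocess.check_call to run it.
        print(" ".join(command))
```

Keeping each command as a list of arguments avoids quoting problems if the environment name or file path ever contains spaces.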

After installing these packages, you will likely want to check that all of them have been installed properly. I have shared a Jupyter notebook with Python code that makes simple calls to each of these packages so you can see that they work.
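For a quicker check that does not need the notebook, a short script can simply try to import each package. This is a minimal sketch and not the notebook mentioned above; the function name check_imports is made up here, the module list covers fiona, the entries of custom_packages.txt and geopandas, and the import name used for basemap is an assumption.

```python
from __future__ import print_function

import importlib

# Import names for the packages installed above. Most match the package
# name; basemap is assumed to be importable as mpl_toolkits.basemap.
GEO_MODULES = [
    "fiona", "geopy", "descartes", "shapely", "cartopy", "pysal", "pyproj",
    "mpl_toolkits.basemap", "geopandas", "bokeh", "folium", "seaborn",
    "networkx", "ipykernel",
]


def check_imports(module_names):
    """Try to import each module; map its name to True or False."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except Exception:
            # ImportError is the usual failure, but a broken binary
            # dependency can raise other errors at import time.
            results[name] = False
    return results


if __name__ == "__main__":
    for name, ok in sorted(check_imports(GEO_MODULES).items()):
        print("%-22s %s" % (name, "OK" if ok else "FAILED"))
```

A successful import only shows that the package and its compiled dependencies load; it does not exercise the package the way real calls would.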

This suite contains practically everything you might need when dealing with geospatial data. As there is a lot to cover for every individual package, I decided to write separate posts on each of them.

Now say you would like to use the arcpy package along with the geopy package. As these two packages are installed in separate Python environments, you have to tell each Python installation where to look for packages in addition to its core site-packages folder. This is where the instructions written by Curtis Price on how to get to arcpy from Anaconda and vice versa are very helpful. Please refer to step 3 of this document. Curtis’ contribution on this topic has been invaluable to me. Another blog post also walks through the installation of Anaconda and the environments.

Now you can create a Python project in your Python IDE and specify your newly created environment, with all the geospatial packages, as the project’s Python interpreter. Because you have specified in the .pth file of your environment that Python has to look inside the site-packages of the ArcGIS Python folder, you will be able to import arcpy.
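A .pth file is a plain text file placed in a site-packages folder; at startup Python appends each existing directory listed in it, one per line, to sys.path. The sketch below writes such a file. It is an illustration rather than a reproduction of the linked instructions: the function name write_pth_file and the file name arcgis.pth are hypothetical, the ArcGIS path is an assumed default location, and a temporary folder stands in for the environment's site-packages so the code runs anywhere.

```python
from __future__ import print_function

import os
import tempfile


def write_pth_file(site_packages_dir, pth_name, extra_dirs):
    """Write a .pth file listing extra_dirs, one per line; return its path."""
    pth_path = os.path.join(site_packages_dir, pth_name)
    with open(pth_path, "w") as handle:
        for directory in extra_dirs:
            handle.write(directory + "\n")
    return pth_path


if __name__ == "__main__":
    # Stand-in folder so the sketch runs anywhere; in practice this would be
    # the site-packages folder of the conda environment.
    env_site_packages = tempfile.mkdtemp()
    # Assumed location of the ArcGIS Desktop 10.4 Python packages; check the
    # actual path on your machine.
    arcgis_site_packages = r"C:\Python27\ArcGIS10.4\Lib\site-packages"
    print(write_pth_file(
        env_site_packages, "arcgis.pth", [arcgis_site_packages]))
```

After the real file is in place, printing sys.path from the environment's interpreter is a simple way to confirm that the ArcGIS folder was picked up.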

If installing a separate Anaconda sounds like too much, consider an alternative. The ArcGIS Pro 1.3 installation comes with Conda, which means you can use Conda right after installing ArcGIS Pro. Learn more about Conda in ArcGIS Pro. There is a great video from the Dev Summit on how Conda works within the ArcGIS Pro installation. The talk’s PDF is available here.

If you are an R user, you can also install R and its packages with conda.

Give conda a try and I am sure you will love it.

One thought on “Using conda for geospatial Python development”

  1. Great post!

    I’m getting an error entering ‘conda install -c conda-forge –file custom_packages.txt’

    Is the text file in some sort of repository? Can I install these modules separately?
