Geospatial careers: list of companies

I have published a curated list of companies in the geospatial industry which you can use when looking for a job. The list is organized by country and, to some extent, by industry domain. I compiled it over many years while I was part of the GIS industry.

https://github.com/AlexArcPy/geospatial-careers

Each company is documented in the format [company website]: very brief description of what they do. Obviously quite a few companies work with a variety of things, so they can be hard to categorize. A company may have jobs for GIS developers, digital mappers, photogrammetry experts, machine learning professionals, and so forth. I suggest exploring their home page to learn more about them. It is also possible that a company did just one thing when this list was published, but has since expanded and now advertises for other job titles, too.

A company listed under a certain country section may have offices in other countries, too. However, keep in mind that they may hire for geospatial positions only in a particular country or office. The company descriptions are not comprehensive: a company may do many other things apart from geospatial operations, but I won’t mention them as they are irrelevant in this context. Some companies may permit working remotely. Again, please explore the company website to double-check. Due to the dynamic nature of the Internet, if a URL is broken, just use a web search engine to find the company’s website.

This page won’t be updated on a regular basis, so it is pretty static. If you know a company in the geospatial sector, by all means, please submit a pull request so we can expand this list. The geospatial industry is fairly small, so I thought sharing this list with the community would benefit both the companies looking for talent and peer professionals looking for a job.

Good luck with job hunting!


Leaving the GIS industry: bye for now

After working for more than 10 years in the geospatial industry, I have decided to change fields and focus on pure software engineering. I have quite enjoyed using GIS tools to solve various computational problems and software development tools to build useful GIS products and services for my customers. However, as my interests started shifting towards clean code and programming per se – I noticed I was reading Refactoring by Martin Fowler in bed far more often than The ESRI Guide to GIS Analysis by Andy Mitchell – I thought it would be worthwhile to try changing my career path.

As I won’t be using any GIS software any longer, I won’t be able to post any new practical material that would be useful to GIS analysts and developers. However, in this post I would like to share some last thoughts which could be of interest to peer GIS professionals.

This blog becomes an archive of hopefully useful resources. Good luck!

Skills

SQL

I cannot stress enough how important it is for anyone using GIS data to master SQL. It is an integral part of nearly any GIS processing workflow, and being able to process or manage data stored in a DBMS using SQL is crucial. As GIS datasets grow in size, ad hoc scripting for data processing won’t suffice, as there are still few spatial data processing packages that can make the process consistent, fast, and reliable. Using PostGIS or SQL Server native spatial types can often get you farther than any open source Python package. Don’t stop mastering SQL after learning the basics as there is so much more.
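To illustrate the point about pushing work into the database, here is a minimal sketch using Python’s built-in sqlite3 module as a stand-in for a real DBMS (the table and values are made up; in PostGIS you would additionally have spatial types and functions such as ST_Area at your disposal):

```python
import sqlite3

# In-memory database standing in for a real DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parcels (id INTEGER PRIMARY KEY, land_use TEXT, area REAL)")
conn.executemany(
    "INSERT INTO parcels (land_use, area) VALUES (?, ?)",
    [("forest", 120.5), ("forest", 80.0), ("urban", 45.25), ("urban", 30.25)],
)

# Push the grouping and aggregation into SQL instead of fetching
# every row into Python and looping over it.
rows = conn.execute(
    "SELECT land_use, COUNT(*), SUM(area) FROM parcels "
    "GROUP BY land_use ORDER BY land_use"
).fetchall()
for land_use, count, total in rows:
    print(land_use, count, total)
```

The same GROUP BY pattern scales from a toy table to millions of rows, which is exactly where doing the work in the DBMS pays off.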

I have two massive posts about using SQL functions for GIS:

Math

Don’t worry too much about studying math. I am not sure why many people I have spoken to think that in order to be good at GIS one has to be good at math. Being able to operate with high school algebra and geometry (the angle between intersecting roads, the area of a lake, the percentage of a parcel covered by forest) is definitely useful, but there is no need to study more sophisticated math. For casual GIS analyst work, having the basics covered will be more than enough.

Of course, if you are building a new routing algorithm that will be more efficient than A*, then you will have to review a lot of graph theory material, and if you are building a new Voronoi diagram generation procedure, you will find yourself reading computational geometry books. If you are doing a lot of GIS analysis with spatial statistics, then you should definitely get the amazing book Elementary Statistics for Geographers, Third Edition.

If you do a lot of computationally intensive programming and would like to catch up on math specifically for GIS, review

Testing in GIS

Review these two posts:

Python

Learn Python. It is the most widely used programming language in the GIS industry and its usage will only expand. I have written A progression path for GIS analyst which could be used as a development road map. You should be very comfortable using Python; having Python skills will let you have a stronger influence on operations within your organization and potentially automate more manual workflows, leading to a better workplace.

Linux and bash

Learn Linux and bash. I think I should have started using Linux earlier. There are a few (1, 2) ready-to-use VirtualBox images with a ton of open-source GIS software installed, configured, and ready to use. Using those machines will save you a lot of time. Learning bash is extremely helpful because it lets you be much more productive chaining small commands and building data processing pipelines than you typically would be on Windows using a programming language. Obviously bash, Linux, and Python are part of an industry-agnostic skill set you could benefit from at any later point in time.

Readings

There are so many excellent GIS books that I would like to recommend. You can find the most popular titles online. What I’d like to do instead is share some of the hidden gems I have discovered and really enjoyed reviewing. You can find those in the post Useful resources in computer science/math for GIS Analysts.

Programming

Ad-hoc mentality is very difficult to fight. It is 7 pm. You have a job you have to get done before releasing the data to a customer tomorrow morning. You are adding a missing domain to a geodatabase that your colleague’s Python script failed to add. Then you are changing a data type for a field because you have just realized that you need to store text instead of numbers. And… you find a few other things you are so glad to have spotted before the release. You fix them, zip the database, and upload it onto an FTP site. It is 10 pm, you are tired but happy and are heading off to bed.

Success! … Or is it? First thing the next morning you want to document the manual changes you introduced yesterday, but you are dragged into some other urgent job… and you never do. A week later, a customer sends an email telling you she is not able to run her in-house tools using the database you have prepared, but the one you prepared a month ago works. Now it is 9 pm again and you are writing some odd-looking script trying to compare the databases and recall what you did that evening… You are in a mess.

Doing what you have done may look natural because you just want to get stuff done. However, I want you to look at this from another perspective. You want your steps to be reproducible. You want to be able to track the changes you have made. Not only you, but any colleague of yours should be able to pick up the updates that have been made to any piece of data or a script. So resist the urge to just get stuff done, pace yourself, and track your work with one of the following methods.

Documenting manually

If you are not comfortable with programming or scripting at all, you should document each step you take while making modifications to a dataset. At least then you can see what has been done in written form. I cannot stress this enough – document as you go, not after the fact. So write down what you have done after each change operation, not after you have finished all the work. This is how it can look:

  1. You add field Area of Double type to the table dbo.Parcels.
  2. You write: “Added field Area of Double type to the table dbo.Parcels.”
  3. You drop field OldArea of Double type in the table dbo.Parcels.
  4. You write: “Dropped field OldArea of Double type in the table dbo.Parcels.”

One of the disadvantages of this approach is that the changes can get out of sync with the documentation. You could make an error documenting a data type or a field name. Another thing is that the very same step can be done in many ways – what if you add a field to a database using some GIS application while a colleague of yours uses a DBMS command line tool? Documenting the exact procedure of making changes soon becomes tedious, and you end up with tons of instructions that easily become misleading or plain obsolete. However, with sufficiently rigid discipline, it is still possible to maintain a decent level of change tracking.
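Even the manual approach can be made a bit more reliable by writing each entry down with a timestamp the moment the change is made. A tiny sketch of such a helper (the log file path and the wording of the entries are illustrative):

```python
from datetime import datetime

LOG_PATH = "changes.log"  # hypothetical location of the change log

def log_change(description):
    """Append a timestamped entry right after making a change."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    with open(LOG_PATH, "a") as log:
        log.write("{0}  {1}\n".format(stamp, description))

log_change("Added field Area of Double type to the table dbo.Parcels.")
log_change("Dropped field OldArea of Double type in the table dbo.Parcels.")
```

Because every entry is stamped as it happens, the log doubles as a chronological record you can hand to a colleague.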

Simply programming with VCS

Another approach is to write a program that makes the changes. When you write code, you don’t need to document what you are doing, because a reader familiar with the syntax of the programming language will understand what happens. You can of course add comments explaining why adding certain fields is required. So, if you are building a database with a few tables, you can write a SQL script that can be re-run to recreate your database at any point in time. If you never make any manual changes to a database and only write and keep SQL commands, your main SQL data compilation script will never get out of sync.
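A minimal sketch of such a re-runnable build script, again using sqlite3 as a stand-in DBMS (table and column names are made up):

```python
import sqlite3

# A re-runnable schema script: executing it twice leaves the same result,
# so the file you keep under version control stays the single source of truth.
BUILD_SCRIPT = """
DROP TABLE IF EXISTS parcels;
CREATE TABLE parcels (
    id INTEGER PRIMARY KEY,
    land_use TEXT NOT NULL,
    area REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(BUILD_SCRIPT)
conn.executescript(BUILD_SCRIPT)  # safe to re-run at any point in time

columns = [row[1] for row in conn.execute("PRAGMA table_info(parcels)")]
print(columns)
```

The DROP TABLE IF EXISTS guard is what makes the script idempotent: running it once or ten times produces the same database.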

This leads us to the concept of version tracking, where it is possible to see how your SQL script has changed since the last version. Who is not guilty of having, at some point in their career, a dozen files with scripts named “production_final_compilation_truly_final_12.sql“? To avoid this mess, you should really use a VCS.

The main argument against this approach is that setting up version control tools looks like overkill for someone doing simple GIS work. However, you will see how much safer your work will be in the long run; it will pay off very soon. Invest some time in learning about a VCS such as Git for managing source code. All major players – BitBucket, GitLab, and GitHub – provide free private repositories. Find out whether there is a VCS solution deployed in-house within your organization, such as Microsoft TFS, which you could use to check in the code. Should you like to dive deeper into Git, read the Pro Git book for free online. If you are not comfortable putting anything into the cloud (which is just someone else’s computer), use Git locally on your machine or on a local server where you can securely check in your code, and ask your system administrator to take backups of those repositories.

Open source vs proprietary software

Throughout your GIS career, you will most likely be exposed to both proprietary and open source software. You may have a Windows machine with QGIS, or a Linux machine with Esri ArcGIS Server. It would be naive to think that either of these technologies is superior to the other. You should be able to get the job done with whatever tools you have available, because you will not always be able to decide what your employer will be using.

I suggest instead being comfortable with both and widening your toolset as much as possible. As you become exposed to different tools, you will soon realize that commercial software can be much better for certain jobs than open-source or free alternatives, and vice versa. For instance, certain types of spatial joins can run faster in ArcGIS Desktop than in PostGIS, but some GDAL-based raster masking may outperform ArcGIS Spatial Analyst tools. Creating a map layout with data driven pages is a pleasure in ArcMap, but can be tedious in QGIS. Always benchmark to understand what tools work best and document your findings. Keep in mind that the very same tool can take 1 second to process 1,000 features and 5 minutes to process 10,000 features. Review the Big O notation briefly to avoid surprises.
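The benchmarking habit is easy to build with the standard library alone. A small sketch using timeit, where the same membership test is timed against two container types to show that the algorithm behind a tool, not just the data size, drives the running time (the sizes here are arbitrary):

```python
import timeit

n = 10000
needle = n - 1
as_list = list(range(n))  # 'in' scans the whole list: O(n)
as_set = set(as_list)     # 'in' is a hash lookup: O(1) on average

list_time = timeit.timeit(lambda: needle in as_list, number=1000)
set_time = timeit.timeit(lambda: needle in as_set, number=1000)
print("list: {0:.4f}s, set: {1:.4f}s".format(list_time, set_time))
```

The same timing pattern works for comparing, say, two geoprocessing tools on growing feature counts: run each on 1,000, 10,000, and 100,000 features and watch how the numbers scale.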

I have always encouraged people using a particular tool to understand what it really does. Having a clear understanding of the underlying process will make it possible for you to extend an existing tool or write your own one. For instance, if you are applying some raster masking, you should understand what a matrix is. If you do a spatial join, you should understand how having a spatial index helps.

Always look to expand your toolset and be prepared to apply the tool you think is right for a particular job. A customer only has PostGIS and needs to select intersecting polygons? Use the native ST_Intersects function. Don’t have access to QGIS? Know which ArcToolbox tool does the same job. You have to process a file on a Linux machine you SSH into (so no Excel-like software)? Use bash or pandas to wrangle the data as needed. You shouldn’t be constrained by the environment you are in or the tools you have at your disposal. You should be able to get the job done no matter what.

Keeping up with the industry

I have been a user of GIS StackExchange since 2013 and have blogged about my experience and why it is useful to be active on a forum in the post 4 years with GIS at StackExchange. Make a habit of reading the week’s most popular questions, for instance every weekend. If you see a question you know the answer to, post it. It also helps to post an answer to a question you asked yourself, spent a week solving, and finally figured out. You will save a peer GIS professional some effort, and you can also find that answer later when you are doing a web search for the same issue in a few years’ time. If you have some time, you can review the most popular questions using the most voted questions option; there is so much to learn there.

Edit files in a mounted Linux directory in Windows

Sometimes it is very useful to be able to edit files stored on a Linux machine in a Windows application. This can be a handy setup when you want to store your source code on Linux to be able to execute it against a Linux Python interpreter but you would like to edit it in a rich GUI application such as PyCharm or Eclipse. To achieve this, you can use an open source framework that mounts a Linux directory as a Windows drive from which you can add your files to a PyCharm project.

Another use case is when an application you need to use is available under Windows only, but copying the files from Windows to Linux upon every edit is tedious.

To mount a Linux directory in Windows:

  1. Install https://www.microsoft.com/en-US/download/details.aspx?id=40784 (install x64).

  2. Install https://github.com/dokan-dev/dokany/releases/tag/v0.7.4 (if a message asks whether you want to download the VS runtime, just click Cancel – it would install x86 libraries you will not need).

  3. Download and run https://github.com/feo-cz/win-sshfs/releases/tag/1.5.12.8. It will be available in your tray.

  4. Mount a drive as described in the section Using Win-SSHFS to Mount Remote File Systems on Windows at https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh.

You can optionally choose to mount a Linux directory on Windows start up. Extremely handy.

Printing pretty tables with Python in ArcGIS

This post would be of interest to ArcGIS users authoring custom Python script tools who need to print out tables in the tool dialog box. You would also benefit from the following information if you need to print out some information in the Python window of ArcMap while doing some ad hoc data exploration.

Fairly often your only way to communicate the results of a tool’s execution is to print out a table that the user can look at. It is possible to create an Excel file using a Python package such as xlsxwriter, or by exporting an existing data structure such as a pandas data frame into an Excel or .csv file which the user could open. Keep in mind that it is possible to start Excel with the file open using the os.system command:

os.system('start excel.exe {0}'.format(excel_path))
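When no third-party packages are available, the standard csv module is enough to produce such a file. A small sketch (the rows and the file location are made up for illustration):

```python
import csv
import os
import tempfile

# Hypothetical output rows; csv_path is just an example location.
rows = [("Parcel", "Area"), ("A-101", 120.5), ("A-102", 80.25)]
csv_path = os.path.join(tempfile.gettempdir(), "report.csv")

with open(csv_path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

# On Windows you could then hand the file over to the associated
# application, e.g. os.startfile(csv_path) or the os.system call above.
print(csv_path)
```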

However, if you only need to print out some simple information into a table format within the dialog box of the running tool, you could construct such a table using built-in Python. This is particularly helpful in those cases where you cannot guarantee that the end user will have the 3rd party Python packages installed or where the output table is really small and it is not supposed to be analyzed or processed further.

However, as soon as you try to build something flexible with varying column widths, or when you don’t know beforehand what output columns and data the table will be printed with, it gets very tedious. You need to manipulate multiple strings and tuples, making sure everything draws properly.

In these cases, it is so much nicer to take advantage of external Python packages where all these concerns have already been taken care of. I have been using tabulate, but there are a few others, such as PrettyTable and texttable, both of which will generate a formatted text table using ASCII characters.

To give you a sense of the tabulate package, look at the code necessary to produce a nice table using the ugly formatted strings (the first part) and using the tabulate package (the second part):
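The code originally embedded in the post is not reproduced here; a minimal sketch of the two approaches, with made-up data and column widths, might look like this:

```python
rows = [("Parcels", 2050, 12.5), ("Roads", 370, 8.0)]
headers = ("Layer", "Count", "Area")

# 1) Built-in string formatting: you manage every column width by hand.
template = "{0:<10}{1:>8}{2:>8}"
print(template.format(*headers))
for row in rows:
    print(template.format(*row))

# 2) The tabulate package works out widths and borders for you.
try:
    from tabulate import tabulate
    print(tabulate(rows, headers=headers, tablefmt="grid"))
except ImportError:
    pass  # tabulate is a third-party package and may not be installed
```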

The output of the table produced using the built-in modules only:

[screenshot: table printed with built-in string formatting]

The output of the table produced using the tabulate module:

[screenshot: table printed with tabulate]

Using Python start up script for all Python interpreters

This post would be helpful for users of desktop GIS software such as ArcMap who need to use Python inside those applications.

There is a not so well known trick to trigger execution of a Python script before any Python interpreter on your system starts.

Note: If you are a QGIS user, there is a special way of achieving this. Please see the question Script that runs automatically from the QGIS Python Console when QGIS starts, using schedule tasks on Windows 10 for details.

The way to do this is to set up an environment variable called PYTHONSTARTUP in your operating system. You need to do two things:

  1. Create an environment variable that points to the path of a valid Python script file (.py) with the code that you would like to get executed before any Python interactive interpreter starts. Look at the question Installing pythonstartup file (https://stackoverflow.com/questions/5837259/installing-pythonstartup-file) for details.
  2. Write Python code that you would like to get executed.

A very important thing to consider is that

The file is executed in the same namespace where interactive commands are executed so that objects defined or imported in it can be used without qualification in the interactive session.

This means that you can do a bunch of imports and define multiple variables which will be available to you directly at the start up of your GIS application. This is very handy because I often need to import the os and sys modules as well as the arcpy.mapping module, and create an mxd variable pointing to the current map document I have open in ArcMap.

Here is the code of my startup Python script, which you can modify to suit your needs. If your workflow relies on having some data at hand, then you might want to expose more variables. I have ArcMap and ArcGIS Pro users in mind.

I have included in the example above a more specific workflow where you would like to be able to quickly execute SQL queries against an enterprise geodatabase (SDE). So, when ArcMap has started, you only need to create a conn variable pointing to a database connection file (.sde) and then use the sql() function to run your query. Thanks to the tabulate package, the list of lists you get back is drawn in a nice table format.
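The original gist is not reproduced here; a sketch of what such a startup script might look like follows. The sql() helper name and the connection path are illustrative, and the arcpy-specific parts only work inside an ArcGIS Python session (arcpy.mapping is the ArcMap API; ArcGIS Pro uses arcpy.mp instead):

```python
# startup.py - the file your PYTHONSTARTUP environment variable points to.
import os
import sys
from datetime import datetime

try:
    import arcpy
    import arcpy.mapping
    mxd = arcpy.mapping.MapDocument("CURRENT")  # the open map document
except ImportError:
    arcpy = None  # plain interpreter outside ArcGIS

def sql(query, conn_path=r"C:\connections\production.sde"):
    """Run a SQL query against an enterprise geodatabase connection."""
    if arcpy is None:
        raise RuntimeError("arcpy is required to query an SDE geodatabase")
    return arcpy.ArcSDESQLExecute(conn_path).execute(query)

print("Python startup script loaded at {0}".format(datetime.now().strftime("%H:%M")))
```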

[screenshot: running the sql() function in the ArcMap Python window]

Generating a database schema report using SchemaSpy

I recently wanted to generate a visual report over a database to see what tables, relationships, and other objects are present. I also wanted to see the schema of all the tables. The registrant Python package I’ve written gives me most of what I want, but it was not designed to fetch relational structure such as primary/foreign keys and constraints.

I found an excellent free command line tool, SchemaSpy. It requires Java, so that may be an extra thing to set up if you do not have it installed. The syntax for Microsoft SQL Server:

java.exe -jar .\schemaspy-6.0.0-rc2.jar -t mssql05 -dp .\sqljdbc42.jar -db non_spat -host localhost -port 1433 -u user -p password -o C:\Temp\sqlschema

You would need to download the Microsoft JDBC Driver for SQL Server if you work with MS SQL Server (I used version 6).

Python progression path for GIS professionals

Over the last years, I have been working with Python almost full time, either scripting desktop GIS workflows or developing code for back-end geoprocessing services using arcpy. I have learned all kinds of Python packages, everything from data science packages such as pandas and numpy to more widely applicable ones such as xlsxwriter and reportlab. Being able to find a package and start using it to produce the outputs needed in a matter of minutes is one of the key selling points of Python, I think.

However, due to the presence of such a large number of resources related to Python (just check the repository on GitHub – A curated list of awesome Python frameworks, libraries, software and resources), one might feel a bit lost. There are so many things to learn – which are the most important ones? It also makes things a bit more complicated for niche developers or GIS analysts who do Python programming only occasionally. I have also experienced frustration at being unable to identify the key competence areas to focus on and how to track my progress. Am I learning Python packages that are relevant for geospatial operations? What else should I learn after I’ve mastered a certain feature of the language or a framework?

The result of this thought process is a public repository on GitHub which I am working on. It’s called Progression path for a GIS analyst who wants to become proficient in using Python for GIS: from apprentice to guru, which is inspired partially by awesome-python and partially by the SO post Python progression path – From apprentice to guru.

This is an attempt to provide a structured collection of resources that could help a GIS professional to learn how to use Python when working with spatial data management, mapping, and analysis. The resources are organized by progress category so basically everyone should be able to learn something new along the way. The resources will include books, web pages and blog posts, online courses, videos, Q/A from GIS.SE, links to code snippets, and some bedtime readings.

Be sure to check this one out, pick a topic of interest and start working on it. Also, feel free to star the repository if you have a GitHub account 🙂