By: GIS Geography · Last Updated: February 9, 2020
Python Libraries for GIS and Mapping
Python libraries are the ultimate extension in GIS because they allow you to boost its core functionality.
By using Python libraries, you can break out of the mould that is GIS and dive into some serious data science.
Python’s standard library ships with 200+ modules, and there are thousands of third-party libraries on top of that. So, there’s practically no limit to how far you can take it.
Today, it’s all about Python libraries in GIS. Specifically, what are the most popular Python packages that GIS professionals use today? Let’s get started.
First, why even use Python libraries for GIS?
Have you ever noticed how your GIS software is missing that one capability you need? Because no GIS software can do it all, Python libraries can add that extra functionality.
Put simply, a Python library is code someone else has written to make life easier for the rest of us. Developers have written open libraries for machine learning, reporting, graphing and almost everything in Python.
If you want this extra functionality, you can leverage those libraries by importing them in your Python script. From here, you can call functions that aren’t natively part of your core GIS software.
PRO TIP: Use pip to install and manage your packages in Python
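Here is a minimal sketch of checking which libraries are ready to import before a script relies on them. The shortlist of names is only an assumption for illustration, so swap in whatever packages you actually need.

```python
import importlib.util

# Hypothetical shortlist of import names; adjust it to the packages you need.
CANDIDATES = ["numpy", "pandas", "geopandas", "pyproj", "re"]


def is_installed(module_name):
    """Return True if Python can locate the module, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Map each import name to whether it was found in this environment.
availability = {name: is_installed(name) for name in CANDIDATES}

for name, found in availability.items():
    hint = "ready to import" if found else f"missing, try: pip install {name}"
    print(f"{name}: {hint}")
```

Keep in mind that the name you import is not always the name you install. For example, scikit-learn is installed as scikit-learn but imported as sklearn.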
Python Libraries for GIS
If you’re going to build an all-star team of Python libraries for GIS, this would be it. They all help you go beyond the typical managing, analyzing and visualizing of spatial data that defines a geographic information system.
If you use Esri ArcGIS, then you’re probably familiar with the ArcPy library. ArcPy is meant for geoprocessing operations. But it’s not only for spatial analysis. It’s also for data conversion, management and map production with Esri ArcGIS.
Geopandas is like pandas meets GIS. But instead of straightforward tabular analysis, the geopandas library adds a geographic component. For geometry operations like overlays, geopandas uses Shapely, and it has traditionally used Fiona for reading and writing files. Both are Python libraries of their own.
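To make the overlay idea concrete, here is a small sketch with two made-up layers built in memory. The layer names, attributes and square geometries are assumptions for illustration only. In practice, you would load real data with gpd.read_file.

```python
import geopandas as gpd
from shapely.geometry import box

# Two made-up layers, each holding a single 2 x 2 square in a projected CRS.
parcels = gpd.GeoDataFrame(
    {"parcel": ["A"]}, geometry=[box(0, 0, 2, 2)], crs="EPSG:3857"
)
flood_zone = gpd.GeoDataFrame(
    {"zone": ["flood"]}, geometry=[box(1, 1, 3, 3)], crs="EPSG:3857"
)

# Intersection overlay: keeps only the area shared by both layers,
# along with the attribute columns from each one.
overlap = gpd.overlay(parcels, flood_zone, how="intersection")

# Area of the shared square, in the units of the CRS.
overlap_area = overlap.geometry.area.iloc[0]

print(overlap[["parcel", "zone"]])
print(overlap_area)
```

The result is a regular GeoDataFrame, so you can keep filtering, joining and summarizing it the same way you would in pandas.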
The GDAL/OGR library is used for translating between GIS file formats. QGIS, ArcGIS, ERDAS, ENVI, GRASS GIS and almost all other GIS software use it for translation in some way. At this time, GDAL/OGR supports 97 vector and 162 raster drivers.
The RSGISLib library is a set of remote sensing tools for raster processing and analysis. To name a few, it classifies, filters and performs statistics on imagery. My personal favorite is the module for object-based segmentation and classification (GEOBIA).
The main purpose of the PyProj library is working with spatial reference systems. It can project and transform coordinates between a range of coordinate reference systems. PyProj can also perform geodetic calculations, such as distances, for any given datum.
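Both jobs can be sketched in a few lines. This example assumes WGS84 longitude/latitude input, Web Mercator as the target projection, and approximate coordinates for Ottawa and Toronto, all chosen only for illustration.

```python
from pyproj import Geod, Transformer

# Approximate longitude/latitude pairs (illustrative values).
ottawa = (-75.6972, 45.4215)
toronto = (-79.3832, 43.6532)

# Reproject from WGS84 lon/lat (EPSG:4326) to Web Mercator (EPSG:3857).
# always_xy=True keeps the argument order as (longitude, latitude).
to_mercator = Transformer.from_crs("EPSG:4326", "EPSG:3857", always_xy=True)
x, y = to_mercator.transform(*ottawa)

# Geodetic calculation on the WGS84 ellipsoid.
# inv() returns the forward azimuth, back azimuth and distance in meters.
geod = Geod(ellps="WGS84")
_, _, distance_m = geod.inv(ottawa[0], ottawa[1], toronto[0], toronto[1])

print(f"Web Mercator: {x:.1f}, {y:.1f}")
print(f"Distance: {distance_m / 1000:.1f} km")
```

The distance here is measured along the ellipsoid rather than on the projected map, which matters because Web Mercator stretches distances as you move away from the equator.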
Python Libraries for Data Science
Data science extracts insights from data. It takes data and tries to make sense of it, such as by plotting it graphically or using machine learning. This list of Python libraries can do exactly this for you.
Numerical Python (the NumPy library) lets you put your attribute table in a structured array. Once it’s in a structured array, scientific computing gets much faster. One of the best things about it is how well it works with other Python libraries like SciPy for heavy statistical operations.
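As a rough illustration, here is what an attribute table looks like as a structured array. The city names, field names and values are made up for this sketch.

```python
import numpy as np

# A made-up attribute table: one record per city.
cities = np.array(
    [("Avon", 12000, 8.0), ("Brook", 45000, 30.0), ("Cedar", 3000, 1.5)],
    dtype=[("name", "U20"), ("population", "i8"), ("area_km2", "f8")],
)

# Accessing a field returns the whole column, so the math is vectorized.
density = cities["population"] / cities["area_km2"]

# A boolean mask works like an attribute query.
dense_names = cities["name"][density > 1600]

print(density)
print(dense_names)
```

Each field behaves like a column in the attribute table, and the calculation runs across every record at once instead of looping row by row.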
The Pandas library is immensely popular for data wrangling. It’s not only for statisticians. It’s incredibly useful in GIS too. Computational performance is key for pandas, and its success lies in the data frame. Data frames are optimized to work with big data, to the point that they can take on tables that Microsoft Excel wouldn’t even be able to handle.
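Here is a quick sketch of the kind of wrangling a data frame makes easy. The parcel table below, including its column names and values, is invented for illustration.

```python
import pandas as pd

# A made-up parcel table with a land use class and an area per record.
parcels = pd.DataFrame(
    {
        "land_use": ["residential", "commercial", "residential", "park"],
        "area_ha": [1.2, 0.8, 2.3, 5.0],
    }
)

# Summarize by land use: number of parcels and total area per class.
summary = parcels.groupby("land_use")["area_ha"].agg(["count", "sum"])

# Land use class with the largest total area.
largest = summary["sum"].idxmax()

print(summary)
print(largest)
```

This is the same idea as a summary statistics tool in desktop GIS, except it stays in your script and scales to much larger tables.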
When you’re working with thousands of data points, sometimes the best thing to do is plot it all out. Enter matplotlib. Statisticians use the matplotlib library for visual display. Matplotlib does it all. It plots graphs, charts and maps. Even with large datasets, it does a decent job.
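A minimal sketch of plotting point data might look like the following. The coordinates and elevation values are made up, and the Agg backend is an assumption so the script can run without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt

# Made-up sample points: longitude, latitude and an elevation value.
lons = [-75.7, -75.6, -75.5, -75.4]
lats = [45.40, 45.45, 45.42, 45.48]
elevation_m = [70, 95, 110, 88]

fig, ax = plt.subplots(figsize=(6, 4))

# Color each point by its elevation.
points = ax.scatter(lons, lats, c=elevation_m, cmap="viridis")
fig.colorbar(points, ax=ax, label="Elevation (m)")
ax.set_xlabel("Longitude")
ax.set_ylabel("Latitude")
ax.set_title("Sample points colored by elevation")

# Write the figure to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```

To save the figure to disk instead, pass a file path to fig.savefig in place of the buffer.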
Lately, machine learning has been all the buzz. And with good reason. Scikit-learn is a Python library that enables machine learning. It’s built on NumPy, SciPy and matplotlib. So, if you want to do any data mining, classification or ML prediction, the scikit-learn library is a decent choice.
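As one small example of what this looks like in practice, here is a sketch that groups point locations into clusters with k-means. The coordinates are made up, and k-means is just one of many algorithms in the library, picked here for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up point coordinates forming two well-separated groups.
coords = np.array(
    [
        [0.0, 0.0],
        [0.1, 0.2],
        [0.2, 0.1],
        [10.0, 10.0],
        [10.1, 9.9],
        [9.8, 10.2],
    ]
)

# Ask for two clusters. random_state makes the run repeatable.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(coords)

print(labels)
print(model.cluster_centers_)
```

The labels come back as one cluster number per point, so you can attach them to your attribute table as a new field.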
10 Re (regular expressions)
Regular expressions (the re module) are the ultimate filtering tool. When there’s a specific string you want to hunt down in a table, this is your go-to library. But you can take it a bit further by detecting, extracting and replacing text with pattern matching.
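Here is a short sketch of detecting, extracting and replacing with a pattern. The address strings are made up, and the pattern only assumes a Canadian-style postal code layout for illustration.

```python
import re

# Made-up attribute values, as they might appear in a messy address field.
records = [
    "123 Main St, Ottawa ON K1A 0B1",
    "45 Elm Ave, Toronto ON m5v2t6",
    "PO Box 9 (no postal code)",
]

# Letter-digit-letter, an optional space, then digit-letter-digit.
POSTAL = re.compile(r"\b([A-Za-z]\d[A-Za-z])\s?(\d[A-Za-z]\d)\b")


def normalize(text):
    """Extract the postal code as 'A1A 1A1', or None if there isn't one."""
    match = POSTAL.search(text)
    if match is None:
        return None
    return f"{match.group(1)} {match.group(2)}".upper()


# Detect and extract: one cleaned value (or None) per record.
postal_codes = [normalize(record) for record in records]

# Replace: mask the postal code in the first record.
masked = POSTAL.sub("[postal code]", records[0])

print(postal_codes)
print(masked)
```

The same approach works for any field with a predictable pattern, like parcel IDs, highway numbers or coordinate strings.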
ReportLab is one of the most satisfying libraries in this list. I say this because GIS often lacks sufficient reporting capabilities. If you want to create a report template, this is an especially fabulous option. I don’t know why the ReportLab library falls a bit off the radar because it shouldn’t.
PRO TIP: If you need a quick and dirty list of functions for Python libraries, check out DataCamp’s Cheat Sheets.
The Python Libraries All-Star Team
These are the Python libraries we thought were stand-outs for GIS and data science.
Now, it’s time to turn it to you.
If you could build an all-star team of Python libraries, who would you put on your team?
Please let us know with a comment below.