Recently we announced our BigQuery Tiler, which allows geospatial Big Data to be visualized in a matter of minutes. Since then, we’ve been amazed by the maps created by the community and by those who have signed up for the beta, including our new partners, Thinking Machines.
In light of the pandemic, it is very important for us to have a quick and scalable way of identifying vulnerable populations for pinpointed interventions. One example is making sure that the majority of the population can reach the health facilities they need. Now, more than ever, government and health groups need to be able to quickly identify where to allocate resources for the greatest possible impact.
Luckily, through the excellent work of the scientific and open-source geospatial community, we have global datasets that help inform this decision making. These cover many indicators, such as population, health facilities, and mobility, all of which help us quickly pinpoint these areas, even down to house-level granularity.
However, with large datasets comes an even larger problem: computing power. When you have to process gigabytes of information to work with these datasets, every processing step becomes a blocker, and something as simple as viewing data on a map becomes a huge endeavor. Huge processing tasks also mean paying to rent the compute capacity they require. Thus, for groups without the technical know-how around geospatial processing options, or without the resources, very granular countrywide geospatial analysis becomes difficult, costly, or both.
Understanding this problem, CARTO developed their BigQuery Tiler, a quick and easy tool to process, visualize, and then analyze large spatial datasets straight from BigQuery. Using this technology together with Thinking Machines’ datasets and geospatial processing expertise, we created a demo of how we can quickly identify healthcare gaps at scale.
Going back to the earlier problem: how do we identify high-impact locations for the construction of new health facilities? As a proof of concept, we use two very popular datasets: a population layer and a health facilities layer.
Our goal is to identify high concentrations of settlements that do not have access to a health facility within a certain distance. For the purpose of this blog post, we focus on the Philippines, Malaysia and Vietnam, which together cover almost 1 million square kilometers. The population layer alone has around 19.6M rows, and the health facilities layer has a bit over a million points of interest in total.
Using BigQuery Tiler, we’re able to load both datasets onto a map in almost no time at all, without having to worry about any ETL, loading times or cost!
Basically, BigQuery Tiler allows us to partition our very large datasets in BigQuery into vector tiles, which makes loading and visualization of datasets much more manageable for our web maps. What this means is that we can easily view the population and health facility data of an entire country without having to worry about the dataset size or scale.
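To make the idea of partitioning points into tiles more concrete, here is a minimal Python sketch. It assumes the widely used XYZ (Web Mercator) tiling scheme, which is an illustration of the general technique and not a description of how BigQuery Tiler works internally. The function names `lonlat_to_tile` and `group_by_tile` are hypothetical helpers written for this post.

```python
import math
from collections import defaultdict


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the XYZ Web Mercator tile containing a point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still fall inside a valid tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def group_by_tile(points, zoom):
    """Bucket (lon, lat) pairs by the tile they fall in at the given zoom."""
    tiles = defaultdict(list)
    for lon, lat in points:
        tiles[lonlat_to_tile(lon, lat, zoom)].append((lon, lat))
    return dict(tiles)


# Illustrative coordinates only, not taken from the datasets in this post.
sample_points = [(-90.0, 45.0), (-100.0, 50.0), (90.0, -45.0)]
tiles_at_zoom_1 = group_by_tile(sample_points, zoom=1)
```

Once points are bucketed this way, a web map only needs to request the handful of tiles covering the current viewport, which is why the total dataset size stops being the bottleneck.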
Furthermore, once it’s on the map, we can also easily build analysis layers on top. For example, we can filter out settlements that already have access to health facilities. This allows users to focus primarily on areas that are not within a certain distance of a health facility, with the distance threshold chosen by the user.
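The distance filter described above can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not the query used in the demo: the record layout (dicts with `lat` and `lon` keys), the helper names `haversine_km` and `underserved`, and the sample coordinates are all hypothetical. At the scale of millions of rows you would use a spatial index or a spatial join in the data warehouse instead.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def underserved(settlements, facilities, max_km):
    """Keep settlements with no facility within max_km (brute force, O(n * m))."""
    return [
        s for s in settlements
        if all(
            haversine_km(s["lat"], s["lon"], f["lat"], f["lon"]) > max_km
            for f in facilities
        )
    ]


# Made-up sample records for illustration only.
settlements = [
    {"id": "A", "lat": 14.60, "lon": 120.98},
    {"id": "B", "lat": 15.60, "lon": 120.98},
]
facilities = [{"id": "F1", "lat": 14.61, "lon": 120.99}]

gaps = underserved(settlements, facilities, max_km=5.0)
```

Here settlement A sits roughly a kilometer and a half from the facility and is dropped, while settlement B is over 100 km away and remains in the result. Changing `max_km` plays the role of the user-chosen distance.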
We can even quantify the vulnerable population within an area by using our drawing tools to select custom areas of interest, and easily summarize the data based on that.
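As a rough illustration of summarizing data inside a drawn area, the sketch below sums a population field over the points that fall within a polygon, using the standard ray-casting point-in-polygon test. It treats coordinates as planar, which is a simplification, and the names `point_in_polygon` and `population_in_area`, the `population` field, and the sample values are all hypothetical and not part of the demo.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def population_in_area(settlements, polygon):
    """Sum the population of settlements located inside the polygon."""
    return sum(
        s["population"]
        for s in settlements
        if point_in_polygon(s["lon"], s["lat"], polygon)
    )


# Made-up area of interest and settlement points for illustration only.
area = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
points = [
    {"lon": 1.0, "lat": 1.0, "population": 10},
    {"lon": 3.0, "lat": 1.0, "population": 5},
]

total = population_in_area(points, area)
```

Only the first point lies inside the square, so only its population is counted toward the total.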
Another advantage is that we can generalize the same methodology across many different types of use cases and datasets. For example, at Thinking Machines, we use Machine Learning and AI to extract wealth information from satellite images at scale. We’re able to combine our extracted wealth information with building infrastructure to allow our telecommunications partners to identify ideal locations for cell sites based on their target wealth profiles and potential customer volume.
And now, with BigQuery Tiler, we’re able to collect that information without having to worry about the scale and compute required to visualize and process the data.