Australian mapping organisation NGIS offers a free mapping kit at https://content.ngis.com.au/carto-free-mapping-kit
In it, the instructor works through a common GIS problem: calculating how many vehicles are needed to serve a customer base efficiently. The tutorial uses the awesome tools available at carto.com.
Bob’s Battery Buzzers is a roadside assistance company with a strong presence in Melbourne. Looking to expand to WA, they bought a struggling company that has existing clients.
Now that they have completed the purchase they are looking to take their model and bring it to the West. The challenge is that CEO Bob doesn’t know how many vehicles would be needed to service the existing client base. Eager to make a good impression within budget, Bob needs to know whether seven cars would be enough for the whole city or if he needs to invest more up front.
You’ve been given the task and think that using a mapping product like CARTO would help you give Bob advice. Bob has given you the home addresses of the existing clients in Perth and wants a recommendation on how many cars to set up. Using the mapping tool (and the video instructions to guide you) produce a map that shows where the cars will be stationed and how they would cover the city.
But what if you’re a user of QGIS? Can you solve the same problem with that? Yes, of course!
If you haven’t done so already, sign up for the free mapping kit. They won’t spam you, and will provide the customer database needed for this exercise. I won’t provide the raw file here, sorry.
Upload the CSV to QGIS
NGIS provides a CSV with data of some 38,335 data points spread over the Perth metropolitan area.
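Before importing, the file can be sanity-checked outside QGIS. The following is a minimal sketch in plain Python, assuming the coordinate columns are named Latitude and Longitude as they appear in the import dialog. The two sample rows are made up for illustration and are not taken from the NGIS file.

```python
import csv
import io

# Hypothetical two-row sample; the real file is the one NGIS sends you.
SAMPLE = """Latitude,Longitude
-31.9505,115.8605
-32.0569,115.7439
"""

def read_points(handle, lat_field="Latitude", lon_field="Longitude"):
    """Return a list of (lon, lat) tuples from a CSV file-like object."""
    points = []
    for row in csv.DictReader(handle):
        points.append((float(row[lon_field]), float(row[lat_field])))
    return points

points = read_points(io.StringIO(SAMPLE))

# WGS84 sanity check: longitudes in [-180, 180], latitudes in [-90, 90].
in_range = all(-180 <= lon <= 180 and -90 <= lat <= 90 for lon, lat in points)
print(len(points), in_range)
```

To check the real file, replace `io.StringIO(SAMPLE)` with an open file handle, for example `open(path, newline="")`.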
Start a new Project in QGIS, and add a Delimited Text Layer (Ctrl+Shift+T) to import the customer CSV.
Locate the Map_kit_demo_customer_locations.csv file sent to you by NGIS. Ensure that Latitude and Longitude are pre-filled, and that the projection is WGS84:
Rename the imported file to customers to make things easier by right-clicking the layer name in the Layers pane:
Add a Base Layer
To get some context, a neutral base layer is always useful. Add a new XYZ Tile Layer that references ESRI’s Light Gray basemap. The URL you will need is
Reorder the layers so the basemap is at the bottom. By default, new layers are added to the top of the stack in QGIS.
Restyle the customer points so we can see what is going on. Choose a small point with no outline. Here I’ve used 0.5mm markers with fill color #d7191c with No Pen stroke style. This is what 33k markers look like!
Calculate Clusters of Points
OK it’s time to quantify the patterns lurking in the data with some analysis.
We’re going to use clustering to group our customers into areas derived from the geographic relationship between points. The NGIS tutorial uses seven areas, so we will too. Seven is the number of cars that Bob the owner thinks he will need to service the Perth area.
Open the Processing Toolbox, if it is not already visible, with Ctrl+Alt+T or via Processing > Toolbox.
Search for “cluster” in the Processing Toolbox, and select Vector analysis > K-means clustering.
K-means clustering: Calculates the 2D distance based k-means cluster number for each input feature.
If input geometries are lines or polygons, the clustering is based on the centroid of the feature.
- Change the Number of clusters to 7
- Leave the Cluster field name as CLUSTER_ID
- Click Run
There are at least 2 duplicate customer points, but that won’t impact results greatly.
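To make the clustering step less of a black box, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It illustrates the general technique only. The random initialisation, the iteration cap and the toy points are assumptions for this example and are not necessarily what the QGIS tool does internally.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm on 2D points; returns (cluster id per point, centres)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # assumption: random initial centres
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre (squared 2D distance).
        labels = [
            min(range(k),
                key=lambda c: (x - centres[c][0]) ** 2 + (y - centres[c][1]) ** 2)
            for x, y in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        if new_centres == centres:
            break  # converged: no centre moved
        centres = new_centres
    return labels, centres

# Toy data: two obvious groups (hypothetical coordinates).
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centres = kmeans(toy, k=2)
print(labels)
```

The list of labels plays the role of the CLUSTER_ID field: one cluster number per input point.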
A new layer called Clusters is added to the map. Rename the layer to “customers by clusterid” ready for some styling.
Style the “customers by clusterid” layer
- Change the method to Categorized
- Set the Value to CLUSTER_ID
- Change the Symbol to a smaller size say 1mm with no outline/stroke
- Leave the Color ramp as Random colors
- Click the Classify button.
The result should look something like this:
Calculate Cluster Centroids
At the moment we have 33 thousand colored points on a map – we don’t have any polygons or areas that outline the region covered by each cluster.
Let’s fix that by creating polygons from the data using a processing tool called Minimum bounding geometry.
Carto’s analysis tool called “Find centroids of geometries” presumably wraps the process of creating polygons and determining centroids into one tool.
Create polygons of clusters
- Go to the Processing Toolbox and search for Minimum bounding geometry.
- Double-click the branch to open the dialog.
- Change the optional Field dropdown, which groups points into separate polygons, to CLUSTER_ID
- Change Geometry type to Convex Hull
- Click Run
- When completed (~1 second), click Close
Back in the Layers pane a new layer has been added called Bounding geometry
- Reorder the layers so the Bounding geometry layer is below the customers layer, but above the ESRI Gray (light) layer.
- Rename the Bounding geometry layer to cluster_polygons
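For the curious, the convex hull step can be sketched in plain Python using Andrew's monotone chain algorithm, grouping points by cluster id first. This shows the general idea only and is not the QGIS implementation. The function names and toy coordinates are hypothetical.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # > 0 when o -> a -> b turns left (counter-clockwise)
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]

def hulls_by_cluster(points, labels):
    """Group points by cluster id and build one hull per group."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return {lab: convex_hull(members) for lab, members in groups.items()}

# Toy data: cluster 0 is a square with one interior point, cluster 1 a triangle.
toy_points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (10, 10), (12, 10), (11, 12)]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1]
hulls = hulls_by_cluster(toy_points, toy_labels)
print(hulls[0])
```

The interior point (1, 1) does not appear in the hull, which is why a convex hull gives a clean outline for each cluster.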
We now have polygons for each car in the fleet:
With our polygons ready, we can now determine the centroid, or “middle”, of each polygon.
Create cluster centroids
- In the Processing Toolbox, search for “centroid” and then double-click the branch Centroids under Vector geometry.
- Change the Input layer to our newly created cluster_polygons layer
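- Check the box for Create centroid for each part. The tool already creates one centroid per feature rather than one for the whole layer; this option additionally creates one per part of any multipart geometry. Our convex hulls are single-part, so we get one centroid per cluster either way.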
- Click Run
- Click Close
There is a new layer added called Centroids. Rename it to cluster_centroids
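As a rough illustration of what a polygon centroid is, the sketch below computes the area-weighted centroid of a simple polygon with the shoelace formula. It is a plain-Python approximation of the idea, not the QGIS code, and the toy polygon is made up.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as a list of (x, y) vertices."""
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon has no area centroid")
    # Centroid = sum / (6 * area) and area = area2 / 2, hence the factor 3.
    return cx / (3 * area2), cy / (3 * area2)

# A 4 x 2 rectangle with an extra vertex on its top edge.
ring = [(0, 0), (4, 0), (4, 2), (2, 2), (0, 2)]
centroid = polygon_centroid(ring)
vertex_mean = (sum(x for x, _ in ring) / len(ring),
               sum(y for _, y in ring) / len(ring))
print(centroid, vertex_mean)
```

The centroid depends on the shape of the polygon, not on how many vertices it has, so it differs from the simple mean of the vertices. For the same reason, the centroid of a cluster's hull is generally not the same location as the mean of the customer points in that cluster.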
Style the Cluster Centroids
- Open the Layer Styling pane
- Leave the dropdown as Single Symbol, but select the “Simple marker” branch.
- Change the Symbol layer type from Simple marker to SVG Marker
- In the SVG Groups section, select shopping to filter
- Choose the car symbol
- Change the Fill color to red or similar
- (There should be No Stroke)
- Change the Size to Width 10 and Height 10 with Unit of Millimeters
Change back to the Layers pane, and un-check cluster_polygons and your map should look something like:
Create Areas of Influence
One of the project assumptions is that Bob’s cars “can only reliably cover a circle around them of about 8 kilometre radius” based on experience from Bob’s other work. We’ll use this to see what gaps we may have in our coverage of Perth.
So let’s create circles, centred on each cluster centroid, that have an 8 kilometre radius. But first, we need to change the projection of our centroid layer to one that uses metres instead of degrees. We’ve been using EPSG:4326 up to this point, which uses degrees as the unit, so we need to reproject to EPSG:7850 or another metres-based projection.
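To see why degrees are a poor unit for an 8 km radius, here is a small back-of-the-envelope sketch. It assumes a spherical Earth with a mean radius of 6,371 km and a latitude of about -31.95 for Perth. Both are approximations chosen for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation

def metres_per_degree(lat_deg):
    """Approximate ground length of one degree of longitude and of latitude."""
    per_deg_lat = math.pi * EARTH_RADIUS_M / 180
    per_deg_lon = per_deg_lat * math.cos(math.radians(lat_deg))
    return per_deg_lon, per_deg_lat

lon_m, lat_m = metres_per_degree(-31.95)  # roughly Perth's latitude (assumption)
radius_m = 8000
print(f"8 km is about {radius_m / lon_m:.4f} degrees east-west")
print(f"8 km is about {radius_m / lat_m:.4f} degrees north-south")
```

Because a degree of longitude is shorter on the ground than a degree of latitude at this latitude, a "circle" defined in degrees would not be a true 8 km circle. A metres-based projection avoids the problem.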
Reproject Centroids to a Metres-based Projection
- In the Processing Toolbox, search for reproject
- In the Vector general branch, double-click on Reproject layer
- Change the Target CRS to EPSG:7850 – GDA2020 / MGA Zone 50
- Click Run
- Click Close
We now have a new layer called Reprojected. Rename it to cluster_centroids_7850
Create Buffers around Centroids
- In the Processing Toolbox, search for buffer
- In the Vector geometry branch, double-click on Buffer
- Ensure the Input layer is cluster_centroids_7850
- Change the Distance to 8000 and leave the units as meters
- Leave the Segments, End cap style, Join style, and Miter limit as is.
- Check Dissolve result so that overlapping circles are merged into a single coverage area
- Click Run
- Click Close
Optional: The map is updated and looks a bit “wonky” – change the projection of the map by clicking in the bottom right of the QGIS screen, and selecting EPSG:7850
- Set Buffered to be
- Color: #e5e5e5 (grey)
- Opacity: 50% so we can see underneath
- Rename Buffered to areas_of_influence
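A buffer around a point is stored as a polygon, not a true circle. The sketch below, in plain Python, approximates a circle with straight segments, on the assumption that the Segments setting means the number of segments per quarter circle. The centre coordinates are hypothetical.

```python
import math

def buffer_point(x, y, distance, segments=5):
    """Approximate a circular buffer as a polygon ring.

    `segments` is the number of straight segments per quarter circle.
    """
    n = 4 * segments
    return [
        (x + distance * math.cos(2 * math.pi * i / n),
         y + distance * math.sin(2 * math.pi * i / n))
        for i in range(n)
    ]

def ring_area(ring):
    """Unsigned polygon area via the shoelace formula."""
    total = 0.0
    for i in range(len(ring)):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % len(ring)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

ring = buffer_point(1000.0, 2000.0, 8000.0)  # hypothetical centre, in metres
ratio = ring_area(ring) / (math.pi * 8000.0 ** 2)
print(len(ring), round(ratio, 3))
```

With five segments per quarter circle, the polygon covers slightly less area than a true circle. Increasing the value gives a smoother outline at the cost of more vertices.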
With the areas of influence circles set to 8km, there are lots of customer data points that are outside the circles and not easily reachable.
So now we see that 7 cars is not going to be enough to cover the Perth metro area. The next question is – what is the optimum number of cars required?
Rerun the above step with 12 clusters, and see what happens.
In Carto, this step is very clever in that it reruns the other required steps after we define how many clusters we want. In QGIS, this can be automated too, using Python and/or the Model Builder, and we will do that in another article.
To run with 12 clusters, the core steps were:
- K-means Clustering of customers
- Confirm it worked by styling Clusters by CLUSTER_ID
- Create cluster polygons with Minimum bounding geometry on Clusters using CLUSTER_ID field and Convex Hull geometry type
- Generate the centroids using Centroids on Bounding geometry with Create centroid for each part checked
- Reproject Centroids using Reproject layer set to EPSG:7850 Zone 50
- Create Buffer on Reprojected of 8000 metres Dissolved.
Using 12 cars, we generate the following analysis:
What percentage of customers are covered in each scenario of 7, 12 and 14 cars?
By using the “Count points in polygon” function in the Processing Toolbox we can determine how many customers can be reached by the fleet.
The percentages of customers that are within the generated Areas of Influence are:
So with 14 cars we can reach 87% of the existing customer base by staying within the 8km radius.
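The coverage figure boils down to a simple question: what share of customers lie within 8 km of at least one car? Here is a minimal sketch of that calculation in plain Python. It assumes coordinates already in a metres-based projection and uses true circles rather than buffer polygons, so results could differ very slightly from the QGIS count. The toy coordinates are invented.

```python
import math

def coverage_percent(customers, centres, radius=8000.0):
    """Share of customers within `radius` of at least one centre (projected metres)."""
    if not customers:
        return 0.0
    covered = sum(
        1 for cx, cy in customers
        if any(math.hypot(cx - sx, cy - sy) <= radius for sx, sy in centres)
    )
    return 100.0 * covered / len(customers)

# Toy data: two cars 20 km apart and five customers (hypothetical).
cars = [(0.0, 0.0), (20000.0, 0.0)]
customers = [
    (1000.0, 1000.0),    # close to the first car
    (7999.0, 0.0),       # just inside the first circle
    (10000.0, 0.0),      # midway between the cars, outside both circles
    (20000.0, 8000.0),   # exactly on the edge of the second circle
    (30000.0, 30000.0),  # far from both cars
]
pct = coverage_percent(customers, cars)
print(pct)  # 60.0
```

Running this for each set of centroids (7, 12 and 14 cars) gives one coverage percentage per scenario.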
This analysis could be refined using time-based areas of influence, and perhaps it could be automated to rerun each step as required without input.
Save the Project, if you haven’t yet done so.
We also need to save all the temporary layers. They’re indicated by a computer chip icon in the Layers pane, meaning the layer is stored in memory only. If we don’t save them, they will be deleted when we exit QGIS, which is not fun for anybody! Alternatively, use the plugin called Memory Layer Saver, which saves all temporary layers in a portable binary format and reloads them when the project is opened.
- Right-click the temporary layer, and choose Make Permanent from the menu:
- Choose GeoPackage as the most flexible format
Use groups in the Layers panel to group the 7 car, 12 car and 14 car analyses.
The final project Layers may look something like this: