Technical blog: Hosting Vector Tiles Securely

(Ed)itor’s note

In this blogpost, Hilman outlines details of one of the technical challenges we have recently faced at MobSciLab – hosting vector tiles while preserving data security. This post provides a step-by-step breakdown on how to do this successfully, but if you have any questions then get in touch!

Mobility data is often processed in vector format, whether as road networks, points, or polygons. Vector data can store multiple attribute values (or columns) and is represented in geometries stored in formats such as .geojson, .shp, or .gpkg. This data is commonly used for visualization in web maps or dashboards, helping users better understand its contents. To present or render vector data on a web map, we need to convert it into vector tiles.

This article provides a step-by-step tutorial on how we can present vector data in a web map—from converting a .geojson file into vector tiles, securely hosting them in Microsoft Azure, and finally loading them into a web map using Mapbox GL JS.

What are Vector Tiles?

Vector tiles have gained popularity in the past decade for presenting spatial data in web maps due to their ease of restyling on the client side, smoother transitions, and lighter file sizes. Before we dive into creating vector tiles, let’s first understand what they are.

Web maps allow users to interact with vector data at different zoom levels. The concept behind vector tiles is essentially slicing or “tiling” our data at various zoom levels. In web mapping, the lowest zoom level (fully zoomed out) is 0, and each higher level zooms further in; Mapbox GL JS, for example, allows zooming in to level 22 by default. For example:

  • At zoom level 0, our vector data is stored in a single tile.
  • At zoom level 1, the same area is split into four tiles, increasing resolution and detail.
  • At zoom level 2, it is split into 16 tiles.
  • And so on.

The idea is to store vectors in these tiles so that when we view a specific location on the map, our computer (client-side) retrieves only the necessary tiles and renders the vectors efficiently.

Vector Tiles Pyramid

Part 1: Creating Vector Tiles

1. Converting to GeoJSON

Let’s assume we have spatial data in any format, such as .gpkg, .geojson, or .shp. The first step is to convert our vector data into GeoJSON using the EPSG:4326 (WGS 84) coordinate reference system.

GeoJSON is the standard input format for Tippecanoe, the tool we will use to convert vector data into vector tiles. Tippecanoe expects GeoJSON coordinates in EPSG:4326 (longitude/latitude) by default, so reprojecting first ensures the tiles line up correctly in web maps.

We can convert our vector data to GeoJSON using various GIS tools or programming languages, including:

  • QGIS
  • ArcGIS
  • R
  • Python
  • etc

In this tutorial, I will show how to do it in Python.

# import package
import geopandas as gpd

# Load file
df = gpd.read_file("input_path") #can be .shp, .gpkg, or .geojson

# Change to CRS EPSG:4326
df = df.to_crs(4326)

# Save into GeoJSON file
df.to_file("filename.geojson", driver="GeoJSON")

2. Slicing Vector Data into Vector Tiles

In this step, we will slice our GeoJSON vector data into vector tiles, which will be stored as .pbf files. Each tile will be saved as a separate file, resulting in a structured folder containing multiple files in the format z/x/y.pbf, where:

  • z → Zoom Level
  • x → Tile column (longitude)
  • y → Tile row (latitude)

At the end of this process, we will have an output folder that serves a single vector layer, with multiple .pbf files representing different zoom levels and spatial sections.
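To make the z/x/y addressing concrete, here is a minimal Python sketch that works out which tile contains a given longitude/latitude. It assumes the standard XYZ (slippy map) scheme, where zoom 0 is a single tile and each level doubles the number of tiles per axis; the function name lonlat_to_tile is ours, not part of any library.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple:
    """Return the (x, y) indices of the tile containing a WGS 84 point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    # Web Mercator projection of the latitude, scaled to the tile grid
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lon = 180 or extreme latitudes stay inside the grid
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_at_zoom(zoom: int) -> int:
    """Total number of tiles covering the world at a zoom level."""
    return 4 ** zoom
```

For example, lonlat_to_tile(0, 0, 1) returns (1, 1), the lower-right of the four tiles at zoom level 1, so the matching file would be 1/1/1.pbf.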

To achieve this, we will use Tippecanoe, a tool that helps us convert vector data into vector tiles.

2.1. Installing Tippecanoe

The Tippecanoe documentation is available in its GitHub repository (https://github.com/mapbox/tippecanoe).

First, we will install Tippecanoe on our system.

2.1.1. Installing on macOS

Use Homebrew to install Tippecanoe by running the following command:

$ brew install tippecanoe

2.1.2. Installing on Windows

On Windows, Tippecanoe must be run in a Linux environment. The steps below follow the Windows installation guide listed under Sources:

Step 1: Install Windows Subsystem for Linux (WSL):

  1. Open a terminal or Command Prompt.
  2. Install WSL by running:
wsl --install
  3. Once installed, access the Linux environment by running:
bash
  4. You should now see a Linux terminal, indicated by a root@... prefix.

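Step 2: Install Tippecanoe on Linux

Once inside the Linux environment, follow these steps to install Tippecanoe:

  1. Install the build tools and libraries Tippecanoe needs (shown here for Ubuntu, the default WSL distribution):
sudo apt-get update && sudo apt-get install -y build-essential libsqlite3-dev zlib1g-dev git unzip
  2. Get the source code, either by downloading a release archive:
wget https://github.com/mapbox/tippecanoe/archive/1.34.0.zip && unzip 1.34.0.zip
     or by cloning the Tippecanoe repository:
git clone https://github.com/mapbox/tippecanoe.git
  3. Navigate to the Tippecanoe directory (named tippecanoe if you cloned the repository):
cd tippecanoe
  4. Compile the software:
make
  5. Install Tippecanoe:
sudo make install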

Step 3: Verify the Installation

To check if Tippecanoe has been successfully installed, run:

tippecanoe --version

or:

tippecanoe -v

If the installation was successful, the command should return the installed version of Tippecanoe.

2.2. Generating Vector Tiles

Now that we have installed Tippecanoe, we can start generating vector tiles from our GeoJSON files by running the following command in Linux:

# General form
tippecanoe --minimum-zoom=5 --maximum-zoom=14 --layer=<layer_name> --output-to-directory=<output-folder-name> --drop-densest-as-needed <input-file-geojson>.geojson
# Example: LSOA polygons, written as uncompressed tiles
tippecanoe --minimum-zoom=5 --maximum-zoom=14 --layer=westyorkshire_lsoa --output-to-directory=westyorkshire_lsoa --drop-densest-as-needed --no-tile-compression westyorkshire_lsoa.geojson
# Example: road network, tiled at higher zoom levels
tippecanoe --minimum-zoom=9 --maximum-zoom=20 --layer=westyorkshire_road --output-to-directory=westyorkshire_road --drop-densest-as-needed westyorkshire_road.geojson

Here is the command breakdown:

  • tippecanoe: Executes the Tippecanoe command.
  • --minimum-zoom & --maximum-zoom: Specify the zoom levels for which tiles are generated. No tiles exist outside this range, so nothing is displayed below the minimum zoom; above the maximum zoom, Mapbox GL JS can keep displaying the tiles from the maximum zoom level if the source's maxzoom is set to match. Alternatively, use -zg to let Tippecanoe choose a suitable maximum zoom level automatically.
  • --layer: Specifies the layer name, which we will need when referring to the vector layer in Mapbox GL JS. We can use any layer name we want.
  • --output-to-directory: Defines the output folder name. For example, if <output-folder-name> is msoa_boundaries, the structured folder will be msoa_boundaries/z/x/y.pbf.
  • --drop-densest-as-needed: Removes the least-visible features at lower zoom levels if a tile would otherwise be too large. This flag is optional.
  • --no-tile-compression: Prevents Tippecanoe from compressing .pbf files using GZIP. This is useful when hosting vector tiles on GitHub, as GitHub does not support modifying HTTP headers for gzip encoding. This flag is optional; without it, Tippecanoe gzips the tiles automatically.
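If we prefer to drive Tippecanoe from Python rather than typing the command by hand, the flags above can be assembled as an argument list. The sketch below is only one way to do it: the helper names build_tippecanoe_command and run_tippecanoe are our own, the defaults mirror the example commands, and it assumes tippecanoe is available on the PATH.

```python
import subprocess


def build_tippecanoe_command(input_file, layer, output_dir,
                             min_zoom=5, max_zoom=14, compress=True):
    """Assemble the Tippecanoe call described above as an argument list."""
    cmd = [
        "tippecanoe",
        f"--minimum-zoom={min_zoom}",
        f"--maximum-zoom={max_zoom}",
        f"--layer={layer}",
        f"--output-to-directory={output_dir}",
        "--drop-densest-as-needed",
    ]
    if not compress:
        # Write plain .pbf files, e.g. for hosts where headers cannot be set
        cmd.append("--no-tile-compression")
    cmd.append(input_file)
    return cmd


def run_tippecanoe(cmd):
    """Run the command and raise an error if Tippecanoe fails."""
    subprocess.run(cmd, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with file names that contain spaces.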

GZIP vs. Non-GZIP

GZIP is a compression standard that reduces the size of .pbf files, saving storage space and speeding up tile delivery. We therefore strongly recommend using gzipped tiles. However, for a browser to correctly interpret a gzipped file, the HTTP response header must include:

Content-Encoding: gzip

This HTTP header tells the user's browser and Mapbox GL JS that the file is gzipped, so the browser decompresses (un-gzips) it before use.

If hosting on GitHub, it is recommended to use non-gzipped files since GitHub does not allow setting custom HTTP headers.

We will come back to setting this header in the hosting part.
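When we are unsure whether an existing tileset was written with or without compression, we can inspect a tile directly. A gzip stream always starts with the two bytes 0x1f 0x8b, so a minimal check (with helper names of our own choosing) could look like this:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(data: bytes) -> bool:
    """Return True if the bytes start with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def tile_is_gzipped(path) -> bool:
    """Check a single .pbf file on disk."""
    with open(path, "rb") as f:
        return is_gzipped(f.read(2))


if __name__ == "__main__":
    # Small self-check with a freshly gzipped payload and a plain one
    print(is_gzipped(gzip.compress(b"demo")), is_gzipped(b"demo"))
```

If the check returns True for our tiles, they need the Content-Encoding: gzip header when served; if it returns False, that header must not be set.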

Generating Multiple Vector Tiles in a Batch

If multiple GeoJSON files are in the same folder, use the following command to generate vector tiles for all of them:

# General version: one tileset per GeoJSON file
for file in *.geojson; do
    filename=$(basename "$file" .geojson)  # Extract filename without extension
    layer_name="${filename}-layer"  # Define a general layer name
    output_dir="${filename}_tiles"  # Define output directory

    mkdir -p "$output_dir"  # Ensure the output directory exists
    tippecanoe --output-to-directory="$output_dir" --minimum-zoom=5 --maximum-zoom=14 --drop-densest-as-needed --layer="$layer_name" "$file"
done

# Variant for road files named processed_<region>.geojson
for file in *.geojson; do
    filename=$(basename "$file" .geojson)  # Get the filename without extension
    region="${filename#processed_}"  # Extract region name by removing the "processed_" prefix
    layer_name="${region}-road"  # Format the layer name as "<region>-road"
    output_dir="../../04_tileset/01_road/${region}_road"  # Define output directory

    mkdir -p "$output_dir"  # Ensure output directory exists
    tippecanoe --output-to-directory="$output_dir" --minimum-zoom=9 --maximum-zoom=22 --drop-densest-as-needed --layer="$layer_name" "$file"
done

# Variant for LSOA files named lsoa_aggr_<region>.geojson
for file in *.geojson; do
    filename=$(basename "$file" .geojson)  # Get the filename without extension
    region="${filename#lsoa_aggr_}"  # Extract region name by removing the "lsoa_aggr_" prefix
    layer_name="${region}-lsoa"  # Format the layer name as "<region>-lsoa"
    output_dir="${region}_tiles"  # Define output directory

    mkdir -p "$output_dir"  # Ensure output directory exists
    tippecanoe --output-to-directory="$output_dir" --minimum-zoom=5 --maximum-zoom=14 --drop-densest-as-needed --layer="$layer_name" "$file"
done

What these scripts do:

  1. Loop through all .geojson files in the directory.
  2. Extract the filename (without extension) to use as the layer name.
  3. Define an output directory for each file.
  4. Generate vector tiles for each file while maintaining a consistent directory structure.

Notes on the second loop:

  • I use this code to convert GeoJSON files named in the format processed_region.geojson.
  • I want the output directory to be named region_road and the layer to be named region-road.
  • So I have to extract only the region from the filename and put it in the layer name and the output directory name. Hence region="${filename#processed_}", where #processed_ strips the prefix “processed_”.

This method ensures efficient processing of multiple GeoJSON files without manually running separate commands for each file.
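The same naming logic can be expressed in Python, which may be handy if the batch is launched from a notebook instead of a shell. This is a small sketch with a hypothetical helper, tile_names, that mirrors the prefix stripping done by ${filename#processed_}:

```python
from pathlib import Path


def tile_names(geojson_path: str, prefix: str, suffix: str) -> tuple:
    """Return (layer_name, output_dir) derived from a GeoJSON filename.

    Mirrors the shell loop: strip the extension, strip the prefix if
    present, then build "<region>-<suffix>" and "<region>_<suffix>".
    """
    stem = Path(geojson_path).stem  # filename without directory or extension
    region = stem[len(prefix):] if stem.startswith(prefix) else stem
    return f"{region}-{suffix}", f"{region}_{suffix}"
```

For example, tile_names("processed_westyorkshire.geojson", "processed_", "road") gives the layer name westyorkshire-road and the output directory westyorkshire_road.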

Part 2: Hosting the Vector Tiles Securely

Up to this point, we have gzipped .pbf files of our vector tiles that we want to host securely. In this tutorial, we will host our tiles on Microsoft Azure and build an authenticated storage system.

Step Overview:

  1. Upload vector tiles to an Azure Blob Storage container
  2. Modify HTTP headers for proper tile delivery
  3. Configure access settings and generate a Shared Access Signature (SAS) URL

1. Uploading .pbf Files to a Container in Azure

Why Use Azure?

Azure is a scalable and reliable cloud storage solution that allows secure hosting of vector tiles. It supports blob storage, which is ideal for serving large, unstructured datasets like .pbf files. Additionally, Azure provides access control through private containers and SAS tokens, ensuring secure data access.

1.1. Create an Azure Storage Account

What is an Azure Storage Account?

An Azure Storage Account provides access to cloud storage services, including Blob Storage, which is used to store and manage vector tiles. It acts as a container for all storage resources, ensuring organised and secure file management.

Steps to Create an Azure Storage Account

  1. Go to the Azure Portal (https://portal.azure.com/).
  2. Create an account. If you have a university email, you can use it to get a free account.
  3. In the Azure services section, or using the search bar at the top, find Storage accounts and click it.
  4. In the top left corner, click + Create.
  5. Fill in the details:
    1. Subscription: choose your Azure account subscription.
    2. Resource group: choose a name for grouping related resources (e.g., vector-tiles-group).
    3. Storage account name: choose a unique name (e.g., vectortilestorage).
    4. Region: choose the region closest to your users for optimal tile delivery speed.
    5. Primary service: choose Azure Blob Storage.
    6. Performance: choose Standard.
    7. Redundancy: choose Locally-redundant storage (LRS) to keep three copies of the data within the same region. This option is cost-effective while ensuring basic data resilience.
  6. Click Review + Create, then click Create to create the storage account.

1.2. Create Blob Container

After the storage account is created, we are going to create a blob container.

What is a Blob Container?

A blob container in Azure is used for storing large amounts of unstructured data, such as vector tiles. It allows secure access control and efficient distribution of files over the web.

Steps to Create a Blob Container:

  1. On the Home page, click Storage accounts.
  2. Click the storage account that we just created.
  3. In the left-hand panel, under Data storage, click Containers.
  4. Click + Container to add a new container.
  5. Name the container (e.g., tiles).
  6. For the access level, choose Private to ensure secure storage and prevent public scraping.
  7. Click Create.

1.3. Upload PBF to a Container in Azure

Now that we have an empty container ready to store our vector tiles and a folder containing the .pbf files, we can proceed with uploading them:

Steps to Upload Vector Tiles:

  1. Go to your storage account
  2. In the left hand panel, under Data storage, select Containers
  3. Click the container we just made (e.g., tiles)
  4. At the top, click Upload
  5. In the Upload Blob section, drag the whole vector tiles folder
  6. Click Upload to start the process

Once all the .pbf files have been uploaded, a notification will appear in the right-hand panel confirming the successful upload.
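For large tilesets, dragging a folder into the browser can be slow, so the upload can also be scripted. The sketch below is one possible approach using the azure-storage-blob package (installed with pip install azure-storage-blob): the helper names are ours, and it assumes we have the storage account's connection string from the Azure Portal and that blob names should keep the folder name plus the z/x/y structure.

```python
from pathlib import Path


def iter_tile_blobs(tiles_dir: str):
    """Yield (local_path, blob_name) for every .pbf file under tiles_dir.

    The blob name keeps the folder name and the z/x/y structure,
    e.g. westyorkshire_lsoa/12/2030/1318.pbf.
    """
    root = Path(tiles_dir)
    for path in sorted(root.rglob("*.pbf")):
        # as_posix() gives forward slashes, which blob names use on every OS
        yield path, f"{root.name}/{path.relative_to(root).as_posix()}"


def upload_tiles(connection_string: str, container_name: str, tiles_dir: str) -> int:
    """Upload all tiles to the container and return how many were sent."""
    # Imported here so the helper above works without the Azure package
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    container = service.get_container_client(container_name)
    count = 0
    for path, blob_name in iter_tile_blobs(tiles_dir):
        with open(path, "rb") as data:
            container.upload_blob(name=blob_name, data=data, overwrite=True)
        count += 1
    return count
```

Only .pbf files are uploaded here; the metadata.json that Tippecanoe writes next to the tiles is skipped.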


2. Change the Metadata

Although our vector tiles are now hosted in Azure Blob Storage, they will not load directly in Mapbox GL JS. This is because, by default, Azure assigns the content type as application/octet-stream when we upload data. However, vector tile files (.pbf) use the Protocolbuffer Binary Format (PBF), which does not match this default content type.

To ensure Mapbox GL JS correctly interprets our files, we need to modify the HTTP headers of our .pbf files by specifying:

  • Content-Type: application/x-protobuf
  • Content-Encoding: gzip

This tells the browser that our files are in .pbf format and are compressed using gzip, so they are decompressed automatically before Mapbox GL JS uses them.

The HTTP headers can be modified manually by navigating to the Properties of each file. This will display the following properties menu:

Blob Properties

However, since a single vector tile set can contain hundreds or even thousands of .pbf files, manually updating each file would be extremely time-consuming. Instead, we will automate the process using the Azure SDK in Python.

Azure SDK is a collection of libraries that allow us to interact with Microsoft Azure programmatically. Specifically, the Azure Storage Blob library enables us to manage Azure Blob Storage, including modifying file properties.

To interact with Azure SDK in Python, we need a connection string, which serves as a key to authenticate and interact with a specific storage account. This connection string must remain private and should not be shared.

2.1. Retrieving the Connection String

  1. Open our storage account in the Azure Portal.
  2. In the left-hand pane, under “Security + networking”, click “Access keys”.
  3. Under “key1”, click Show and copy the Connection string.
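Because the connection string grants full access to the storage account, it is safer not to paste it into a script that might be shared or committed. A minimal sketch, assuming we store it in an environment variable (the name AZURE_STORAGE_CONNECTION_STRING is a common convention, and get_connection_string is our own helper):

```python
import os


def get_connection_string(var_name: str = "AZURE_STORAGE_CONNECTION_STRING") -> str:
    """Read the connection string from the environment instead of the source code."""
    value = os.environ.get(var_name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return value
```

The returned value can then be passed wherever a script expects the connection string.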

2.2. Write Python code for Azure SDK

Before writing the script, ensure that the Azure Storage Blob package (azure-storage-blob) is installed. You can do this by running the following command in your terminal or Python environment:

pip install azure-storage-blob

Python Code to Change the HTTP Headers

You can write this script in any Python environment, such as Jupyter Notebook, VS Code, or a simple Python script (.py file) on your local machine.

Once the package has been installed, use the following Python script to update the HTTP headers:

# Import necessary libraries
from azure.storage.blob import BlobServiceClient, ContentSettings

# Replace with your actual Azure Storage connection string
CONNECTION_STRING = "<your_connection_string>"
CONTAINER_NAME = "<your_container_name>"

# Define the desired content type and content encoding
NEW_CONTENT_TYPE = "application/x-protobuf"  # Correct MIME type for .pbf files
NEW_CONTENT_ENCODING = "gzip"  # Tippecanoe gzips tiles unless --no-tile-compression was used

# Create a BlobServiceClient
blob_service_client = BlobServiceClient.from_connection_string(CONNECTION_STRING)
container_client = blob_service_client.get_container_client(CONTAINER_NAME)

# List all blobs in the container
blobs_list = container_client.list_blobs()

# Loop through and update .pbf files
for blob in blobs_list:
    if blob.name.endswith(".pbf"):  # Check if the file is a .pbf file
        blob_client = container_client.get_blob_client(blob.name)
        blob_client.set_http_headers(
            content_settings=ContentSettings(content_type=NEW_CONTENT_TYPE, content_encoding=NEW_CONTENT_ENCODING)
        )
        print(f"Updated: {blob.name} -> Content-Type: {NEW_CONTENT_TYPE}, Content-Encoding: {NEW_CONTENT_ENCODING}")

print("Update complete!")

  • Replace <your_connection_string> with the connection string we just copied.
  • Replace <your_container_name> with the name of our Azure Blob container (e.g., tiles).

In short, the script:

  1. Connects to our Azure Blob Storage using the BlobServiceClient.
  2. Retrieves a list of all blobs (files) in the specified container.
  3. Loops through the files and updates Content-Type and Content-Encoding for .pbf files.

Once the script finishes running, it will print "Update complete!" in the terminal. You can verify that the HTTP headers have been modified by checking the Properties section of your .pbf files in Azure.

3. Generate a SAS URL from the Azure Portal

All the z/x/y.pbf files for our vector tiles have been uploaded to Azure, and the HTTP headers for each file have been configured. Now, we are ready to use these vector tiles in Mapbox GL JS. We will call our vector tiles using URL format. However, since our container is set to private, we need a Shared Access Signature (SAS) to securely access the resources in Azure Storage. A SAS token grants delegated access to specific resources within Azure.

Here are the steps to generate the SAS URL:

  1. Navigate to the Storage Account.
  2. In the left-hand pane, under Data storage, click Containers.
  3. In the container list, right-click the target container where the vector tiles are stored.
  4. Click Generate SAS.
  5. Fill in the form:
    • Permissions: check Read.
    • Start: set the date when we want access to begin.
    • Expiry: set the date when access should end (it is recommended to set a far-future expiry date for long-term access).
    • Allowed protocols: choose HTTPS only to make it secure.
  6. Click Generate SAS token and URL.
  7. Copy the Blob SAS URL.
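The same kind of token can also be created programmatically, which is useful when tokens need to be renewed regularly. The following is a sketch rather than a definitive recipe: it uses generate_container_sas from azure-storage-blob, assumes we have the storage account name and account key (the key shown alongside the connection string in the portal), and the helper names and the 30-day default are our own choices.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def sas_window(valid_days: int, now: Optional[datetime] = None):
    """Return (start, expiry) UTC datetimes for a token valid for valid_days."""
    start = now or datetime.now(timezone.utc)
    return start, start + timedelta(days=valid_days)


def make_read_only_container_sas(account_name: str, account_key: str,
                                 container_name: str, valid_days: int = 30) -> str:
    """Create a read-only, HTTPS-only SAS token for one container."""
    # Imported here so sas_window can be used without the Azure package
    from azure.storage.blob import ContainerSasPermissions, generate_container_sas

    start, expiry = sas_window(valid_days)
    return generate_container_sas(
        account_name=account_name,
        container_name=container_name,
        account_key=account_key,
        permission=ContainerSasPermissions(read=True),  # same as ticking Read
        start=start,
        expiry=expiry,
        protocol="https",  # same as choosing HTTPS only
    )
```

The function returns only the token, i.e. the part that goes after the ? in the SAS URL.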

4. Visualising the Vector Tiles in Mapbox GL JS

Once we have the SAS URL, we can integrate our vector tiles into a web map using Mapbox GL JS, a client-side JavaScript library for visualizing interactive maps. The vector tiles will act as a source layer on the web map, overlaying the base map.

We will place our vector tile SAS URL in map.addSource to call and render the vector tiles in Mapbox. However, we need to modify our SAS URL to match the vector tile URL format required by Mapbox.

Understanding the Blob SAS URL Structure

The SAS URL has two parts, separated by a ? delimiter:

  • Storage Resource URI: the path to the location of our tiles that will be read by Mapbox.
  • SAS Token: a unique string of characters that grants access to the tile resources.

For a SAS generated on a container, the Storage Resource URI follows this format:

https://<storage_account>.blob.core.windows.net/<container_name>

Where:

  • storage_account: name of our storage account
  • container_name: name of our container

The tiles themselves sit below this URI in <vector_tiles_directory>, the path to the vector tiles directory inside the container.

4.1. Modifying SAS URL

To use the SAS URL in Mapbox GL JS, we need to modify it by appending <vector_tiles_directory>/{z}/{x}/{y}.pbf before the ? delimiter.

Thus, the final SAS URL format will be:

https://<storage_account>.blob.core.windows.net/<container_name>/<vector_tiles_directory>/{z}/{x}/{y}.pbf?<SAS_Token>

Where:

  • {z}/{x}/{y}.pbf: placeholder syntax showing where the actual tile coordinates are inserted dynamically. For a specific tile it becomes, e.g., 12/2048/1366.pbf, meaning the tile at zoom 12, column 2048, row 1366.
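Editing the URL by hand is easy to get wrong, so here is a small Python sketch of the same transformation. It assumes the copied Blob SAS URL ends at the container name before the ?; build_tile_url and tile_url are hypothetical helper names, and the account and token values in the example are made up.

```python
from urllib.parse import urlsplit


def build_tile_url(blob_sas_url: str, tiles_dir: str) -> str:
    """Insert <tiles_dir>/{z}/{x}/{y}.pbf before the SAS token."""
    parts = urlsplit(blob_sas_url)
    if not parts.query:
        raise ValueError("expected a SAS URL with a token after '?'")
    base = f"{parts.scheme}://{parts.netloc}{parts.path.rstrip('/')}"
    # Doubled braces keep the literal {z}/{x}/{y} placeholders for Mapbox
    return f"{base}/{tiles_dir.strip('/')}/{{z}}/{{x}}/{{y}}.pbf?{parts.query}"


def tile_url(template: str, z: int, x: int, y: int) -> str:
    """Fill the placeholders for one tile, e.g. to test a URL in the browser."""
    return (template.replace("{z}", str(z))
                    .replace("{x}", str(x))
                    .replace("{y}", str(y)))
```

For example, build_tile_url("https://acct.blob.core.windows.net/tiles?sp=r&sig=abc", "westyorkshire_lsoa") returns https://acct.blob.core.windows.net/tiles/westyorkshire_lsoa/{z}/{x}/{y}.pbf?sp=r&sig=abc, and the token is passed through unchanged.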

Example of a vector tile URL (the SAS token values are truncated here):

<https://vectortilesstorage.blob.core.windows.net/lsoa-tiles/westyorkshire-lsoa/{z}/{x}/{y}.pbf?sp=r&st=20-02-1:10:00Z&se=2030-0159:59Z&spr=https&sv=2022-11-02&sr=bg>

4.2. Adding the Vector Tile Source in JavaScript

Here’s the JavaScript code snippet to add the vector tile source to Mapbox GL JS:

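map.addSource('vector-tiles', {
    type: 'vector',
    tiles: [
        'https://<storage_account>.blob.core.windows.net/<container_name>/<vector_tiles_directory>/{z}/{x}/{y}.pbf?<SAS_Token>'
    ],
    minzoom: 5,  // match the --minimum-zoom used in Tippecanoe
    maxzoom: 14  // match the --maximum-zoom; higher zooms reuse these tiles
});
// A source alone draws nothing, so add a layer that reads from it
map.addLayer({
    id: 'vector-tiles-layer',  // any unique id
    type: 'line',  // draws lines and polygon outlines
    source: 'vector-tiles',
    'source-layer': '<layer_name>'  // the --layer name given to Tippecanoe
});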

By following these steps, you can successfully integrate Azure-hosted vector tiles into a Mapbox GL JS web application.

Sources

Transforming GeoJSON into Vector Tiles and Serving via Azure Blob Storage to UI | by Manikandan Thangaraj | Medium

Manage properties and metadata for a blob with .NET – Azure Storage | Microsoft Learn

Build Your Own Static Vector Tile Pipeline – Geovation Tech Blog

https://github.com/ITSLeeds/VectorTiles

Install Tippecanoe to make Mapbox Vector Tiles on a Windows 10 OS machine
