Data Collection

Collecting Public Transport Data in Gauteng: One of Africa’s Largest Urban Regions

South Africa’s Gauteng province contains an urban conglomeration made up of the municipalities of Johannesburg, Tshwane and Ekurhuleni. Home to nearly 13 million people, it is one of the largest urbanised regions in Africa. In 2017 we first collected 44,000 km of route data from the city region’s informally run public transport network, offering access to organisations looking to gain value from locally relevant information.

Extending deep into the peripheries of this sprawling city region, informally run public transport serves parts of Gauteng that the formal network currently doesn’t reach. South Africa’s ubiquitous white minibus taxis address the demand for mobility created by the distance between people’s homes and the economic hubs they work in, a separation that remains a dominant post-apartheid spatial influence.

Access to data is inconsistent across public transport modes. Formal services are often fully or partially digitalised, sharing timetables, fare information, and journey planning services with passengers. Reliable data about informally run modes, however, is lacking. The province’s 2,800 routes serve millions of people, most of whom rely on their personal knowledge of the system, word of mouth, or the occasional signboard to work out where they are going. Individuals and businesses looking to benefit from contextual and locally relevant information find that data from the city region’s major mobility network is unavailable.

This imbalance partly originates from the fact that informally run public transport has largely been excluded from the benefits of digitalisation. As a result, the system itself is harder to understand and use, and services that depend on reliable and accessible data about it are impossible to develop. Overcoming this digital exclusion is a huge opportunity – for cities, citizens, and businesses. Collecting data from informally run public transport is extremely complex because the system operates on demand, but it is essential to overcoming this exclusion and making tangible, real-world value possible.

After spending much of 2017 collecting data in South Africa’s major metropolitan regions, and refining our technical and operational approach in the process, we’ve completed our largest project to date. WhereIsMyTransport’s Data Collection Coordinator Claire Enslin explains:

“The informally run public transport network in Gauteng expands beyond the administrative boundaries of its cities. We always collect data from the functional informally run network, ensuring accurate and reliable information.”

Running a Data Collection Project in Gauteng

Through online services and work with local project partners, we recruited and trained a team of data collectors. Working with people who live in the area and regularly use local systems gave us a team with extensive knowledge of Gauteng’s informally run public transport, able to offer valuable qualitative input throughout the collection process.

We developed our own collection technologies so that reliable data could be captured efficiently. Our bespoke mobile application, built in-house for the unique nature of informally run public transport, collects route data and metadata, including on- and off-peak timings, common stopping points, fares, and frequency. Capturing this information in a purpose-built application made collection efficient and reduced the risk of human error.
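As an illustration only, the kind of record described above could be modelled along the following lines. This is a minimal sketch, not the actual schema of our application: the class names, field names, and units (`RouteRecord`, `fare_zar`, headways in minutes) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Stop:
    """A common stopping point (hypothetical structure)."""
    name: str
    lat: float
    lon: float


@dataclass
class RouteRecord:
    """One collected route with its metadata (hypothetical fields)."""
    route_id: str
    origin: str
    destination: str
    fare_zar: float                 # assumed: a single fare in rand
    peak_headway_min: float         # assumed: minutes between departures, peak
    off_peak_headway_min: float     # assumed: minutes between departures, off-peak
    stops: List[Stop] = field(default_factory=list)
    trace: List[Tuple[float, float]] = field(default_factory=list)  # (lat, lon) points

    def departures_per_hour(self, peak: bool = True) -> float:
        """Convert a headway in minutes into a frequency per hour."""
        headway = self.peak_headway_min if peak else self.off_peak_headway_min
        return 60.0 / headway


# Made-up values, purely to show the shape of a record.
example = RouteRecord("R1", "Rank A", "Rank B", 15.0, 5.0, 20.0)
```

Storing frequency as a headway keeps peak and off-peak values directly comparable; a real collection tool may well record observed departure times instead.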

For maximum accuracy, all data collected in Gauteng was verified through both human and computational means. During daily alignment meetings with our collection team, we planned for the day ahead and held one-to-one sessions to go through each collector’s tracking data in detail. In these sessions, we validated all route information and discussed important exceptions, including unidirectional routes, unusual days of operation, or any piece of information which could be flagged as an error.
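The computational side of such a review can be pictured as a simple pre-check that marks routes for discussion in the one-to-one sessions. The sketch below is an assumption about how that might look, not a description of our tooling: the dictionary keys (`trace`, `return_trace`, `operating_days`) and the flag names are hypothetical.

```python
WEEKDAYS = {"Mon", "Tue", "Wed", "Thu", "Fri"}


def flag_for_review(route):
    """Return a list of reasons a collected route should be discussed."""
    flags = []

    # No return leg recorded: possibly a unidirectional route.
    if not route.get("return_trace"):
        flags.append("unidirectional")

    # Does not run on every weekday: unusual days of operation.
    days = set(route.get("operating_days", []))
    if not WEEKDAYS <= days:
        flags.append("unusual_days")

    # Fewer than two points cannot describe a route: likely an error.
    if len(route.get("trace", [])) < 2:
        flags.append("short_trace")

    return flags


# Made-up records, purely for illustration.
typical = {
    "trace": [(0, 0), (1, 1)],
    "return_trace": [(1, 1), (0, 0)],
    "operating_days": ["Mon", "Tue", "Wed", "Thu", "Fri"],
}
odd = {"trace": [(0, 0)], "return_trace": [], "operating_days": ["Sat"]}
```

A flag here is only a prompt for conversation: a route that runs in one direction or only on Saturdays may be perfectly valid, which is why the human review matters.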

Our Data Collection Coordinator Graeme Leighton explains that “we meet with our data collection team every day, talking to them about their overall experience – not just the data they’ve collected.” Enslin adds, “we understand the usefulness of this qualitative input and value its role in our data collection service.”

Data is further validated by our backend tools, which clean the data and ‘snap’ it to the actual road network. Our approach ensures that all active routes are collected, resulting in data of the highest possible standard.
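To give a sense of what ‘snapping’ means, the sketch below moves each recorded point to the nearest position on a road drawn as a line of connected segments. It is a deliberately simplified illustration and not our backend: it assumes flat (planar) coordinates and a single road, whereas production map matching works on a full road network and typically accounts for direction and the order of points. The function names are hypothetical.

```python
from math import hypot


def snap_point_to_segment(p, a, b):
    """Return the closest point to p on the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return a  # degenerate segment: both ends are the same point
    # Position of the projection along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_trace_to_road(trace, road):
    """Snap every point in a GPS trace to the nearest point on the road."""
    snapped = []
    for p in trace:
        candidates = [
            snap_point_to_segment(p, road[i], road[i + 1])
            for i in range(len(road) - 1)
        ]
        nearest = min(candidates, key=lambda q: hypot(q[0] - p[0], q[1] - p[1]))
        snapped.append(nearest)
    return snapped


# A made-up L-shaped road and a slightly noisy trace alongside it.
road = [(0, 0), (10, 0), (10, 10)]
trace = [(2, 1), (11, 5), (-3, 0.5)]
snapped = snap_trace_to_road(trace, road)
```

In this example the three noisy points land on the road at (2, 0), (10, 5) and (0, 0) respectively.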

For Gauteng, our industry-leading project management and technical processes meant that on-the-ground data collection of over 2,800 unique routes, totalling 44,000 km in length, took less than four weeks. After running data collection projects in all of South Africa’s major metropolitan regions in the last year, our technologies, methodologies and team mean we’re equipped to efficiently collect public transport data in cities of any size.

The Potential of Public Transport Data Access in Gauteng and Beyond

The potential of this information starts to be realised once the data collection process is complete. For Gauteng’s cities and the other regions collected by WhereIsMyTransport, comprehensive data reflecting the on-the-ground reality of informally run public transport is now available.

The value of this data is significant. It reveals centres of activity, making data-driven decisions possible. Location data can be integrated into existing tools, providing the basis for intelligence, insights and analysis. Organisations can increase both their value to users and their geographical reach by integrating locally relevant information – such as the location of routes or stops – in digital services like maps, on-demand transport, or journey planning.

On this milestone, our CEO Devin de Vries says:

“This data collection milestone is significant not just because of the task of digitalising public transport information in an urban mega region, but because it means that unparalleled public transport data is now available from all of South Africa’s major metropolitan regions. The expansive data reach of WhereIsMyTransport is the best in industry, making us a reliable partner for individuals and organisations looking to add value through locally relevant information in African cities.”

We’re working with partners who share our belief in the potential of this data for Gauteng’s cities and beyond. Could your business benefit from the unique opportunity of public transport data from South Africa’s cities?
