
Curating high-quality data for the training and validation of ADAS and AV models

Discover how TELUS Digital used our proven field operations testing (FOT) experience to create a high-quality dataset for training advanced driver assistance systems (ADAS) and autonomous vehicles (AV).


12TB
data captured daily

7,500km
approximate total distance covered

4
project duration in weeks

The challenge

Training and validating the artificial intelligence behind critical ADAS and AV features requires large volumes of high-quality data from real-world driving scenarios. Obtaining datasets that offer the maximum coverage of routes within tier one cities, however, can be a challenge that’s both costly and time-consuming for developers. Knowing that multiple customers require this data, TELUS Digital embarked on a detailed mapping project to collect and curate high-quality data for training and validating ADAS and AV models.

The TELUS Digital solution

The greater Los Angeles area served as the core location for our data collection project. To create a diverse dataset, we drew on our extensive FOT experience across the following areas:

  • Vehicle procurement and preparation: We secured two vehicles, ensuring they met all regulatory and compliance requisites, and equipped them with advanced cameras, sensors and data-acquisition systems.
  • Optimized route planning: Our drivers consistently drove less than 20 miles per hour for optimum scenario capture. They also drove in varying weather conditions and at different times of the day, split evenly between day and night. The routes covered a variety of driving scenarios: 70% highways, 20% urban and 10% suburban and rural environments.
  • Sensor synchronization: We maintained the vehicles’ sensors and synchronized them regularly using global navigation satellite system (GNSS) time servers. Every sensor reading was precisely timestamped, and the data was collected in a variety of formats for seamless integration into machine learning (ML) workflows.
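
To illustrate why precise timestamps matter downstream, here is a minimal Python sketch of one common way to pair readings from two sensors by nearest timestamp. It is not a description of our pipeline: the function names (`nearest_index`, `align`), the camera and lidar streams, and the 50 ms tolerance are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


def nearest_index(timestamps, t):
    """Return the index of the value in sorted `timestamps` closest to `t`."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick the closer of the two neighbours; ties go to the earlier one.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def align(camera_ts, lidar_ts, tolerance_s=0.05):
    """Pair each camera timestamp with the nearest lidar timestamp.

    Both lists hold seconds on a shared clock and must be sorted.
    Pairs further apart than `tolerance_s` (an assumed value) are dropped.
    """
    pairs = []
    if not lidar_ts:
        return pairs
    for cam_i, t in enumerate(camera_ts):
        lidar_i = nearest_index(lidar_ts, t)
        if abs(lidar_ts[lidar_i] - t) <= tolerance_s:
            pairs.append((cam_i, lidar_i))
    return pairs


# Hypothetical timestamps, in seconds on a shared clock.
camera = [0.00, 0.10, 0.20, 0.50]
lidar = [0.01, 0.09, 0.21, 0.30]
print(align(camera, lidar))  # [(0, 0), (1, 1), (2, 2)]
```

The last camera frame is left unpaired because no lidar reading falls within the tolerance. Matching of this kind only works when every sensor is stamped against the same time source.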

The results

Autonomous systems trained on our diverse annotated dataset can learn from and adapt to real-world situations. Our pilot dataset, which provided a high-density view of various driving conditions in a tier one city, was successfully deployed by a leading global automotive manufacturer. Some key highlights from the project included:

  • Vehicle-to-cloud data transfer time: 1 week
  • Daily data capture: 12 terabytes (TB)
  • Total distance covered: 7,500 km
  • Highway distance covered: 5,250 km
  • City distance covered: 1,500 km
  • Rural distance covered: 750 km
  • Project duration: 4 weeks
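
The distance figures above follow from the 70% / 20% / 10% route mix applied to the 7,500 km total. As a quick check, the short Python sketch below reproduces the arithmetic. The names `TOTAL_KM`, `ROUTE_MIX_PCT` and `split_distance` are hypothetical, and we assume the "city" figure corresponds to urban routes and the "rural" figure to suburban and rural routes.

```python
TOTAL_KM = 7_500

# Route mix in percent, taken from the route planning description.
ROUTE_MIX_PCT = {"highway": 70, "city": 20, "rural": 10}


def split_distance(total_km, mix_pct):
    """Split a total distance by percentage shares using integer arithmetic."""
    if sum(mix_pct.values()) != 100:
        raise ValueError("route mix percentages must sum to 100")
    return {name: total_km * pct // 100 for name, pct in mix_pct.items()}


breakdown = split_distance(TOTAL_KM, ROUTE_MIX_PCT)
print(breakdown)  # {'highway': 5250, 'city': 1500, 'rural': 750}
```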

Additional data collection projects to train AV models continue across high-demand cities in California, Michigan and Washington.


Check out our solutions

We can help with data collection and data creation for all of your machine learning needs.
