AI & LiDAR: Paving the Way for Smarter Road Repairs

Author: Stu Feeser

How LiDAR works

  1. Pulsed Laser: LiDAR systems emit rapid pulses of laser light at a surface. Some of this light is reflected back to the LiDAR sensor, where it is detected and measured.

  2. Time of Flight: By calculating the time it takes for each laser pulse to bounce back to the sensor, LiDAR can measure the distance each light pulse travels. This process is known as “time of flight” measurement.

  3. 3D Models: Millions of these distance measurements are collected in a short period and then used to generate detailed three-dimensional models of the Earth’s surface and features.
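The time-of-flight step above reduces to a one-line formula: distance = (speed of light × round-trip time) / 2, since the pulse travels out and back. A minimal sketch (the pulse time is a made-up example value):

```python
# Convert a LiDAR pulse's round-trip time into a distance to the surface.
C = 299_792_458.0  # speed of light, meters per second

def time_of_flight_distance(round_trip_seconds: float) -> float:
    """The pulse travels to the surface and back, so the one-way
    distance is half the total path length (c * t)."""
    return C * round_trip_seconds / 2.0

# Example: a pulse returning after 100 nanoseconds covered ~29.98 m
# round trip, putting the surface about 14.99 m away.
print(round(time_of_flight_distance(100e-9), 2))
```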

Applications in Road Surface Examination

  • Surface Analysis: LiDAR can provide 3D images of road surfaces, identifying wear and tear, potholes, and other surface anomalies that need repair.

  • Safety Improvements: By mapping road features with high precision, LiDAR helps in designing safer roads by identifying potentially hazardous conditions not easily visible to the human eye.

Advantages of LiDAR in Transportation

  • Precision: LiDAR provides highly accurate measurements, which are crucial for the detailed analysis required in road maintenance and infrastructure management.
  • Efficiency: Data collection is rapid, allowing large areas to be covered in a short amount of time, minimizing disruptions to traffic and daily operations.
  • Versatility: LiDAR can be deployed from various platforms, including drones, airplanes, and ground vehicles, making it adaptable to different scales and scopes of projects.

Managing miles upon miles of LiDAR images to determine where road work is needed is an incredibly challenging and labor-intensive task for engineers. The stark truth is that without AI, this process can be daunting, inefficient, and prone to human error, for several compelling reasons:

Complexity of Analysis

The data collected by LiDAR is complex and multidimensional. Each point in a LiDAR dataset contains information on position, elevation, and sometimes reflectivity. Extracting meaningful insights, such as the condition of the road surface, requires sophisticated analysis that considers the spatial relationships and characteristics of these points.
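One way to picture those multidimensional points is as records of position, elevation, and reflectivity, with defect detection as a query over them. The field names and thresholds below are illustrative, not an official LiDAR point format:

```python
from dataclasses import dataclass

@dataclass
class LidarPoint:
    x: float          # meters along the lane
    y: float          # meters across the lane
    elevation: float  # meters relative to the road baseline
    intensity: int    # 8-bit reflectivity value, 0-255

# Four samples taken every 0.5 mm of forward motion; one shows a 4 mm dip.
scan = [
    LidarPoint(0.0000, 1.0, 0.000, 130),
    LidarPoint(0.0005, 1.0, 0.000, 128),
    LidarPoint(0.0010, 1.0, -0.004, 90),   # possible crack
    LidarPoint(0.0015, 1.0, 0.000, 131),
]

# Flag anything more than 2 mm below the baseline as a crack candidate.
suspects = [p for p in scan if p.elevation < -0.002]
print(len(suspects))  # 1 suspect point
```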

Precision and Accuracy

Identifying road defects like cracks requires precision and accuracy. Human analysts may vary in their assessments, and fatigue can lead to inconsistencies and oversight. Moreover, the precise quantification of defects (such as exact measurements of cracks) is critical for planning repairs and maintenance. Achieving this level of detail manually across vast datasets is practically impossible.


Timeliness

Road conditions can deteriorate quickly, and timely identification of issues is crucial to maintaining safety and preventing minor issues from developing into major problems. Manual analysis is slow and cannot keep pace with the rapid assessments required to prioritize and address road maintenance effectively.


Cost

Manual inspection and analysis of road conditions are cost-prohibitive at scale. The labor costs of extensive manual reviews, combined with the potential for delayed identification of critical issues (leading to more expensive repairs), make manual processes economically inefficient.

The Role of AI

AI, particularly machine learning models trained on LiDAR data, can automate the detection and classification of road defects. These systems can process vast quantities of data far more quickly than human analysts, with consistent precision and accuracy. AI can work around the clock without fatigue, ensuring timely analysis. By identifying exactly where maintenance is required, AI-enabled systems can help prioritize repairs based on severity and urgency, optimizing the allocation of limited road maintenance resources.

In essence, AI transforms an insurmountable task into a manageable one, enabling proactive road maintenance that can save money, time, and lives. The application of AI in processing LiDAR images for road maintenance is not just a matter of convenience or efficiency—it’s a necessity for modern infrastructure management.

Now, some math to compare SWin and ViT Transformers

According to this source (page 93), imaging systems must be capable of producing images in which a 1 mm crack is visible when confirmed at slow speed, and a 3 mm crack at speeds exceeding 60 mph. I will use the 1 mm crack detection rule and assume the scanning width is a 4-meter-wide lane. By the Nyquist theorem, reliably detecting a 1 mm crack requires sampling every 0.5 mm. Therefore, each road scan line across the 4-meter lane must contain 8,000 points, captured every 0.5 mm of forward motion. A good rule of thumb: imagine you are watching a brand-new 8K TV screen perfectly showing one lane of road surface; it would take about 745 frames to view one mile. Modern TVs can do at least 120 fps, so that mile of road surface would scroll by in about 6.2 seconds. But the way that AI would have to view the same thing is: 1,609 meters/mile × 2,000 scan lines per meter × 8,000 points per line ≈ 25.7 billion points. If each point is encoded as follows:

  • 8 bits: Common for intensity values.
  • 16 bits: Typical for elevation or distance measurements, providing a good balance between data size and precision.
  • 32 bits (floating point): Used for high-precision measurements.
  • 24 bits (for RGB data): Used if color information is captured.
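Putting the sampling figures and the per-point encodings together gives a feel for the raw data volume. A back-of-the-envelope sketch using the numbers above:

```python
# Back-of-the-envelope data volume for one mile of a single lane.
METERS_PER_MILE = 1609
POINTS_ACROSS_LANE = 8000   # 4 m lane sampled every 0.5 mm
LINES_PER_METER = 2000      # one scan line every 0.5 mm of travel

points_per_mile = METERS_PER_MILE * LINES_PER_METER * POINTS_ACROSS_LANE
print(f"{points_per_mile:,} points per mile")  # 25,744,000,000

# Storage at the common per-point encodings listed above:
for label, bits in [("8-bit intensity", 8),
                    ("16-bit elevation", 16),
                    ("24-bit RGB", 24),
                    ("32-bit float", 32)]:
    gigabytes = points_per_mile * bits / 8 / 1e9
    print(f"{label}: {gigabytes:,.1f} GB per mile")
```

Even the leanest 8-bit encoding is tens of gigabytes per lane-mile, which is why manual review of this data does not scale.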

To demonstrate the computational efficiency difference between Vision Transformers (ViT) and SWin Transformers using the provided data, we can look at their computational complexity, particularly focusing on their scaling with respect to the input size (i.e., the number of patches).


  • Total initial patches for an 8K (7680 × 4320) image, using the standard 16 × 16 pixel patch size: (7680 / 16) × (4320 / 16) = 480 × 270 = 129,600 patches.

ViT Computational Complexity: O(N²)

For ViT, the computational complexity is generally O(N²) with respect to the number of input patches N, due to the self-attention mechanism, which computes interactions between all pairs of patches.

SWin Transformer Computational Complexity: O(N)

SWin Transformers, by contrast, introduce an efficient shifted-windowing scheme that significantly reduces computational complexity. While the exact complexity varies with implementation details, the hierarchical design and local window-based attention lead to roughly linear scaling with the number of patches, which can be approximated as O(N) for large-scale inputs.
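To see where the linear scaling comes from, count pairwise attention interactions: ViT compares every patch with every other patch, while window-based attention only compares patches inside each window. A rough proxy, assuming a 7 × 7 window (the size used in the original SWin Transformer paper) and ignoring the shifted-window overlap:

```python
# Rough attention-cost proxy: pairwise patch interactions per layer.
N = 129_600  # total patches in the 8K frame
M = 7        # patches per window side, so M*M patches per window

global_pairs = N * N                          # ViT: all-pairs attention
num_windows = N // (M * M)                    # SWin: attention stays in-window
windowed_pairs = num_windows * (M * M) ** 2   # all pairs, but only per window

print(f"Global attention:   {global_pairs:,} interactions")
print(f"Windowed attention: {windowed_pairs:,} interactions")
```

Because the per-window cost (M² squared) is a constant, the windowed total grows linearly with N, which is the source of the O(N) behavior cited above.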

Calculating and Comparing Computational Costs

To compare ViT and SWin in terms of computational costs for the given scenario, we’ll compute a simplified model of their computational cost based on the number of patches.

  • For ViT: the cost can be represented as N², where N = 129,600.
  • For SWin: the cost is linear, so we represent it as N.

Let’s calculate the relative computational cost to demonstrate the efficiency gain with SWin. Since the actual computational cost involves more factors (like dimensionality of the model, specifics of implementation, etc.), we’ll focus on the scaling behavior as a proxy for computational efficiency.

For the given scenario of processing an 8K (7680 × 4320) image:

  • The approximate computational cost for ViT, following O(N²) complexity, is 129,600² = 16,796,160,000 operations.
  • For the SWin Transformer, with O(N) complexity, the cost is 129,600 operations.

Comparatively, the ViT model would require approximately 129,600 times more computational operations than the SWin Transformer for the same initial number of patches. This stark difference illustrates the potential efficiency gains with SWin Transformers for large-scale inputs, and highlights why SWin can be far more economical in computational resources when processing high-resolution images.
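The scaling comparison above can be reproduced in a few lines. This is the same simplified proxy used in the text: it deliberately ignores model dimensionality, window size, and constant factors:

```python
# Simplified scaling proxy: cost in "operations" as a function of patch count.
N = 129_600  # patches in an 8K frame at 16x16 pixels per patch

vit_cost = N ** 2   # global self-attention: every patch attends to every patch
swin_cost = N       # window-based attention: roughly linear in patch count

print(f"ViT : {vit_cost:,} operations")      # 16,796,160,000
print(f"SWin: {swin_cost:,} operations")     # 129,600
print(f"Ratio: {vit_cost // swin_cost:,}x")  # 129,600x
```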

As I conclude this exploration of the transformative role of AI and LiDAR in road maintenance, it’s crucial to remember a key insight: not all vision models are created equal. The comparison above shows stark differences between SWin and ViT Transformers, with computational efficiency and adaptability to the vast landscapes of LiDAR data setting them apart, and it underscores the importance of selecting the right AI tools for the monumental task of infrastructure analysis and maintenance. The combination of AI and LiDAR is not merely a technological upgrade; it is a paradigm shift toward smarter, proactive road care that promises to redefine our approach to infrastructure health. By harnessing the strengths of advanced vision models like SWin, we open the road to a future where maintenance is not just reactive but predictively mapped out with precision, ensuring our paths are safer and more durable for everyone. “AI & LiDAR: Paving the Way for Smarter Road Repairs” marks the beginning of this exciting transition, steering us toward smoother, more reliable roads in the era of smart infrastructure.