Multimodal Deep Learning for Robust Road Attribute Detection

Title:
Multimodal Deep Learning for Robust Road Attribute Detection
Journal Title:
ACM Transactions on Spatial Algorithms and Systems
Publication Date:
02 September 2023
Citation:
Yin, Y., Hu, W., Tran, A., Zhang, Y., Wang, G., Kruppa, H., Zimmermann, R., & Ng, S.-K. (2023). Multimodal Deep Learning for Robust Road Attribute Detection. ACM Transactions on Spatial Algorithms and Systems, 9(4), 1–25. https://doi.org/10.1145/3618108
Abstract:
Automatic inference of missing road attributes (e.g., road type and speed limit) for enriching digital maps has attracted significant research attention in recent years. A number of machine learning-based approaches have been proposed to detect road attributes from GPS traces, dash-cam videos, or satellite images. However, existing solutions mostly focus on a single modality without modeling the correlations among multiple data sources. To bridge this gap, we present a multimodal road attribute detection method that improves robustness by performing pixel-level fusion of crowdsourced GPS traces and satellite images. A GPS trace is usually given as a sequence of locations, bearings, and speeds. To align it with satellite imagery in the spatial domain, we render GPS traces into a sequence of multi-channel images that simultaneously capture, at each pixel, the global distribution of the GPS points, the local distribution of vehicles’ moving directions and speeds, and their changes over time. Unlike previous GPS-based road feature extraction methods, our proposed GPS rendering does not require map matching in the data preprocessing step. Moreover, our multimodal solution addresses single-modal challenges such as occlusions in satellite images and data sparsity in GPS traces by learning the pixel-wise correspondences among different data sources. In addition, we observe that geographic objects and their attributes in the map are not isolated but correlated with each other. Thus, if a road is partially labeled, the existing information can be of great help in inferring the missing attributes. To make full use of this information, we extend our model and discuss the possibilities for further performance improvement when partially labeled map data is available. Extensive experiments have been conducted on two real-world datasets from Singapore and Jakarta. Compared with previous work, our method improves the detection accuracy of road attributes by a large margin.
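The GPS rendering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `render_gps_points`, the tile `bounds` convention, the choice of channels (point count, mean speed, bearing histogram), and the number of bearing bins are all assumptions made for illustration. The temporal dimension (a sequence of such images) and the learned fusion with satellite pixels are not covered here.

```python
def render_gps_points(points, bounds, height, width, n_bins=8):
    """Rasterize GPS points into per-pixel channels (illustrative only).

    points: iterable of (lat, lon, bearing_deg, speed) tuples.
    bounds: (lat_min, lat_max, lon_min, lon_max) of a north-up image tile.
    Returns (count, mean_speed, bearing_hist), each indexed as [row][col].
    """
    lat_min, lat_max, lon_min, lon_max = bounds
    count = [[0] * width for _ in range(height)]
    speed_sum = [[0.0] * width for _ in range(height)]
    bearing_hist = [[[0] * n_bins for _ in range(width)] for _ in range(height)]
    bin_width = 360.0 / n_bins

    for lat, lon, bearing, speed in points:
        # Skip points that fall outside the tile.
        if not (lat_min <= lat <= lat_max and lon_min <= lon <= lon_max):
            continue
        # Row 0 is the northern edge; clamp points lying exactly on the far edge.
        row = min(int((lat_max - lat) / (lat_max - lat_min) * height), height - 1)
        col = min(int((lon - lon_min) / (lon_max - lon_min) * width), width - 1)
        count[row][col] += 1
        speed_sum[row][col] += speed
        # Quantize the bearing (degrees clockwise from north) into n_bins sectors.
        sector = int((bearing % 360.0) / bin_width) % n_bins
        bearing_hist[row][col][sector] += 1

    # Mean speed per pixel; pixels without any GPS point stay at 0.0.
    mean_speed = [
        [speed_sum[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(width)]
        for r in range(height)
    ]
    return count, mean_speed, bearing_hist
```

Because each point is binned directly by its coordinates, a rendering of this kind needs no map matching: the channels share the pixel grid of the satellite tile, so a model can consume both sources at the same spatial locations.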
License type:
Publisher Copyright
Funding Info:
This research is supported by GrabTaxi Holdings Pte. Ltd. / National University of Singapore - Grab-NUS AI Lab
Grant Reference no.: N.A.

This research is supported by the Economic Development Board of Singapore - Industrial Postgraduate Program
Grant Reference no.: S18-1198-IPP-II

This research is supported by the Ministry of Education - Academic Research Fund Tier 2
Grant Reference no.: T2EP20221-0023
Description:
© Author | ACM 2023. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ACM Transactions on Spatial Algorithms and Systems, http://dx.doi.org/10.1145/3618108
ISSN:
2374-0353
2374-0361
Files uploaded:

tsas-si-sigspatial21-main-amended.pdf (PDF, 2.41 MB)