|Proceedings of the 2nd ACM International Workshop on Geotagging and Its Applications in Multimedia
|ACM, New York, NY, USA
|Pascal Kelm, Sebastian Schmiedeke, Jaeyoung Choi, Gerald Friedland, Venkatesan Nallampatti Ekambaram, Kannan Ramchandran, Thomas Sikora
|A Novel Fusion Method for Integrating Multiple Modalities and Knowledge for Multimodal Location Estimation
|This article describes a novel fusion approach for multiple modalities and knowledge sources that improves the accuracy of multimodal location estimation algorithms. The problem of "multimodal location estimation", or "placing", consists of associating geo-locations with consumer-produced multimedia data, such as videos or photos, that have not been tagged using GPS. Our algorithm integrates the visual and textual modalities with external geographical knowledge bases by building a hierarchical model that combines data-driven and semantic methods to group visual and textual features into geographical regions. We evaluate our algorithm on the MediaEval 2010 Placing Task data set and show that our system significantly outperforms the state of the art, locating about 40% of the videos within a radius of 100 m.
|multimedia analysis, geo-tagging, hierarchical segmentation, multimodal location estimation, centroid-based fusion
|isbn = 978-1-4503-2391-8
acmid = 2509238