Publications

conference paper

Conference/Proceedings	Proceedings of the 2nd ACM International Conference on Multimedia Retrieval
Start date	05.06.2012
End date	08.06.2012
Address	New York, NY, USA
Editor	ACM
Volume	ICMR 2012
Pages	251-258
Author(s)	Schmiedeke, Sebastian and Kelm, Pascal and Sikora, Thomas
Title	Cross-Modal Categorisation of User-Generated Video Sequences
Abstract	This paper describes the possibilities of cross-modal classification of multimedia documents on social media platforms. Our framework predicts the user-chosen category of consumer-produced video sequences based on their textual and visual features. The text resources, which include metadata and automatic speech recognition transcripts, are represented as bags of words, and the video content is represented as a bag of clustered local visual features. We investigate the contribution of the different modalities and how they should be combined when sequences lack certain resources. To this end, several classification methods are evaluated on varying combinations of resources. The paper presents an approach that achieves a mean average precision of 0.3977 using user-contributed metadata in combination with clustered SURF features.
Key words	multimedia analysis, genre classification, web video classification, multimodal decision fusion
Note	isbn: 978-1-4503-1329-2; articleno: 25; numpages: 8; location: Hong Kong, China
DOI	10.1145/2324796.2324828
URL	http://dl.acm.org/citation.cfm?id=2324828