Semantic indexing of multimedia content using textual and visual information
The main challenge in multimedia information retrieval remains the indexing process, which is an active research area. There are three fundamental techniques for indexing multimedia content: those using textual information, those using low-level information, and those that combine different kinds of information extracted from the multimedia content. Each approach has its own advantages and disadvantages for improving multimedia retrieval systems. Recent work is oriented towards multimodal approaches. In this paper we propose an approach that combines the surrounding text with information extracted from the visual content of multimedia, both represented in the same repository, so that multimedia content can be queried by keywords or concepts. Each word contained in a query or in the description of a multimedia item is disambiguated using the WordNet ontology in order to determine its semantic concept. Support Vector Machines (SVMs) are used to classify each image into one of the defined semantic concepts, based on SIFT (Scale-Invariant Feature Transform) descriptors.
multimedia retrieval; automatic annotation; semantic representation; multimodal image representation; SVM; SIFT; textual querying.
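To make the word-to-concept step concrete, the following is a minimal sketch of gloss-overlap disambiguation (a simplified Lesk heuristic). It is an illustration only and not necessarily the disambiguation procedure used in this work: the `SENSES` dictionary is a small hand-written stand-in for WordNet, and the names `SENSES`, `tokenize`, `disambiguate`, and the concept labels are hypothetical.

```python
import re

# Hypothetical miniature sense inventory standing in for WordNet.
# Each ambiguous word maps to candidate concepts, each with a short gloss.
SENSES = {
    "jaguar": {
        "jaguar.animal": "large spotted wild cat living in american forests",
        "jaguar.car": "british luxury car brand producing fast vehicles",
    },
    "bank": {
        "bank.river": "sloping land beside a river or lake",
        "bank.finance": "financial institution that accepts deposits and lends money",
    },
}

STOPWORDS = {"a", "an", "the", "of", "in", "on", "or", "and", "that", "to", "is"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def disambiguate(word, context):
    """Pick the concept whose gloss shares the most words with the context.

    Returns None when the word is not in the sense inventory.
    Ties are resolved in favour of the first listed concept.
    """
    candidates = SENSES.get(word.lower())
    if not candidates:
        return None
    context_words = set(tokenize(context)) - {word.lower()}
    best, best_overlap = None, -1
    for concept, gloss in candidates.items():
        overlap = len(context_words & set(tokenize(gloss)))
        if overlap > best_overlap:
            best, best_overlap = concept, overlap
    return best
```

Under these assumptions, the same function can be applied both to the words of a query and to the text surrounding an image, so that both end up expressed as concepts in a shared vocabulary; for example, `disambiguate("jaguar", "the new jaguar is a fast luxury car")` selects the car concept because three gloss words overlap with the context.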