|Title||Image-Language Multimodal Corpora: needs, lacunae and an AI synergy for annotation|
|Publication Type||Conference Papers|
|Year of Publication||2004|
|Authors||Pastra, K, Wilks, Y|
|Conference Name||Proceedings of the 4th Language Resources and Evaluation Conference|
The growing demand for intelligent multimedia systems has led to the development of various multimodal resources, along with corresponding annotation schemes and processing tools. In this paper, we argue that there is a striking lack of multimodal corpora capturing the association and interaction of visual and linguistic data. We relate this research lacuna to vision-language integration prototypes developed within Artificial Intelligence (AI) and show how the needs of the latter dictate the development of such resources for a wide variety of applications. We identify the annotation requirements imposed on image-language corpora by these needs and by the nature of the modalities involved, and suggest a semi-automatic way of meeting them.