|Title||Viewing Vision-Language Integration as a Double-Grounding Case|
|Publication Type||Conference Papers|
|Year of Publication||2004|
|Conference Name||Proceedings of the AAAI Fall Symposium on "Achieving Human-Level Intelligence through Integrated Systems and Research"|
While vision-language integration is important for a wide range of Artificial Intelligence (AI) prototypes and applications, the notion of integration has not been established within a theoretical framework that would allow for more thorough research on the issue. In this paper, we explore the reasons that dictate this content integration by bringing together Searle's theory of intentionality, the symbol grounding problem, and arguments regarding the nature of images and language developed within different AI subfields. From this synthesis the Double-Grounding theory emerges, which provides an explanatory theoretical definition for vision-language integration. By correlating the need for vision-language integration with inherent characteristics of the integrated media, and by associating this need with an agent's intentionality and intelligence, the work presented in this paper aims to provide a theoretically established, and therefore solid, common ground for currently isolated and scattered multimedia integration research in AI subfields.