ICME 2026 Special Session

Graph-Driven Cognitive Intelligence for Multimedia Understanding and Interaction

Paper Submission Deadline: December 12, 2025
Conference Dates: 5 July - 9 July 2026
Venue: Bangkok, Thailand

We will host a special session at ICME 2026 titled “Graph-Driven Cognitive Intelligence for Multimedia Understanding and Interaction.” We warmly welcome your submissions!

This session builds upon our continued interest in advancing intelligent multimedia understanding. Following the previous special sessions, “Multi-modal Visual Information Learning and Representation” (ICME 2024) and “Graph Neural Networks for Multimedia Data Analysis” (ICME 2025), we hope to extend this line of inquiry toward cognitive reasoning and interactive intelligence. While the earlier sessions focused on perceptual representation and structured analysis, this session explores how these foundations may evolve toward higher-level multimedia cognition.

In alignment with the ICME 2026 theme, “Beyond Perception: Intelligent Media in the Age of Autonomous Agents,” the session examines how graph-based modeling and reasoning can contribute to multimedia systems with richer cognitive awareness, adaptive interaction, and more autonomous decision-making. By encouraging discussions that connect perception with cognition and link structural modeling with intelligent behavior, we hope this session can provide a timely forum for new ideas and emerging directions in next-generation multimedia intelligence.

Organizers

Xiangmin Han

Tsinghua University

(Area Chair)

Shaoyi Du

Xi’an Jiaotong University

Shihui Ying

Shanghai University

Mingxia Liu

The University of North Carolina at Chapel Hill

Session abstract:

Multimedia data, including images, videos, audio, and interactive content, is fundamental to today’s intelligent media ecosystem. Its complexity creates challenges for understanding, reasoning, and interaction because traditional learning paradigms often struggle to capture the cognitive and relational structures within such data. Graph-based and hypergraph-based learning provide effective frameworks for modeling high-order relationships and enabling structured, interpretable intelligence. Aligned with the ICME 2026 theme, this special session highlights emerging theories, algorithms, and applications that integrate graph-based reasoning, multimodal cognition, and adaptive interaction. It aims to promote discussion on cognitive graph models, interaction-aware architectures, and human-centered intelligent agents for next-generation multimedia understanding.

Topics of interest include but are not limited to:

  • Cognitive Graph Learning: Modeling high-order reasoning and semantic relationships across multimodal data.
  • Interactive Multimedia Understanding: Enabling adaptive perception–action cycles through graph-driven representations.
  • Agent-based Multimedia Intelligence: Integrating graph reasoning into autonomous and collaborative agents.
  • Cross-modal Graph Reasoning: Aligning vision, audio, and text modalities through structured cognitive graphs.
  • Knowledge-augmented Media Systems: Enhancing interpretability via graph-based knowledge integration.
  • Social and Behavioral Graph Analysis: Understanding user interaction and influence dynamics in intelligent media.
  • Graph Learning in Medical and Scientific Media: Capturing complex spatial, semantic, and relational patterns in high-dimensional multimodal biomedical data.

We propose this session to provide a dedicated platform for researchers and practitioners to discuss the next generation of graph-driven multimedia intelligence. Through the exchange of theoretical insights and practical advancements, participants will collectively advance the development of multimedia systems that move beyond perception toward cognitive understanding and interactive autonomy.