Spatio-temporal data mining (STDM) has become crucial in multimedia, driven by the surge of multimodal data from remote sensing, IoT sensors, social media, surveillance systems, mobile devices, and crowdsourced platforms.
Traditional single-modal methods, though successful, struggle to capture this real-world complexity. Integrating multiple modalities yields richer and more accurate insights, enabling more powerful spatio-temporal analysis.
This half-day tutorial, MM4ST: Multimodal Learning for STDM, offers a comprehensive overview of the area, covering STDM fundamentals, the challenges of aligning and fusing heterogeneous data, advanced multimodal modeling techniques, and emerging research directions.
Attendees will acquire practical knowledge to develop scalable and robust spatio-temporal mining solutions. All materials will be publicly available online.
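As a concrete illustration of the kind of multimodal fusion the tutorial covers, the minimal sketch below (our illustrative example, not tutorial material; the module name, feature dimensions, and inputs are assumptions) fuses pre-extracted image features with a sensor time series to produce a simple spatio-temporal forecast in PyTorch.

```python
import torch
import torch.nn as nn

class LateFusionSTModel(nn.Module):
    """Toy late-fusion model: encodes a visual modality (e.g., pre-extracted
    satellite-image features) and a sensor time series separately, then
    concatenates the two embeddings to predict a spatio-temporal target."""

    def __init__(self, img_feat_dim=512, sensor_dim=8, hidden_dim=64, horizon=12):
        super().__init__()
        # Visual branch: project image features into a shared embedding space.
        self.img_encoder = nn.Sequential(
            nn.Linear(img_feat_dim, hidden_dim), nn.ReLU()
        )
        # Temporal branch: GRU over the sensor sequence (batch, time, sensor_dim).
        self.ts_encoder = nn.GRU(sensor_dim, hidden_dim, batch_first=True)
        # Fusion head: concatenate both embeddings and regress the next `horizon` steps.
        self.head = nn.Linear(2 * hidden_dim, horizon)

    def forward(self, img_feats, sensor_seq):
        img_emb = self.img_encoder(img_feats)          # (batch, hidden_dim)
        _, h_n = self.ts_encoder(sensor_seq)           # h_n: (1, batch, hidden_dim)
        fused = torch.cat([img_emb, h_n[-1]], dim=-1)  # (batch, 2 * hidden_dim)
        return self.head(fused)                        # (batch, horizon)

# Example usage with random tensors standing in for real, aligned multimodal inputs.
model = LateFusionSTModel()
img_feats = torch.randn(4, 512)      # e.g., CNN features of a region's imagery
sensor_seq = torch.randn(4, 24, 8)   # e.g., 24 hourly readings from 8 sensors
pred = model(img_feats, sensor_seq)  # (4, 12) forecast
```

A real pipeline would replace the random tensors with spatially and temporally aligned imagery and sensor data, and would likely use attention-based fusion, but the overall structure of per-modality encoders feeding a fusion head is the same.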
| Time | Speaker | Title |
|---|---|---|
| 11:00 am - 11:10 am | Roger Zimmermann | Opening and Introduction |
| 11:10 am - 11:20 am | Qingsong Wen | Background of Multimodal Learning and Spatio-temporal Data |
| 11:20 am - 12:00 pm | Siru Zhong | Multimodal Learning Techniques for Spatio-temporal Data |
| 12:00 pm - 12:30 pm | Yuxuan Liang | Applications and Future Directions |