11-877 AMML

Multimodal machine learning (MMML) is a vibrant multi-disciplinary research field that addresses some of the original goals of artificial intelligence by integrating and modeling multiple communicative modalities, including language, vision, and acoustics. The field poses unique challenges for researchers, given the heterogeneity of the data and the contingency often found between modalities. This graduate-level course covers recent research papers in multimodal machine learning, including technical challenges in representation, alignment, reasoning, generation, co-learning, and quantification. Its main goal is to build critical thinking skills, knowledge of recent technical achievements, and an understanding of future research directions.

Time: Tuesday & Thursday 2:00pm-3:20pm
Location: WEH 4709
Discussion and Q&A: Piazza
Assignment submissions: Canvas (for registered students only)
Contact: Please ask all course-related questions on Piazza, where announcements are also posted.

Advanced Topics in MultiModal Machine Learning

11-877 • Spring 2024 • Carnegie Mellon University

Announcements