Latest version (in pdf) available on Piazza.

Course Format

Lectures will be held on Tuesdays and Thursdays at 10:10 AM. Each lecture will focus on a specific mathematical concept related to multimodal machine learning. These lectures will be given by the course instructor, a guest lecturer or a TA.

Canvas: We will be using Canvas for all course assignments. Students should be automatically added to the course Canvas site.

Zoom: We will use the Zoom video platform for the live lectures on Tuesdays and Thursdays. Links to the live lectures will be available on Canvas. Please make sure that your Internet connection and equipment are set up to use Zoom. During our class meetings, please keep your mic muted. If you have a question or want to answer a question, please use the chat or the “raise hand” feature (available when the participant list is pulled up). A TA will be monitoring these channels in order to share this information with the instructor. All course lectures will be audio/video recorded so that students can watch them asynchronously, or again later, if needed.

Piazza: We will be using Piazza for class communication and announcements. The system is designed to get you help quickly and efficiently from classmates, the TAs and the instructor. Rather than emailing questions to the teaching staff, you are encouraged to post your questions on Piazza. You can post privately to the instructor and TAs through the Piazza website. Piazza can be accessed from the course Canvas page, or directly at this URL:

Course Material


  • Reading material will be based on published technical papers available via the ACM/IEEE/Springer digital libraries or freely available online. All CMU students already have free access to these digital archives.
  • For project assignments, previous experience in Python programming is expected.


  • Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville, MIT Press, 2016 (freely available online)
  • The Handbook of Multimodal-Multisensor Interfaces, Sharon Oviatt, Bjoern Schuller, Philip R. Cohen, Daniel Sonntag, Gerasimos Potamianos and Antonio Kruger, Volumes 1, 2 and 3, 2017-2019 (available through CMU Library online)
  • Machine Learning for Audio, Image and Video Analysis: Theory and Applications, Francesco Camastra and Alessandro Vinciarelli, Springer, 2008, DOI: 10.1007/978-1-84800-007-0 (freely available on SpringerLink for CMU students)
  • Multimodal Processing and Interaction, Gros, Potamianos and Maragos, SpringerLink, 2008, DOI: 10.1007/978-0-387-76316-3 (freely available on SpringerLink for CMU students)
  • Multimodal Signal Processing: Theory and Applications for Human-Computer Interaction, Jean-Philippe Thiran, Ferran Marqués and Hervé Bourlard, Academic Press, ISBN: 978-0-12-374825-6

Project Assignments and Timeline

(See Piazza for additional information)

  • Dataset preferences (Due on Tuesday 9/6 at 8pm ET) – Let us know your preferences for the datasets that you would like to use for the course project. This will help with the team matching process.
  • Project Pre-proposal (Due on Wednesday 9/14 at 8pm ET) – You should have selected your teammates, dataset, and task. Submit a 1-page pre-proposal plan.
  • First assignment (Due Sunday 9/25 at 8pm ET) – This assignment requires a good literature review on the topic of your proposed project.
  • Second assignment (Due Sunday 10/9 at 8pm ET) – This assignment focuses on unimodal representations.
  • Midterm assignment (Report is due Sunday 10/30 at 8pm ET and presentations are scheduled Tuesday and Thursday 11/1 and 11/3) – Students are asked to implement and evaluate state-of-the-art baseline models on their project dataset and perform error analysis.
  • Final assignment (Presentations are planned Tuesday 12/6 and Thursday 12/8; reports are due Sunday 12/11 at 8pm ET) – Students should explore new ideas to model their multimodal research project.


Remember: If you registered for this class, you have until November 9th to change your grade in this course from a letter grade to a Pass/Fail grade.

  • Grading breakdown
    • Lecture participation and highlights: 16%
    • Reading assignments: 16%
    • Course project assignments
      • Project preferences and pre-proposal: 2%
      • First project assignment: 10%
      • Second project assignment: 10%
      • Mid-term report and presentation: 20%
      • Final report and presentation: 30%
  • Lecture participation and highlights
    • Lectures can be attended live (using Zoom) or watched later. Students are encouraged to attend lectures live as often as possible, to allow them to ask live clarification questions, if needed. Some lectures will also contain some live survey questions.
    • While watching the lecture (either live or recorded video), students are required to fill out a form where they include their main takeaways from the lecture (aka highlights).
    • The form should be submitted the same day as the lecture, before 11:59pm ET. For example, if the lecture was scheduled on Tuesday, then the highlight form is due on the same Tuesday at 11:59pm ET.
    • Students need to use the provided online template for the highlight form. This form was designed for two main purposes: (1) help students take active notes during lectures, and (2) offer students the opportunity to ask questions about the content of the lectures.
      • The highlight form will contain 3 sub-sections, one for each of the 30-minute segments of the lecture (the last segment may be shorter).
      • For each sub-section, students are asked to include a short statement summarizing the main takeaway message of the past segment.
      • Optionally, students can also write down a question (with the corresponding slide number) related to the segment.
    • Students' questions will be reviewed by the TAs and instructor. The most popular questions will be answered on Piazza, or with extra information during the following lecture. Students are always welcome to post questions directly on Piazza at any time if they would like clarifications or have a follow-up question.
    • These highlight forms will not be required for the first week and for the Thanksgiving week. Also, no forms are expected for weeks when a project assignment (first, midterm or final) is due. We expect about 20 lectures where highlight forms need to be submitted.
    • Each submitted form is worth 1.0 point. The top 16 scores will count toward the final lecture participation grade.
  • Reading assignments
    • Reading assignments are designed to complement the lectures and showcase recent state-of-the-art research. Most reading assignments will consist of 2 or 3 research papers, sometimes accompanied by optional readings. The list of research papers will be released at the latest on the Monday of each week.
    • To encourage exchange of ideas and knowledge between students, each student will be part of one study group. A study group consists of 9-10 students. These groups will be randomly created, to encourage diversity in these groups. Each study group will have its own discussion forum to ask questions and share ideas.
    • The reading assignments consist of two main parts: (a) submission of a discussion post summarizing the paper you read that week (see more details below), and (b) active participation in the follow-up discussions, including at least 2 extra posts.
      • For each reading assignment, each student is required to read only one research paper (out of all assigned papers). Students need to write a summary statement for their paper and post it in the discussion forum before Friday 8pm ET. These summary posts will allow other study group members to learn about the papers they did not read directly, and possibly ask follow-up questions.
    • Following this Friday deadline, students are expected to read at least one summary for each paper they did not originally read and write a follow-up post to discuss similarities and differences with the paper they read and suggest some follow-up ideas. One follow-up post should be created for each paper the student did not read. For most reading assignments, the total number of paper options is 3, making the number of follow-up posts equal to 2.
    • Reading assignments will be released weekly, except when a project assignment (first, midterm or final) is due the same week. Also, there will be no reading assignments during the first week and during Thanksgiving week. We expect 10 reading assignments during the semester.
    • Each reading assignment is worth 2.0 points: 1.0 point for the paper summary and 1.0 point for the short discussion essay.
  • Course project assignments:
    • The goal of the course project is to experiment with state-of-the-art multimodal algorithms and computational models.
    • Students should form teams of 3 to 5 students, preferably (special approval will be required for larger teams; no smaller teams will be allowed). The size and depth of the project should be adjusted to reflect the size of the team.
    • Each team is required to create a code repository (GitHub) for their project. All project members should be included in this repository and should actively use it. It is important that all team members participate equally in the project. The first project assignment and follow-up reports (midterm and final reports) will need to outline the tasks of each student. If any team member has concerns about the participation level of other members, they should contact the instructor and/or TA as promptly as possible.
    • Students have flexibility in the selection of their project topic. The project should be directly aligned with the course content and include at a minimum two modalities, preferably language and vision. At the beginning of the semester, the instructor will propose a set of research problems and datasets which can be used for the course projects.
    • Pre-proposal: We ask students to prepare a pre-proposal early in the semester to help them establish their research topic for the course project. The pre-proposal will also help with team formation, in the rare event that students are still looking for teammates.
    • First project assignment: The first project assignment consists of a written report and an oral presentation (which will be performed remotely, with pre-recorded videos). This assignment has two main goals: describe in more detail the plan for the course project (aka “proposal”) and perform some unimodal analyses on the multimodal dataset and problem.
      • Peer feedback: Following the submission of the proposal video recordings, students will be asked to watch the videos and share feedback. Each student will be assigned a subset of videos to watch. Feedback will also be given by the instructor and TAs.
    • Midterm project assignment: The midterm project assignment is designed to implement multimodal baseline models and perform some error analysis on these results. This assignment also has two components: written report and oral presentation. By the submission time for the midterm assignment, students should have already implemented some of the state-of-the-art baseline models for their selected multimodal task and dataset.
      • Peer feedback: As with the first assignment, students will be asked to watch the midterm presentation videos and share feedback. Each student will be assigned a subset of videos to watch. Feedback will also be given by the instructor and TAs.
    • Final project assignment: Using the same dataset and task selected for the midterm report, the final project assignment is designed to explore new research ideas. This assignment is not graded based on the quality of the results, but instead on the exploration of new ideas (e.g., better accuracy results will not mean better course grade). Students are encouraged to explore new research directions. The final project assignment also contains written and oral components.
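
As an illustration of the lecture participation rule described above (each highlight form is worth 1.0 point and only the top 16 scores are kept), here is a minimal Python sketch. The function name `participation_points` and the sample scores are hypothetical and are not part of any official course tooling; the sketch only shows how keeping the best 16 forms out of roughly 20 lectures works out.

```python
def participation_points(form_scores, keep=16):
    """Sum the highest `keep` highlight-form scores.

    Assumes each form score lies between 0.0 and 1.0, as stated in the
    syllabus; `keep=16` mirrors the "top 16 scores" rule.
    """
    best = sorted(form_scores, reverse=True)[:keep]
    return sum(best)


# Hypothetical semester: 20 lectures, 14 full-credit forms,
# 3 forms with a 0.5-point late deduction, and 3 missed forms.
scores = [1.0] * 14 + [0.5] * 3 + [0.0] * 3
print(participation_points(scores))  # 14 * 1.0 + 2 * 0.5 = 15.0
```

With about 20 graded lectures, a student can therefore miss up to four forms and still reach the full 16 points.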

Note about late submissions

In general, submitting assignments on time lets the instructional team provide feedback in a more timely and efficient manner. Timely submissions are particularly important for assignments with discussions and peer feedback, such as the reading assignments and the project assignments. Also, it is expected that students will attend the lectures in person. Live attendance allows students to participate in the discussions and ask any clarification questions.

Medical-related absences: If, for medical reasons, you require some extra time for an assignment or may not be able to attend the lecture in person, please contact the instructor and the TAs as soon as possible (the best option is usually via Piazza) and we will help define a new plan that aligns with your constraints.

Late submission wildcards: We offer students and project teams some late submission wildcards to help deal with potential overlaps with other courses or research deadlines. The details are given below.

Reading assignments and lecture highlights

  • Each student will receive six (6) late submission wildcards, to be used individually.
  • Each wildcard can be used to extend the deadline up to 24 hours from the original time.
  • No partial credit for a wildcard (e.g., it is not possible to use only half of a wildcard, as two extensions of 12 hours).
  • These wildcards can be used together (for a total of 48 hours), or separately (2 separate extensions of 24 hours).
  • There is no need to send a note via Piazza about these wildcards. We will automatically apply your wildcards to your first two late submissions.
  • For any other late submission (beyond the two wildcards), 0.5 points will be deducted.
  • If you must submit beyond 72 hours past the due date, please contact the instructor and the TAs as soon as possible so we can properly plan.
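
The wildcard rules above can be sketched in a few lines of Python. This is only one possible reading of the policy, not the grading procedure actually used by the teaching staff: the function name `resolve_late_submission` is hypothetical, the number of remaining wildcards is passed in as a parameter, and the sketch assumes that a late submission which cannot be fully covered by the remaining wildcards consumes none of them and receives the 0.5-point deduction instead.

```python
import math


def resolve_late_submission(hours_late, wildcards_left, penalty=0.5):
    """Return (wildcards_used, points_deducted) for one submission.

    Each wildcard covers a full 24-hour block; partial use is not possible,
    so the number of wildcards needed is rounded up.
    Assumption: if the remaining wildcards cannot cover the delay,
    none are consumed and the fixed penalty applies.
    """
    if hours_late <= 0:
        return 0, 0.0  # on time: nothing to resolve
    needed = math.ceil(hours_late / 24)
    if needed <= wildcards_left:
        return needed, 0.0
    return 0, penalty


print(resolve_late_submission(5, 2))   # (1, 0.0): one wildcard covers up to 24 hours
print(resolve_late_submission(30, 2))  # (2, 0.0): two wildcards used together
print(resolve_late_submission(30, 1))  # (0, 0.5): not enough wildcards left
```

Submissions more than 72 hours late are outside the scope of this sketch, since the policy asks students to contact the instructor and TAs in that case.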

Project assignments

  • Each team will get two (2) wild cards, to be used with any of the project assignment deadlines.
  • Each wild card allows the team to submit their assignment late for up to 24 extra hours.
  • These wild cards can be used together (for a total of 48 hours), or separately (2 separate extensions of 24 hours).
  • Each wild card can be used for any of these 4 deadlines:
    • First assignment deadline
    • Second assignment deadline
    • Midterm report deadline
    • Final report deadline
  • No partial credits for the wild cards (e.g., you cannot use only 40% of a wild card).
  • Each team needs to send a message to the instructor and TAs BEFORE the deadline to notify them of its intent to use a wild card (or two).

Accommodations for Students with Disabilities

If you have a disability and have an accommodations letter from the Disability Resources office, I encourage you to discuss your accommodations and needs with me as early in the semester as possible. I will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, I encourage you to contact them at

Statement on Student Wellness

This semester is unlike any other. We are all under a lot of stress and uncertainty at this time. Attending Zoom classes all day can take its toll on our mental health. Make sure to move regularly, eat well, and reach out to your support system or me if you need to. We can all benefit from support in times of stress, and this semester is no exception.

As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, feeling down, difficulty concentrating and/or lack of motivation. These mental health concerns or stressful events may diminish your academic performance and/or reduce your ability to participate in daily activities. CMU services are available, and treatment does work. You can learn more about confidential mental health services available on campus at: Support is always available (24/7) from Counseling and Psychological Services: 412-268-2922.

Diversity statement

Every individual must be treated with respect. The ways we are diverse are many and are fundamental to building and maintaining an equitable and inclusive campus community. These include but are not limited to: race, color, national origin, sex, disability, age, sexual orientation, gender identity, religion, creed, ancestry, belief, veteran status, or genetic information. We at CMU will work to promote diversity, equity and inclusion not only because it is necessary for excellence and innovation, but because it is just. Therefore, while we are imperfect, we all need to fully commit to work, both inside and outside of our classrooms, to increase our commitment to build and sustain a campus community that embraces these core values.

It is the responsibility of each of us to create a safer and more inclusive environment. Incidents of bias or discrimination, whether intentional or unintentional in their occurrence, contribute to creating an unwelcoming environment for individuals and groups at the university. If you experience or observe unfair or hostile treatment on the basis of identity, we encourage you to speak out for justice and support in the moment and/or share your experience using the following resources:

All reports will be acknowledged, documented, and a determination will be made regarding a course of action. All experiences shared will be used to transform the campus climate to be more equitable and just.


Reading lists from Fall 2018, Fall 2019, Fall 2020 and Fall 2021 courses are available on Piazza:


The reading list for Fall 2022 semester will also be shared directly with students and posted on the course website.