Jethro's Braindump

Multi-modal Machine Learning

Definitions

modality
the way in which something happens or is experienced; typically associated with sensory modalities.
multi-modal models
models that can process and relate information from multiple modalities (e.g. speech and vision).

Key Challenges

representation
learning how to represent and summarize multimodal data in a way that exploits the complementarity and redundancy of multiple modalities (see Multi-modal Representation)
translation
how to map data from one modality to another (see Multi-modal Translation)
alignment
identifying the direct relationships between sub-elements of two or more different modalities (see Multi-modal Alignment)
fusion
joining information from two or more modalities to perform a prediction, possibly with data missing from some modalities. Different modalities may have varying predictive power and noise topology (see Multi-modal Fusion)
co-learning
transferring knowledge between modalities, their representations, and their predictive models (see Co-learning)
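
To make the fusion challenge concrete, here is a minimal sketch of one possible approach, late (decision-level) fusion, in which each modality produces its own class probabilities and the results are combined by a weighted average. The function name `late_fusion`, the modality names, and the weights are all hypothetical and chosen for illustration; the weights stand in for the "varying predictive power" of each modality, and a missing modality is marked with `None` and skipped.

```python
def late_fusion(probs, weights):
    """Weighted average of per-modality class probabilities.

    probs:   dict mapping modality name -> list of class probabilities,
             or None when that modality is missing for this sample.
    weights: dict mapping modality name -> non-negative reliability weight
             (an assumption of this sketch; in practice it might be
             estimated from validation accuracy).
    """
    # Keep only the modalities that were actually observed.
    available = {m: p for m, p in probs.items() if p is not None}
    if not available:
        raise ValueError("at least one modality must be present")

    # Renormalise over the observed modalities so the output still sums to 1.
    weight_sum = sum(weights[m] for m in available)
    if weight_sum <= 0:
        raise ValueError("weights of the available modalities must sum to > 0")

    n_classes = len(next(iter(available.values())))
    return [
        sum(weights[m] * p[k] for m, p in available.items()) / weight_sum
        for k in range(n_classes)
    ]


# Hypothetical example: vision is trusted three times as much as speech.
weights = {"speech": 1.0, "vision": 3.0}
both = late_fusion({"speech": [0.6, 0.4], "vision": [0.2, 0.8]}, weights)
speech_only = late_fusion({"speech": [0.6, 0.4], "vision": None}, weights)
```

When both modalities are present the more heavily weighted one dominates the fused prediction; when vision is missing the result falls back to the speech prediction alone. This is only one design choice: early fusion (combining features before prediction) handles the same problem differently and is not shown here.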
