Jethro's Braindump


Co-learning aids the modeling of a (resource-poor) modality by exploiting knowledge from another (resource-rich) modality. The helper modality is used only during model training and is absent at test time. (Baltrušaitis et al., 2017)

Parallel-data approaches require observations from one modality to be directly linked to observations in the other modalities. Non-parallel approaches do not require such direct links. Hybrid-data approaches bridge the modalities through a shared modality or dataset.

Parallel data

Co-training creates additional labeled training samples when only a few are available in a multimodal problem: a weak classifier is built for each modality, and the classifiers bootstrap each other by labeling the unlabeled data.
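A minimal sketch of this bootstrapping loop, using two synthetic feature views in place of real modalities (the data, view names, and confidence threshold are illustrative assumptions, not from the survey):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic two-view data: each "modality" is a noisy view of the same label.
n_labeled, n_unlabeled = 20, 200
y = np.tile([0, 1], (n_labeled + n_unlabeled) // 2)
view_a = y[:, None] + rng.normal(0, 0.8, (len(y), 5))  # e.g. visual features
view_b = y[:, None] + rng.normal(0, 0.8, (len(y), 5))  # e.g. audio features

labeled = np.arange(n_labeled)
unlabeled = np.arange(n_labeled, n_labeled + n_unlabeled)
y_known = y[labeled].copy()

clf_a, clf_b = LogisticRegression(), LogisticRegression()

# Each round, the classifier trained on one view pseudo-labels the unlabeled
# examples it is most confident about; those pseudo-labels grow the shared
# training set, so each view's classifier bootstraps the other.
for _ in range(5):
    clf_a.fit(view_a[labeled], y_known)
    clf_b.fit(view_b[labeled], y_known)
    for clf, view in ((clf_a, view_a), (clf_b, view_b)):
        if len(unlabeled) == 0:
            break
        proba = clf.predict_proba(view[unlabeled])
        pick = proba.max(axis=1).argsort()[-10:]  # most confident points
        labeled = np.concatenate([labeled, unlabeled[pick]])
        y_known = np.concatenate([y_known, proba[pick].argmax(axis=1)])
        unlabeled = np.delete(unlabeled, pick)
```

After the loop, both classifiers have been trained on far more (pseudo-)labels than the 20 we started with.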

Transfer learning exploits co-learning with parallel data by building multimodal representations of which only some modalities are needed at test time. Approaches include multimodal Deep Boltzmann Machines and multimodal autoencoders.
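To make the "helper modality only at training time" idea concrete, here is a linear stand-in for a multimodal autoencoder (the toy data and the choice of a least-squares reconstruction are my own assumptions; real systems use deep nonlinear encoders):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data: both modalities are noisy linear views of a shared latent factor.
n = 500
Wa, Wb = rng.normal(size=(2, 6)), rng.normal(size=(2, 4))
z = rng.normal(size=(n, 2))
A = z @ Wa + rng.normal(0, 0.1, (n, 6))  # e.g. visual features
B = z @ Wb + rng.normal(0, 0.1, (n, 4))  # e.g. audio features (helper)

# Training uses both modalities: a linear "autoencoder" that reconstructs
# the concatenation [A, B] from A alone. B appears only as a reconstruction
# target here, so it never has to be observed at test time.
M, *_ = np.linalg.lstsq(A, np.hstack([A, B]), rcond=None)

# Test time: a new sample arrives with the visual modality only, yet the
# model still produces an estimate of the missing audio features.
z_new = rng.normal(size=(1, 2))
a_new = z_new @ Wa + rng.normal(0, 0.1, (1, 6))
b_hat = (a_new @ M)[:, 6:]
```

Because both views share a latent factor, the reconstruction recovers the helper modality well, which is exactly what the learned multimodal representation is exploited for.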

Non-parallel data

Non-parallel methods only require that the different modalities share similar categories or concepts. Methods include transfer learning with coordinated multimodal representations, concept grounding via word similarity, and zero-shot learning.
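Zero-shot learning illustrates the non-parallel case well: a map from image features into a word-embedding space, trained only on seen classes, can label a class it has never seen an image of. A sketch under strong simplifying assumptions (the embeddings are made up, and image features are generated as a linear function of them so that a linear map exists to be learned):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-d "word embeddings" for class names (illustrative only).
word_emb = {
    "cat":   np.array([1.0, 0.0, 0.0]),
    "dog":   np.array([0.0, 1.0, 0.0]),
    "horse": np.array([0.0, 0.0, 1.0]),
    "truck": np.array([0.6, 0.6, 0.0]),  # unseen during training
}
seen = ["cat", "dog", "horse"]

# Simulated image features: a fixed linear function of the class embedding
# plus noise, so the image-to-text relation is learnable.
A = rng.normal(size=(3, 5))
def image_of(cls):
    return word_emb[cls] @ A + rng.normal(0, 0.05, 5)

X = np.vstack([image_of(c) for c in seen for _ in range(50)])
T = np.vstack([word_emb[c] for c in seen for _ in range(50)])

# Fit the linear map image-space -> embedding-space on seen classes only.
W, *_ = np.linalg.lstsq(X, T, rcond=None)

def classify(x):
    z = x @ W  # project the image feature into the word-embedding space
    return max(word_emb, key=lambda c: z @ word_emb[c]
               / (np.linalg.norm(z) * np.linalg.norm(word_emb[c]) + 1e-12))
```

`classify(image_of("truck"))` returns the nearest class embedding by cosine similarity, so a truck image can be labeled even though no truck image appeared in training — the two modalities are bridged purely through the shared concept space.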


Baltrušaitis, T., Ahuja, C., & Morency, L. (2017). Multimodal machine learning: A survey and taxonomy. CoRR.
