Jethro's Braindump

Transformer

Transformers are model architectures that have proven effective across a range of domains, such as Natural Language Processing and Computer Vision.

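These models are characterized by the self-attention mechanism, in which each element of the input sequence computes a weighted combination of all elements in the same sequence.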
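To make the idea concrete, here is a minimal sketch of single-head self-attention in plain Python. It assumes the scaled dot-product form, which is the most common variant but is not specified in this note. The names `self_attention`, `w_q`, `w_k`, and `w_v` are illustrative, and the sketch leaves out multi-head attention, masking, and learned parameters.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (assumed variant).

    x:   sequence of token vectors, shape (n, d_model)
    w_q, w_k, w_v: hypothetical projection matrices, shape (d_model, d_k)
    Returns (output, weights), where weights[i][j] is how much
    token i attends to token j.
    """
    # Queries, keys, and values are all projections of the same input.
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])

    # Similarity of every query with every key: scores = Q K^T.
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)

    # Scale by sqrt(d_k), then normalize each row into attention weights.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]

    # Each output vector is a weighted average of the value vectors.
    return matmul(weights, v), weights
```

Calling `self_attention(x, w_q, w_k, w_v)` on a sequence of `n` vectors returns `n` output vectors and an `n` by `n` weight matrix whose rows each sum to 1. The "self" in the name comes from the queries, keys, and values all being derived from the same input `x`.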