High Performance Natural Language Processing
We need high-performance natural language processing to scale up NLP systems in production.
There are several approaches to speeding up NLP systems.
Knowledge Distillation
TODO DistilBERT
TODO MobileBERT
Making Transformer Models Efficient
Pruning
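Pruning removes “unimportant” weights from a network. For example, “Optimal Brain Damage” and “Optimal Brain Surgeon” rank weights using second-order derivatives of the loss and prune the weights whose removal is estimated to increase the loss the least.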
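A minimal sketch of second-order pruning in the spirit of Optimal Brain Damage, assuming a diagonal Hessian approximation and the saliency 0.5 * h_ii * w_i^2. The function names (`obd_saliency`, `prune_lowest_saliency`) and the toy linear regression are illustrative assumptions, not taken from any particular library or paper implementation. For a mean-squared-error linear model the Hessian diagonal is available in closed form, which keeps the example self-contained.

```python
import numpy as np


def obd_saliency(weights, hessian_diag):
    """Estimated loss increase from zeroing each weight (diagonal Hessian)."""
    return 0.5 * hessian_diag * weights ** 2


def prune_lowest_saliency(weights, hessian_diag, fraction):
    """Zero out the given fraction of weights with the smallest saliency."""
    saliency = obd_saliency(weights, hessian_diag)
    n_prune = int(fraction * weights.size)
    mask = np.ones(weights.size, dtype=bool)
    if n_prune > 0:
        prune_idx = np.argsort(saliency)[:n_prune]
        mask[prune_idx] = False
    return weights * mask, mask


# Toy setup (assumption): linear regression where half the true weights are zero.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
true_w = np.array([2.0, 0.0, -1.5, 0.0, 0.5, 0.0])
y = X @ true_w + 0.01 * rng.normal(size=200)

# Fit by least squares so we are at a minimum of the loss, as OBD assumes.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# For loss = mean((X @ w - y)**2) / 2, the Hessian is X.T @ X / n,
# so its diagonal is the per-feature mean of squared inputs.
hessian_diag = (X ** 2).mean(axis=0)

pruned_w, mask = prune_lowest_saliency(w, hessian_diag, fraction=0.5)
print("kept weights:", mask)
print("pruned weight vector:", pruned_w)
```

In a real network the Hessian diagonal is not available in closed form and has to be estimated, and pruning is usually followed by retraining the remaining weights. Optimal Brain Surgeon differs in that it uses the full inverse Hessian and also adjusts the surviving weights.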
TODO Lottery Ticket Hypothesis in Transformer (Brix et al., 2020)
Bibliography
Brix, C., Bahar, P., and Ney, H. (2020). Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture. ACL 2020.