High Performance Natural Language Processing

We need high-performance natural language processing to scale up NLP systems in production.

There are several approaches to speeding up NLP systems.

Knowledge Distillation

TODO DistilBERT

TODO MobileBERT

Making Transformer Models Efficient

Pruning

Pruning removes “unimportant” weights from a network. For example, “Optimal Brain Damage” and “Optimal Brain Surgeon” decide which weights to prune using second-order derivatives of the loss.
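A minimal sketch of the second-order idea, in the style of Optimal Brain Damage: each weight gets a saliency of 0.5 * h_ii * w_i^2, where h_ii is the matching diagonal entry of the loss Hessian, and the lowest-saliency weights are zeroed. The function names, the toy values, and the assumption that the Hessian diagonal is already available are illustrative, not taken from any particular library or paper implementation.

```python
import numpy as np


def obd_saliency(weights, hessian_diag):
    """OBD-style saliency: estimated loss increase from zeroing each weight,
    using only the diagonal of the Hessian (assumed to be given)."""
    return 0.5 * hessian_diag * weights ** 2


def prune_lowest_saliency(weights, saliency, fraction):
    """Return a copy of `weights` with the lowest-saliency `fraction` set to zero."""
    n_prune = int(fraction * weights.size)
    pruned = weights.copy()
    if n_prune > 0:
        # axis=None sorts over the flattened array, so any shape works.
        idx = np.argsort(saliency, axis=None)[:n_prune]
        pruned.flat[idx] = 0.0
    return pruned


# Toy values (hypothetical): a large weight with low curvature and a
# small weight with high curvature.
weights = np.array([1.0, -0.1, 2.0, 0.5])
hessian_diag = np.array([1.0, 100.0, 0.01, 1.0])

saliency = obd_saliency(weights, hessian_diag)
pruned = prune_lowest_saliency(weights, saliency, fraction=0.5)
print(saliency)
print(pruned)
```

In this toy case the weight 2.0 is pruned even though it has the largest magnitude, because its curvature is tiny, while the small weight -0.1 survives because its curvature is large. Plain magnitude pruning would make the opposite choice.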

TODO Lottery Ticket Hypothesis in Transformer (Brix et al., 2020)

Bibliography

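Brix, C., Bahar, P., & Ney, H. (2020). Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.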

Jethro's Braindump