Jethro's Braindump

High Performance Natural Language Processing

We need high-performance natural language processing to scale up NLP systems in production.

There are several approaches to speeding up NLP systems.

Knowledge Distillation

TODO DistilBERT

TODO MobileBERT
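Pending the notes above, the core idea behind knowledge distillation (a small "student" trained to match a large "teacher") can be sketched as a soft-target loss: both models' logits are softened with a temperature T, and the student minimizes the KL divergence to the teacher's distribution, scaled by T², following Hinton et al.'s formulation. This is a minimal NumPy sketch, not the DistilBERT training recipe itself.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: higher T yields a softer distribution,
    # exposing the teacher's "dark knowledge" about non-target classes.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so gradients keep a comparable magnitude across T.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    return (temperature ** 2) * kl.mean()
```

In practice this soft-target term is combined with the ordinary hard-label cross-entropy on the student.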

Making Transformer Models Efficient


Pruning removes “unimportant” weights from a network. For example, prune based on second-order derivatives of the loss, as in “Optimal Brain Damage” and “Optimal Brain Surgeon”.
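The Optimal Brain Damage criterion can be sketched concretely: with a diagonal Hessian approximation, the saliency of weight w_i is s_i = H_ii · w_i² / 2 (the estimated loss increase from zeroing it), and the lowest-saliency weights are pruned. The sketch below assumes a diagonal Hessian estimate is already available; computing it is a separate step.

```python
import numpy as np

def obd_saliency(weights, hessian_diag):
    # Optimal Brain Damage saliency: s_i = H_ii * w_i^2 / 2,
    # the second-order estimate of the loss increase if w_i is removed.
    return 0.5 * hessian_diag * weights ** 2

def prune_by_saliency(weights, hessian_diag, sparsity=0.5):
    # Zero out the fraction `sparsity` of weights with the lowest saliency.
    s = obd_saliency(weights, hessian_diag).flatten()
    k = int(sparsity * s.size)
    mask = np.ones(s.size, dtype=bool)
    if k > 0:
        mask[np.argsort(s)[:k]] = False
    return (weights.flatten() * mask).reshape(weights.shape)
```

Simple magnitude pruning is the special case where the Hessian diagonal is taken as constant, so saliency reduces to |w_i|.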

TODO Lottery Ticket Hypothesis in Transformers (Brix et al., 2020)
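As a placeholder for the TODO above, one round of the lottery-ticket procedure (iterative magnitude pruning with weight rewinding) can be sketched: after training, the smallest-magnitude surviving weights are masked out, and the survivors are reset to their original initialization before retraining. Function and parameter names here are illustrative, not from the cited paper.

```python
import numpy as np

def lottery_ticket_round(init_weights, trained_weights,
                         prune_fraction=0.2, mask=None):
    # One iterative-magnitude-pruning round with rewinding:
    # 1. among currently surviving weights, find the smallest magnitudes
    # 2. extend the mask to prune them
    # 3. rewind survivors to their ORIGINAL initialization
    if mask is None:
        mask = np.ones(init_weights.shape, dtype=bool)
    surviving_mags = np.abs(trained_weights[mask])
    k = int(prune_fraction * surviving_mags.size)
    if k > 0:
        threshold = np.sort(surviving_mags)[k - 1]
        mask = mask & (np.abs(trained_weights) > threshold)
    return init_weights * mask, mask
```

Repeating this round (train, prune, rewind) several times yields the sparse "winning ticket" subnetwork.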
