Field of Study:
Knowledge Distillation
Knowledge Distillation (KD) is a technique in which a smaller, simpler model (the student) is trained to mimic the behavior of a larger, more complex model (the teacher), with the aim of transferring the teacher's knowledge to the student. This is particularly useful in NLP, where deploying large models is computationally expensive. Because the student is smaller, it is cheaper to deploy, yet it can retain much of the teacher's performance by learning from the teacher's outputs; a minimal sketch of the standard training objective follows the entry below.
Synonyms:
KD
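In its most common form (soft-label distillation, following Hinton et al., 2015), the student is trained on a weighted mix of the ordinary cross-entropy loss against the ground-truth labels and a KL-divergence loss that pulls the student's temperature-softened output distribution toward the teacher's. The sketch below is illustrative only: the model names, the temperature, and the weighting alpha are assumptions for the example, not details taken from this entry.

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a hard-label loss with a soft-label loss that pushes the
    student's output distribution toward the teacher's."""
    # Hard-label term: ordinary cross-entropy against the ground truth.
    hard_loss = F.cross_entropy(student_logits, labels)

    # Soft-label term: KL divergence between temperature-softened
    # distributions; the temperature**2 factor keeps the gradient scale
    # of this term comparable to the hard-label term.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    return alpha * hard_loss + (1.0 - alpha) * soft_loss

# Illustrative usage inside a training step; `teacher` and `student` are
# assumed to be classifiers that return logits, with the teacher frozen.
# with torch.no_grad():
#     teacher_logits = teacher(input_ids)
# student_logits = student(input_ids)
# loss = distillation_loss(student_logits, teacher_logits, labels)
# loss.backward()

Raising the temperature flattens both distributions, so the teacher's relative confidences across incorrect classes carry a usable training signal for the student rather than collapsing onto the single top prediction.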