NLP-KG

Publication:

To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks

Matthew E. Peters, Sebastian Ruder, Noah A. Smith • Workshop on Representation Learning for NLP • 14 March 2019

TL;DR: Empirical results across diverse NLP tasks with two state-of-the-art models show that the relative performance of fine-tuning vs. feature extraction depends on the similarity of the pretraining and target tasks.

Citations: 393
Abstract: While most previous work has focused on different pretraining objectives and architectures for transfer learning, we ask how to best adapt the pretrained model to a given target task. We focus on the two most common forms of adaptation, feature extraction (where the pretrained weights are frozen), and directly fine-tuning the pretrained model. Our empirical results across diverse NLP tasks with two state-of-the-art models show that the relative performance of fine-tuning vs. feature extraction depends on the similarity of the pretraining and target tasks. We explore possible explanations for this finding and provide a set of adaptation guidelines for the NLP practitioner.
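The abstract's two adaptation modes are easy to make concrete. Below is a minimal PyTorch sketch (not from the paper): a hypothetical pretrained encoder is wrapped with a new task head, and a single flag switches between feature extraction (encoder weights frozen) and fine-tuning (all weights updated). Names such as TaskClassifier and make_optimizer are illustrative only.

    import torch
    import torch.nn as nn

    class TaskClassifier(nn.Module):
        """Wraps a pretrained encoder with a task-specific linear head."""

        def __init__(self, encoder: nn.Module, hidden_dim: int, num_labels: int,
                     feature_extraction: bool = True):
            super().__init__()
            self.encoder = encoder
            self.head = nn.Linear(hidden_dim, num_labels)  # newly initialized task layer
            if feature_extraction:
                # Feature extraction: freeze the pretrained weights so that only
                # the task head receives gradient updates.
                for param in self.encoder.parameters():
                    param.requires_grad = False

        def forward(self, inputs: torch.Tensor) -> torch.Tensor:
            # Assumes the encoder returns (batch, seq_len, hidden_dim) representations.
            features = self.encoder(inputs)
            return self.head(features.mean(dim=1))  # mean-pool, then classify

    def make_optimizer(model: nn.Module, lr: float = 1e-4) -> torch.optim.Adam:
        # Fine-tuning is the same wrapper with feature_extraction=False; in both
        # cases only parameters with requires_grad=True are passed to the optimizer.
        trainable = [p for p in model.parameters() if p.requires_grad]
        return torch.optim.Adam(trainable, lr=lr)

Which of the two settings works better is exactly the question the paper studies; the guidelines depend on how similar the pretraining and target tasks are.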
