Further reading
- Vladislav Mosin, Igor Samenko, Alexey Tikhonov, Borislav Kozlovskii, and Ivan P. Yamshchikov, 2021, Fine-Tuning Transformers: Vocabulary Transfer: https://arxiv.org/abs/2112.14569
- Yi Tay, Mostafa Dehghani, Jinfeng Rao, William Fedus, Samira Abnar, Hyung Won Chung, Sharan Narang, Dani Yogatama, Ashish Vaswani, and Donald Metzler, 2022, Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers: https://arxiv.org/abs/2109.10686