References
- Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov, 2019, RoBERTa: A Robustly Optimized BERT Pretraining Approach: https://arxiv.org/abs/1907.11692
- Hugging Face Tokenizer documentation: https://huggingface.co/transformers/main_classes/tokenizer.html?highlight=tokenizer
- The Hugging Face reference notebook: https://colab.research.google.com/github/huggingface/blog/blob/master/notebooks/01_how_to_train.ipynb
- The Hugging Face reference blog post: https://huggingface.co/blog/how-to-train
- More on BERT: https://huggingface.co/transformers/model_doc/bert.html
- More on DistilBERT: https://arxiv.org/pdf/1910.01108.pdf
- More on RoBERTa: https://huggingface.co/transformers/model_doc/roberta.html
- Even more on DistilBERT: https://huggingface.co/transformers/model_doc/distilbert.html