You're reading from  Developing Kaggle Notebooks

Product type: Book
Published in: Dec 2023
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781805128519
Edition: 1st Edition
Author: Gabriel Preda

Dr. Gabriel Preda is a Principal Data Scientist for Endava, a major software services company. He has worked on projects in various industries, including financial services, banking, portfolio management, telecom, and healthcare, developing machine learning solutions for business problems such as risk prediction, churn analysis, anomaly detection, task recommendations, and document information extraction. He is also very active in competitive machine learning: he currently holds the title of three-time Kaggle Grandmaster and is well known for his Kaggle Notebooks.

Transformer-based solution

When the competition ran, BERT and several other Transformer models were already available, and a few high-scoring solutions based on them were published. We will not attempt to replicate them here; instead, we point out the most accessible implementations.

In Reference 20, Qishen Ha combines several solutions, including BERT-Small V2, BERT-Large V2, XLNet, and GPT-2 (models fine-tuned on the competition data and included as datasets), to obtain a private leaderboard score of 0.94656 as a late submission. That score would place in the top 10, which is both gold medal and prize territory for this competition.
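To make the idea of combining several models concrete, here is a minimal sketch of one common blending technique: a weighted average of per-model predicted probabilities. This is an illustration only and is not taken from Reference 20, which may combine its models differently. The function name `blend_predictions`, the model keys, the probabilities, and the weights are all hypothetical.

```python
import numpy as np


def blend_predictions(predictions, weights):
    """Return the weighted average of per-model probability arrays.

    predictions: dict mapping a model name to a sequence of probabilities
    weights:     dict mapping the same model names to non-negative weights
    """
    names = list(predictions)
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Shape (n_models, n_samples)
    stacked = np.stack(
        [np.asarray(predictions[name], dtype=float) for name in names]
    )
    # Normalize the weights so that they sum to 1
    w = np.array([weights[name] for name in names], dtype=float) / total
    # Contract over the model axis, leaving one value per sample
    return np.tensordot(w, stacked, axes=1)


# Hypothetical predicted probabilities for three test samples (made-up values)
model_predictions = {
    "bert_small": [0.10, 0.80, 0.40],
    "bert_large": [0.20, 0.90, 0.30],
    "xlnet": [0.15, 0.70, 0.50],
    "gpt2": [0.05, 0.60, 0.40],
}
# Hypothetical weights; in practice these would be tuned on validation data
model_weights = {"bert_small": 1.0, "bert_large": 2.0, "xlnet": 1.0, "gpt2": 1.0}

blended = blend_predictions(model_predictions, model_weights)
print(blended)
```

Averaging tends to help when the individual models make partly uncorrelated errors, which is one reason a blend of different architectures can score above any single member.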

A solution using only the BERT-Small model (see Reference 21) yields a private leaderboard score of 0.94295, and one using the BERT-Large model (see Reference 22) yields 0.94388. As late submissions, both would land in the silver medal zone, at around places 130 and 80, respectively, on the private leaderboard.
