This paper describes the approach proposed by the D2KLab team for the 2020 RecSys Challenge, whose task was to predict user engagement with tweets. The approach relies on two distinct stages. First, relevant features are learned from the challenge dataset. These features are heterogeneous and come from different modules: handcrafted features, knowledge graph embeddings, sentiment analysis features, and BERT word embeddings. Second, these features are provided as input to an ensemble system based on XGBoost. This approach, trained on only a subset of the entire challenge dataset, ranked 22nd on the final leaderboard.
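The two stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature-group names, the toy values, the vector sizes, and the hyperparameters are all assumptions chosen for illustration, and a single binary classifier stands in for the ensemble mentioned in the abstract. Stage 1 is reduced to concatenating the outputs of the feature modules; stage 2 fits an XGBoost classifier on the resulting vectors.

```python
from typing import Dict, List, Sequence

# Hypothetical feature-group names mirroring the modules listed in the abstract.
FEATURE_GROUPS = ("handcrafted", "kg_embedding", "sentiment", "bert_embedding")


def build_feature_vector(groups: Dict[str, Sequence[float]]) -> List[float]:
    """Stage 1 output: concatenate per-module features in a fixed order."""
    vector: List[float] = []
    for name in FEATURE_GROUPS:
        if name not in groups:
            raise KeyError(f"missing feature group: {name}")
        vector.extend(float(v) for v in groups[name])
    return vector


def train_engagement_model(rows, labels):
    """Stage 2: fit one gradient-boosted binary classifier on the vectors."""
    # Imported lazily so stage 1 can be used without these packages installed.
    import numpy as np
    from xgboost import XGBClassifier

    # Assumed hyperparameters; the abstract does not specify any.
    model = XGBClassifier(n_estimators=100, max_depth=6, learning_rate=0.1)
    model.fit(np.asarray(rows, dtype=float), np.asarray(labels, dtype=int))
    return model


# Toy values only; real embeddings would have hundreds of dimensions.
example = build_feature_vector({
    "handcrafted": [120, 1],
    "kg_embedding": [0.12, -0.40],
    "sentiment": [0.8],
    "bert_embedding": [0.05, 0.33, -0.21],
})
print(len(example))  # 8
```

Keeping a fixed group order means every tweet-user pair maps to a vector with the same column layout, which tree-based models such as XGBoost require.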
Two stages approach for tweet engagement prediction
Technical Report, CoRR abs/2008.10419, 24 August 2020
PERMALINK : https://www.eurecom.fr/publication/6324