TwitPersonality: Computing personality traits from tweets using word embeddings and supervised learning

Carducci, Giulio; Rizzo, Giuseppe; Monti, Diego; Palumbo, Enrico; Morisio, Maurizio
Information, Vol.9, N°5, May 2018

We are what we do, like, and say. Numerous research efforts have been pushed towards the automatic assessment of personality dimensions relying on a set of information gathered from social media platforms such as list of friends, interests of musics and movies, endorsements and likes an individual has ever performed. Turning this information into signals and giving them as inputs to supervised learning approaches has resulted in being particularly effective and accurate in computing personality traits and types. Despite the demonstrated accuracy of these approaches, the sheer amount of information needed to put in place such a methodology and access restrictions make them unfeasible to be used in a real usage scenario. In this paper, we propose a supervised learning approach to compute personality traits by only relying on what an individual tweets about publicly. The approach segments tweets in tokens, then it learns word vector representations as embeddings that are then used to feed a supervised learner classifier. We demonstrate the effectiveness of the approach by measuring the mean squared error of the learned model using an international benchmark of Facebook status updates. We also test the transfer learning predictive power of this model with an in-house built benchmark created by twenty four panelists who performed a state-of-the-art psychological survey and we observe a good conversion of the model while analyzing their Twitter posts towards the personality traits extracted from the survey.

Data Science
Eurecom Ref: