PROMID 2025, 17th Forum of Information Retrieval and Evaluation (FIRE), Shared Task: Misinformation Detection and Prompt Recovery, 17-20 December 2025, Varanasi, India
Misinformation on social media poses a significant challenge during major geopolitical events, where the rapid dissemination of misleading content can distort public understanding. PROMID 2025 Subtask 3 focuses on identifying misinformation in tweets related to the 2022 Russo-Ukrainian conflict, a task complicated by extreme class imbalance, multilingual content, and heterogeneous metadata. In the provided dataset, misinformation accounts for only 1.05% of all tweets, making it difficult for transformer-based models to learn generalizable patterns. To address this challenge, we evaluate two approaches, both built on the transformer-based RoBERTa-large model: a baseline model trained solely on the original PROMID dataset, and an augmented model that incorporates an additional 5,022 Ukraine-related misinformation tweets from the Fact-checking Observatory (FCO). Our results show that while the baseline model achieves high precision, it performs poorly in recall due to overfitting on the limited misinformation examples. In contrast, the augmented model substantially improves misinformation detection, increasing the F1-score from 0.4682 to 0.6967 and the weighted F1 from 0.8516 to 0.9059. Our findings demonstrate that targeted data augmentation is an effective strategy for mitigating severe class imbalance and enhancing generalization in misinformation detection tasks. This approach, listed as ClimateSense on the public leaderboard, ranked 1st on the final test set of PROMID 2025 Subtask 3. It is fully reproducible using the code at https://github.com/climatesense-project/promid2025-task3.
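The augmentation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' code (that is in the linked repository). The helper names `augment`, `positive_share`, and `f1` are hypothetical, and the toy dataset sizes and the precision/recall values are invented for illustration only. The sketch shows how appending extra positive examples raises the minority-class share, and why high precision combined with low recall yields a modest F1.

```python
from collections import Counter


def augment(train, extra_positives):
    """Return a new list: the original examples plus extra texts labeled 1."""
    return train + [(text, 1) for text in extra_positives]


def positive_share(examples):
    """Fraction of examples labeled 1 (misinformation)."""
    counts = Counter(label for _, label in examples)
    return counts[1] / len(examples)


def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical sizes): 2 positives among 200 tweets, about 1%.
train = [(f"tweet {i}", 1 if i < 2 else 0) for i in range(200)]
# Hypothetical external misinformation examples, all labeled positive.
extra = [f"external claim {i}" for i in range(20)]

augmented = augment(train, extra)

print(f"positive share before: {positive_share(train):.3f}")
print(f"positive share after:  {positive_share(augmented):.3f}")

# Illustrative values only: high precision with low recall gives a low F1.
print(f"F1 at P=0.9, R=0.3: {f1(0.9, 0.3):.2f}")
```

In this sketch only the training set is augmented; evaluation data would be left untouched so that the reported scores still reflect the original class distribution.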
Type:
Conference
City:
Varanasi
Date:
2025-12-17
Department:
Data Science
Eurecom Ref:
8535
Copyright:
CEUR