Exploratory training: when trainers learn

Habibelahian, Omeed; Shrestha, Rajesh; Termehchy, Arash; Papotti, Paolo
HILDA 2022, Workshop on Human-In-the-Loop Data Analytics, co-located with SIGMOD, 12 June 2022, Philadelphia, PA, USA

Data systems often present examples and solicit labels from users to learn a target concept in supervised or semi-supervised learning. This selection of examples may even be done in an active fashion, i.e., active learning. Current systems assume that users always provide correct labels, possibly with a small, fixed chance of error. In several settings, however, users may have to explore and learn about the underlying data to label examples correctly, particularly for complex target concepts and models. For example, to provide accurate labels for a model that detects noisy or abnormal values, users might need to investigate the underlying data to understand what typical and clean values look like. As users gradually learn about the target concept and the data, they may revise their labeling strategies. Due to the significance and non-stationarity of errors in this setting, current systems may use incorrect labels and learn inaccurate models from the users. We report preliminary results from a user study over real-world datasets on modeling human learning while training the system, and lay out the next steps in this investigation.


DOI:
https://doi.org/10.1145/3546930.3547500
Type:
Conference
City:
Philadelphia
Date:
2022-06-12
Department:
Data Science
Eurecom Ref:
7050
Copyright:
© ACM, 2022. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in HILDA 2022, Workshop on Human-In-the-Loop Data Analytics, co-located with SIGMOD, 12 June 2022, Philadelphia, PA, USA https://doi.org/10.1145/3546930.3547500
PERMALINK : https://www.eurecom.fr/publication/7050