The contributions presented in this work are two-fold. We first provide a general overview of explanations and interpretable machine learning, making connections with different fields, including sociology, psychology, and philosophy, introducing a taxonomy of popular explainability approaches and evaluation methods. We subsequently focus on rule learning, a specific family of transparent models, and propose a novel rule-based classification approach, based on monotone Boolean function synthesis: LIBRE. LIBRE is an ensemble method that combines the candidate rules learned by multiple bottom-up learners with a simple union, in order to obtain a final intepretable rule set. Our method overcomes most of the limitations of state-of-the-art competitors: it successfully deals with both balanced and imbalanced datasets, efficiently achieving superior performance and higher interpretability in real datasets.
Interpretability of data representations constitutes the second broad contribution to this work. We restrict our attention to disentangled representation learning, and, in particular, VAE-based disentanglement methods to automatically learn representations consisting of semantically meaningful features. Recent contributions have demonstrated that disentanglement is impossible in purely unsupervised settings. Nevertheless, incorporating inductive biases on models and data may overcome such limitations. We present a new disentanglement method - IDVAE - with theoretical guarantees on disentanglement, deriving from the employment of an optimal exponential factorized prior, conditionally dependent on auxiliary variables complementing input observations. We additionally propose a semi-supervised version of our method. Our experimental campaign on well-established datasets in the literature shows that IDVAE often beats its competitors according to several disentanglement metrics.