Combining multiple modalities carrying complementary information through multimodal learning (MML) has shown considerable benefits for diagnosing multiple pathologies. However, the robustness of multimodal models to missing modalities is often overlooked. Most works assume modality completeness in the input data, while in clinical practice, it is common to have incomplete modalities. Existing solutions that address this issue rely on modality imputation strategies before using supervised learning models. These strategies, however, are complex, computationally costly and can strongly impact subsequent prediction models. Hence, they should be used sparingly in sensitive applications such as healthcare. We propose HyperMM, an end-to-end framework designed for learning with varying-sized inputs. Specifically, we focus on the task of supervised MML with missing imaging modalities without using imputation before training. We introduce a novel strategy for training a universal feature extractor using a conditional hypernetwork, and propose a permutation-invariant neural network that can handle inputs of varying dimensions to process the extracted features, in a two-phase task-agnostic framework. We experimentally demonstrate the advantages of our method in two tasks: Alzheimer’s disease detection and breast cancer classification. We demonstrate that our strategy is robust to high rates of missing data and that its flexibility allows it to handle varying-sized datasets beyond the scenario of missing modalities. We make all our code and experiments available at github.com/robustml-eurecom/hyperMM.
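The permutation-invariant processing of a variable number of modality features can be illustrated with a minimal Deep Sets-style sketch: a shared network is applied to each available modality feature, the results are sum-pooled (which is insensitive to both the order and the number of inputs), and a final network maps the pooled vector to a prediction. All dimensions and weights below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: each available modality yields a d_in-dim feature vector.
d_in, d_hid, d_out = 16, 32, 2

# Shared per-element network (phi) and post-pooling network (rho),
# here reduced to single linear layers with a ReLU for brevity.
W_phi = rng.normal(size=(d_in, d_hid)) / np.sqrt(d_in)
W_rho = rng.normal(size=(d_hid, d_out)) / np.sqrt(d_hid)

def predict(features):
    """Permutation-invariant prediction from a variable-length list of
    modality feature vectors: phi per element, sum-pool, then rho."""
    h = np.stack([np.maximum(f @ W_phi, 0.0) for f in features])  # (m, d_hid)
    pooled = h.sum(axis=0)  # sum pooling: order- and size-agnostic
    return pooled @ W_rho

# Two samples: one with all 3 modalities, one with a missing-modality subset.
full = [rng.normal(size=d_in) for _ in range(3)]
partial = [full[0], full[2]]

y_full = predict(full)
y_perm = predict(full[::-1])  # same modalities, different order
y_part = predict(partial)     # fewer modalities still produce a prediction

assert np.allclose(y_full, y_perm)   # permutation invariance holds
assert y_part.shape == y_full.shape  # output size is fixed regardless of input count
```

The key property this demonstrates is that the network accepts any non-empty subset of modality features without imputation, which is the scenario the framework targets.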
HyperMM: Robust multimodal learning with varying-sized inputs
MMMI 2024, 5th International Workshop on Multiscale and Multimodal Medical Imaging (MMMI/ML4MHD), In conjunction with 27th Medical Image Computing and Computer-Assisted Intervention Conference (MICCAI), 6-10 October 2024, Marrakesh, Morocco
Best Presentation Award
Type:
Conference
City:
Marrakesh
Date:
2024-10-06
Department:
Data Science
Eurecom Ref:
7803
Copyright:
© Springer. Personal use of this material is permitted. The definitive version of this paper was published in MMMI 2024, 5th International Workshop on Multiscale and Multimodal Medical Imaging (MMMI/ML4MHD), In conjunction with 27th Medical Image Computing and Computer-Assisted Intervention Conference (MICCAI), 6-10 October 2024, Marrakesh, Morocco and is available at: https://doi.org/10.1007/978-3-031-84525-3_15
PERMALINK : https://www.eurecom.fr/publication/7803