Leveraging Multimodal Shapley Values to Address Multimodal Collapse and Improve Fine-Grained E-Commerce Product Classification

Ajibola Obayemi, Khuong An Nguyen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Multimodal models can suffer from multimodal collapse, leading to sub-optimal performance on tasks such as fine-grained e-commerce product classification. To address this, we introduce an approach that leverages multimodal Shapley values (MM-SHAP) to quantify the individual contribution of each modality to the model's predictions. By employing weighted stacked ensembles of unimodal and multimodal models, with weights derived from these MM-SHAP scores, we enhance overall performance and mitigate the effects of multimodal collapse. Using this approach, we improve on previous results, raising the F1-score from 0.67 to 0.79.
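
To illustrate the idea described in the abstract, the following minimal Python sketch shows how modality-level contribution scores (in the spirit of MM-SHAP) could be turned into weights for a stacked ensemble of unimodal and multimodal classifiers. The function names, the toy probabilities, and the specific weighting rule are hypothetical assumptions for illustration only, not the paper's implementation.

import numpy as np

def mm_shap_scores(text_contrib, image_contrib):
    # Normalise absolute per-modality Shapley contributions into
    # modality-level scores that sum to 1 (MM-SHAP-style analysis).
    total = abs(text_contrib) + abs(image_contrib)
    return abs(text_contrib) / total, abs(image_contrib) / total

def weighted_stacked_ensemble(probs_text, probs_image, probs_multi,
                              w_text, w_image, w_multi):
    # Combine class-probability outputs of unimodal and multimodal models
    # using weights derived from the modality contribution scores.
    weights = np.array([w_text, w_image, w_multi], dtype=float)
    weights = weights / weights.sum()
    stacked = np.stack([probs_text, probs_image, probs_multi])
    return np.tensordot(weights, stacked, axes=1)

# Toy example with three product classes.
probs_text  = np.array([0.6, 0.3, 0.1])   # text-only classifier
probs_image = np.array([0.2, 0.5, 0.3])   # image-only classifier
probs_multi = np.array([0.4, 0.4, 0.2])   # multimodal classifier

# Hypothetical per-modality Shapley contributions of the multimodal model.
t_score, i_score = mm_shap_scores(text_contrib=0.8, image_contrib=0.2)

# Illustrative weighting rule: if the multimodal model leans heavily on
# text (high t_score), up-weight the image unimodal model to compensate.
fused = weighted_stacked_ensemble(probs_text, probs_image, probs_multi,
                                  w_text=i_score, w_image=t_score, w_multi=1.0)
print(fused, fused.argmax())

In this toy run the ensemble down-weights the modality the multimodal model already relies on, which is one plausible way such Shapley-derived weights could counteract collapse onto a single modality.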
Original language: English
Title of host publication: 2024 International Conference on Machine Learning and Applications (ICMLA)
Publisher: IEEE
ISBN (Electronic): 979-8-3503-7488-9
DOIs
Publication status: Published - 4 Mar 2025
