Code for Tackling Sparse Interactions In Multimodal Session-Based Recommendation.
With the advancement of multimodal technology, multimodal session-based recommendation (SBR) has become increasingly significant. However, existing multimodal SBR systems face two main challenges: (1) noisy modal data, which has a negative impact when interactions are sparse, and (2) underutilized interaction information, which restricts the model's ability to capture comprehensive representations from multiple perspectives. To tackle these issues, we propose a Multimodal Hypergraph Fusion Network for Session-based Recommendation (MHFNSR). Specifically, we mitigate modal noise through cooperation between the collaborative and multimodal views to refine data quality. Moreover, we design a global hypergraph to fully mine informative high-order relations. We then adaptively fuse representations according to session patterns for personalized recommendation. Finally, we present hard sample-aware contrastive learning, which complements self-supervised signals by delineating finer-grained associations between modalities, thereby alleviating the negative effects of sparse interaction signals. Extensive experiments on three real-world datasets demonstrate that MHFNSR outperforms existing methods and effectively alleviates the cold-start challenge.
We present an overview of MHFNSR. Its modules, including the reciprocal refinement module, global hypergraph learning, adaptive modality fusion, and hard sample-aware contrastive learning, are described in detail in the paper.
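To give a rough feel for what global hypergraph learning involves, the sketch below treats each session as a hyperedge and runs one round of item-to-hyperedge-to-item mean aggregation. It is a simplified illustration, not the MHFNSR implementation: the function names (`build_hyperedges`, `hypergraph_propagate`), the plain averaging, and the toy embeddings are all assumptions made for this example, and the model described in the paper may use learned weights and multimodal features instead.

```python
from collections import defaultdict


def build_hyperedges(sessions):
    """Treat each non-empty session as one hyperedge over its distinct items."""
    return [sorted(set(s)) for s in sessions if s]


def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def hypergraph_propagate(item_emb, hyperedges):
    """One round of item -> hyperedge -> item mean aggregation (assumed scheme)."""
    # Step 1: a hyperedge embedding is the mean of its member item embeddings.
    edge_emb = [mean_vec([item_emb[i] for i in edge]) for edge in hyperedges]

    # Step 2: an item embedding becomes the mean of the hyperedges containing it.
    item_to_edges = defaultdict(list)
    for e, edge in enumerate(hyperedges):
        for i in edge:
            item_to_edges[i].append(e)

    updated = {}
    for i, emb in item_emb.items():
        edges = item_to_edges.get(i)
        # Items that appear in no session keep their original embedding.
        updated[i] = mean_vec([edge_emb[e] for e in edges]) if edges else list(emb)
    return updated


# Toy data: two sessions sharing item 2; item 4 never appears in a session.
sessions = [[1, 2], [2, 3]]
item_emb = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [1.0, 1.0], 4: [5.0, 5.0]}
updated = hypergraph_propagate(item_emb, build_hyperedges(sessions))
print(updated[2])  # item 2 mixes information from both sessions
```

Because item 2 belongs to both sessions, its updated embedding draws on items 1 and 3 even though those two never co-occur, which is the kind of high-order relation a hypergraph is meant to expose.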
The main contributions of this work can be summarized as follows:
- We denoise sparse interaction data through reciprocal refinement and fully exploit valuable information via a multimodal hypergraph to mine robust global high-order interests in scenarios where sparsity is magnified.
- We propose an adaptive fusion strategy that adjusts automatically according to dynamic session characteristics, and we uncover finer-grained correlations among modalities via hard sample-aware contrastive learning.
- Experimental results on three real-world datasets show that our proposed MHFNSR outperforms various state-of-the-art methods.
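As a hedged illustration of the hard sample-aware contrastive idea named in the contributions above, the sketch below computes an InfoNCE-style loss between two modality views and up-weights negatives that are more similar to the anchor. The weighting scheme, the function name `hard_aware_info_nce`, and the parameters `temperature` and `beta` are assumptions chosen for this example; the exact loss used by MHFNSR is defined in the paper and may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def hard_aware_info_nce(view_a, view_b, temperature=0.2, beta=1.0):
    """InfoNCE across two views with similarity-weighted negatives (assumed form).

    view_a[i] and view_b[i] embed the same item and form the positive pair;
    view_b[j] with j != i are negatives. Setting beta=0 gives plain InfoNCE.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        sims = [cosine(view_a[i], view_b[j]) for j in range(n)]
        pos = math.exp(sims[i] / temperature)
        neg_idx = [j for j in range(n) if j != i]
        # Hardness weights: softmax over beta * similarity, rescaled to mean 1,
        # so negatives closer to the anchor contribute more to the denominator.
        raw = [math.exp(beta * sims[j]) for j in neg_idx]
        z = sum(raw)
        weights = [len(neg_idx) * r / z for r in raw]
        neg = sum(w * math.exp(sims[j] / temperature)
                  for w, j in zip(weights, neg_idx))
        total += -math.log(pos / (pos + neg))
    return total / n


# Toy text and image embeddings for three items (hypothetical values).
text_view = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
image_view = [[1.0, 0.1], [0.8, 0.2], [0.1, 1.0]]
plain_loss = hard_aware_info_nce(text_view, image_view, beta=0.0)
hard_loss = hard_aware_info_nce(text_view, image_view, beta=2.0)
print(plain_loss, hard_loss)
```

With `beta > 0`, items 0 and 1, which are near-duplicates across views, act as hard negatives for each other and receive a larger share of the penalty, so the hardness-weighted loss is at least as large as the plain one on the same inputs.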