Rethinking Missing Data: Aleatoric Uncertainty-Aware Recommendation

Abstract:

Historical interactions are the default choice for recommender model training, yet they typically exhibit high sparsity, i.e., most user-item pairs are unobserved missing data. A standard choice is to treat the missing data as negative training samples and estimate the interaction likelihood of user-item pairs from them together with the observed interactions. As a result, some potential interactions are inevitably mislabeled during training, which hurts model fidelity and hinders the model from recalling the mislabeled items, especially long-tail ones. In this work, we investigate the mislabeling issue from the new perspective of aleatoric uncertainty, which describes the inherent randomness of missing data. This randomness pushes us to go beyond interaction likelihood alone and embrace aleatoric uncertainty modeling. To this end, we propose a new Aleatoric Uncertainty-aware Recommendation (AUR) framework that consists of a new uncertainty estimator along with a normal recommender model. Following the theory of aleatoric uncertainty, we derive a new recommendation objective to learn the estimator. As the chance of mislabeling reflects the potential of a pair, AUR makes recommendations according to the uncertainty, which is demonstrated to improve the recommendation performance on less popular items without sacrificing overall performance. We instantiate AUR on three representative recommender models from mainstream model architectures: Matrix Factorization (MF), LightGCN, and VAE. Extensive results on four real-world datasets validate the effectiveness of AUR in terms of better recommendation results, especially on long-tail items.

Published in: IEEE Transactions on Big Data ( Volume: 9, Issue: 6, December 2023)

Page(s): 1607 - 1619

Date of Publication: 01 August 2023

DOI: 10.1109/TBDATA.2023.3300547

I. Introduction

Recommender systems play an irreplaceable role in various online platforms [1], [2], [3], where they aim to facilitate information seeking by providing personalized services. A canonical paradigm is to cast recommendation as a machine learning problem that models the interaction likelihood of user-item pairs. The de facto standard is to learn the recommender model from historical interactions, which, however, suffer from severe data sparsity [4]. The ratio of missing data, i.e., user-item pairs lacking an interaction label, can reach 99% in many practical cases such as e-commerce [2] and social media [3], owing to the huge candidate item set, which typically grows over time. Worse still, historical interactions are unevenly distributed over items: long-tail items encounter more missing data, leading to notorious issues such as popularity bias [5]. Therefore, it is essential to properly account for missing data in recommender training.
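To make the notions of missing-data ratio and negative labeling concrete, the following is a minimal sketch on a toy interaction list. It is not the AUR method; it only illustrates the standard practice of drawing unobserved user-item pairs as negatives. The function names `missing_ratio` and `sample_negatives`, the toy data, and the uniform sampling strategy are all assumptions made for illustration.

```python
import random

def missing_ratio(interactions, n_users, n_items):
    """Fraction of user-item pairs that have no observed interaction."""
    observed = len(set(interactions))
    return 1.0 - observed / (n_users * n_items)

def sample_negatives(interactions, n_items, n_neg=1, seed=0):
    """Build (user, item, label) training triples.

    Observed pairs get label 1. For each observed pair, n_neg unobserved
    pairs of the same user are drawn uniformly and given label 0. Some of
    these "negatives" may be potential interactions, which is the
    mislabeling issue the text describes.
    """
    rng = random.Random(seed)
    seen = set(interactions)
    samples = [(u, i, 1) for (u, i) in sorted(seen)]
    for (u, _i) in sorted(seen):
        # Items this user has not interacted with (the user's missing data).
        candidates = [j for j in range(n_items) if (u, j) not in seen]
        if not candidates:
            continue  # user has interacted with every item
        for _ in range(n_neg):
            samples.append((u, rng.choice(candidates), 0))
    return samples

# Hypothetical toy data: 3 users, 5 items, 5 observed interactions.
interactions = [(0, 0), (0, 1), (1, 0), (2, 0), (2, 3)]
ratio = missing_ratio(interactions, n_users=3, n_items=5)
samples = sample_negatives(interactions, n_items=5, n_neg=1)
print(f"missing ratio: {ratio:.3f}")
print(samples)
```

In this toy set, item 0 collects three of the five interactions while items 2 and 4 have none, so the less popular items are the ones most often drawn as negatives, mirroring the uneven distribution described above.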
