Loading web-font TeX/Math/Italic
k-Level Truthful Incentivizing Mechanism and Generalized k-MAB Problem | IEEE Journals & Magazine | IEEE Xplore

k-Level Truthful Incentivizing Mechanism and Generalized k-MAB Problem


Abstract:

Multi-armed bandits problem has been widely utilized in economy-related areas. Incentives are explored in the sharing economy to inspire users for better resource allocat...Show More

Abstract:

Multi-armed bandits problem has been widely utilized in economy-related areas. Incentives are explored in the sharing economy to inspire users for better resource allocation. Previous works build a budget-feasible incentive mechanism to learn users’ cost distribution. However, they only consider a special case that all tasks are considered as the same. The general problem asks for finding a solution when the cost for different tasks varies. In this paper, we investigate this problem by considering a system with k levels of difficulty. We present two incentivizing strategies for offline and online implementation, and formally derive the ratio of utility between them in different scenarios. We propose a regret-minimizing mechanism to decide incentives by dynamically adjusting budget assignment and learning from users’ cost distributions. We further extend the problem to a more generalized k-MAB problem by removing the contextual information of difficulties. CUE-UCB algorithm is proposed to address the online advertisement problem for multi-platforms. Our experiment demonstrates utility improvement about 7 times and time saving of 54% to meet a utility objective compared to the previous works in sharing economy, and up to 175% increment of utility for online advertising.
Published in: IEEE Transactions on Computers ( Volume: 71, Issue: 7, 01 July 2022)
Page(s): 1724 - 1739
Date of Publication: 18 August 2021

ISSN Information:

Funding Agency:

References is not available for this document.

1 Introduction

RECENT trends of applying Reinforcement Learning (RL) mechanisms in economy related areas have shed light on better resolutions to these human-involved fields. Economic problems such as sharing economy, incentivizing mechanisms, online advertising, gambling-like problems are highly complicated due to the dynamic and unpredicted nature of the involving humans. The performance of traditionally heuristic algorithms is diminished in face of the varying cases due to human actions. However, the reinforcement learning can explore and exploit the human factors, which automatically provides ongoing solutions while also converges to the best solutions simultaneously via learning the behaviors of the participants dealt with.

Select All
1.
Y. Li, Y. Zheng and Q. Yang, "Dynamic bike reposition: A spatio-temporal reinforcement learning approach", Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discov. Data Mining, pp. 1724-1733, 2018.
2.
J. Spero, "Ofo shuts international division as staff prepare for bankruptcy", 2019, [online] Available: https://www.ft.com/content/e23a3480–141b-11e9-a581-4ff78404524e.
3.
X. Zhang, Z. Yang, Z. Zhou, H. Cai, L. Chen and X. Li, "Free market of crowdsourcing: Incentive mechanism design for mobile sensing", IEEE Trans. Parallel. Distrib. Syst., vol. 25, no. 12, pp. 3190-3200, 2014.
4.
R. Zhou, Z. Li and C. Wu, "A truthful online mechanism for location-aware tasks in mobile crowd sensing", IEEE Trans. Mob. Comput., vol. 17, no. 8, pp. 1737-1749, 2018.
5.
P. Zhou, C. Wang and Y. Yang, "Self-sustainable sensor networks with multi-source energy harvesting and wireless charging", Proc. IEEE Int. Conf. Comput. Commun., pp. 1828-1836, 2019.
6.
A. Archer and E. Tardos, "Frugal path mechanisms", ACM Trans. Algorithms, vol. 3, no. 1, pp. 1-22, 2007.
7.
Y. Singer, "Budget feasible mechanisms", Proc. 51st IEEE Annu. Symp. Found. Comput. Sci., pp. 765-774, 2010.
8.
A. Singla and A. Krause, "Truthful incentives in crowdsourcing tasks using regret minimization mechanisms", Proc. 22nd Int. Conf. World Wide Web, pp. 1167-1178, 2013.
9.
A. Singla, M. Santoni, G. Bartok, P. Mukerji, M. Meenen and A. Krause, "Incentivizing users for balancing bike sharing systems", Proc. 29th AAAI Conf. Artif. Intell., pp. 723-729, 2015.
10.
P. Zhou, C. Wang, Y. Yang and X. Wei, "E-sharing: Data-driven online optimization of parking location placement for dockless electric bike sharing", Proc. 40th IEEE Int. Conf. Distrib. Comput. Syst., pp. 474-484, 2020.
11.
C. Hirnschall, A. Singla, S. Tschiatschek and A. Krause, "Learning user preferences to incentivize exploration in the sharing economy", Proc. 32th AAAI Conf. Artif. Intell., pp. 2248-2256, 2018.
12.
C. Ho and J. W. Vaughan, "Online task assignment in crowdsourcing markets", Proc. 26th AAAI Conf. Artif. Intell., pp. 45-51, 2012.
13.
G. Goel, A. Nikzad and A. Singla, "Mechanism design for crowdsourcing markets with heterogeneous tasks", Proc. 2nd AAAI Conf. Hum. Comput. Crowdsourcing, pp. 77-86, 2014.
14.
P. Zhou, X. Wei, C. Wang and Y. Yang, "Explore truthful incentives for tasks with heterogenous levels of difficulty in the sharing economy", Proc. 28th Int. Joint Conf. Artif. Intell., pp. 665-671, 2019.
15.
S. Assadi, J. Hsu and S. Jabbari, "Online assignment of heterogeneous tasks in crowdsourcing markets", Proc. 3rd AAAI Conf. Hum. Comput. Crowdsourcing, pp. 12-21, 2015.
16.
W. H. Press, "Bandit solutions provide unified ethical models for randomized clinical trials and comparative effectiveness research", Proc. Nat. Acad. Sci., vol. 106, no. 52, pp. 22387-22392, 2009.
17.
R. B. Myerson, "Optimal auction design", Math. Oper. Res., vol. 6, no. 1, pp. 58-73, 1981.
18.
A. Badanidiyuru, R. Kleinberg and Y. Singer, "Learning on a budget: Posted price mechanisms for online procurement", Proc. 13th ACM Conf. Electron. Commerce, pp. 128-145, 2012.
19.
S. Dobzinski, N. Nisan and M. Schapira, "Truthful randomized mechanisms for combinatorial auctions", Proc. 38th Annu. ACM Symp. Theory Comput., pp. 644-652, 2006.
20.
R. Kleinberg, A. Niculescu-Mizil and Y. Sharma, "Regret bounds for sleeping experts and bandits", Mach. Learn., vol. 80, no. 2, pp. 245-272, 2010.
21.
A. A. Deshmukh, S. Sharma, J. W. Cutler, M. Moldwin and C. Scott, "Simple regret minimization for contextual bandits", 2018.
22.
Z. Liu, Y. Shen and Y. Zhu, "Inferring dockless shared bike distribution in new cities", Proc. 11th ACM Int. Conf. Web Search Data Mining, pp. 378-386, 2018.
23.
A. Badanidiyuru, R. Kleinberg and A. Slivkins, "Bandits with knapsacks", Proc. 54th IEEE Annu. Symp. Found. Comput. Sci., pp. 207-216, 2013.
24.
F. Li, J. Liu and B. Ji, "Combinatorial sleeping bandits with fairness constraints", IEEE Trans. Netw. Sci. Eng., vol. 7, no. 3, pp. 1799-1813, 2019.
25.
Y. Liu and C. Ho, "Incentivizing high quality user contributions: New arm generation in bandit learning", Proc. 32nd AAAI Conf. Artif. Intell., pp. 1146-1153, 2018.
26.
L. Chen, K. Cai, L. Huang and J. Lui, "Beyond the click-through rate: Web link selection with multi-level feedback", Proc. 27th Int. Joint Conf. Artif. Intell., pp. 3308-3314, 2018.
27.
A. Badanidiyuru, J. Langford and A. Slivkins, "Resourceful contextual bandits", Proc. Conf. Learn. Theory, pp. 1109-1134, 2014.
28.
Y. Yue, J. Broder, R. Kleinberg and T. Joachims, "The K-armed dueling bandits problem", J. Comput. Syst. Sci., vol. 78, no. 5, pp. 1538-1556, 2012.
29.
R. Combes, Richard, M. S. T. M. Shahi and A. Proutiere, "Combinatorial bandits revisited", Proc. 29th Conf. Neural Inf. Process. Syst., pp. 2116-2124, 2015.
30.
"How much does YouTube advertising cost?", 2019, [online] Available: https://www.webfx.com/internet-marketing/how-much-does-youtube-advertising-cost.html.
Contact IEEE to Subscribe

References

References is not available for this document.