I. Introduction
Wireless Federated Learning (FL) enables distributed wireless devices to compute local model updates from their own datasets and to send only their local model parameters to a designated parameter server, such as a base station (BS) [1], [2], [3]. The BS aggregates these parameters to update the global model and then broadcasts the updated global model back to the participating devices. This process repeats until convergence. However, the limited communication resources of wireless networks can significantly degrade FL performance, since they heavily restrict the number of devices that can participate in each round [4]. Consequently, it is important to jointly design device selection and communication resource allocation to facilitate the convergence of FL in resource-constrained wireless networks.
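To make the iteration concrete, consider a standard FedAvg-style formulation, stated here only as an illustrative sketch rather than the specific scheme of this paper; the round index $t$, selected device set $\mathcal{S}^t$, local sample count $n_k$, and local model $\mathbf{w}_k^{t}$ are notational assumptions introduced for this example. In round $t$, each selected device $k \in \mathcal{S}^t$ computes a local update $\mathbf{w}_k^{t}$ from the current global model $\mathbf{w}^{t-1}$ and its local dataset, and the BS aggregates
\[
\mathbf{w}^{t} \;=\; \sum_{k \in \mathcal{S}^t} \frac{n_k}{\sum_{j \in \mathcal{S}^t} n_j}\, \mathbf{w}_k^{t},
\]
where $n_k$ denotes the number of local samples at device $k$. Under this view, choosing the set $\mathcal{S}^t$ and the radio resources assigned to its members in each round is exactly the joint design problem highlighted above.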