An Off-Policy Reinforcement Learning-Based Adaptive Optimization Method for Dynamic Resource Allocation Problem | IEEE Journals & Magazine | IEEE Xplore