SLoB: Suboptimal Load Balancing Scheduling in Local Heterogeneous GPU Clusters for Large Language Model Inference | IEEE Journals & Magazine | IEEE Xplore