Reducing TLB Miss Penalty on GPUs via Unified Multi-level PWB and PWC (IEEE Conference Publication)