Zero-shot Policy Learning with Spatial Temporal Reward Decomposition on Contingency-aware Observation | IEEE Conference Publication | IEEE Xplore