System-level fault-tolerance in large-scale parallel machines with buffered coscheduling (IEEE Conference Publication)