Fela: Incorporating Flexible Parallelism and Elastic Tuning to Accelerate Large-Scale DML | IEEE Conference Publication