1. Introduction
With the emergence of deep learning architectures, the dilemma between regression-based and optimization-based approaches for many computer vision problems has become more relevant than ever. Should we regress the relative camera pose, or use bundle adjustment? Is it more appropriate to regress the parameters of a face model, or to fit the model to facial landmarks? Questions of this type are ubiquitous within our community. 3D model-based human pose estimation, among other problems, has prompted similar discussions, since both optimization-based [4], [18] and regression-based approaches [15], [24], [27] have had significant success recently. However, one can argue that both paradigms have strengths and weaknesses (Figure 1). Based on this, we advocate in this work that, to push the field forward, we should consider ways for the two paradigms to collaborate rather than focus on which one is better.
Both optimization-based and regression-based approaches have successes and failures, which motivates our approach of building a tight collaboration between the two.