1. Introduction
The success of recent deep learning model is largely attributed to learning the good representations for visual patterns. Such representations essentially facilitate various vision task, such as recognition and synthesis [15], [25], [33]. Nevertheless, one important goal for the vision community is to model and summarize the relationships of observed variables of a system, in order to enable well predictions on similar data. Essentially, it is desirable to understand how the system is changed if one modifies these relationships under certain conditions, e.g., the effects of a treatment in healthcare. Thus this demands the high-level ability of causal reasoning beyond the previous efforts of only learning good representations for visual patterns [1], [5], [16],[21],[35], [37]. This naturally leads into our task of causal inference.