I. Introduction
Over the past decade, various solutions and systems have been proposed to address the research challenges of data lakes. However, although ‘data lake’ is a widely used and heavily hyped term, its exact definition and functions remain ambiguous. Moreover, most recent data lake proposals target only a specific research problem or certain types of source data. A coherent, complete picture of data lake problems and solutions is still missing.