In the era of Big Data, decision-making processes are becoming increasingly data-driven and data-intensive. The Data Lake approach refers to assembling large amounts of diverse data from a multitude of data sources, retaining their original model and format, and allowing users to query and analyze them in situ. Thus, it promises to enable ad hoc, self-service analytics and to reduce the time from data to insights.
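As a minimal illustration of what in-situ querying over a data lake looks like in practice (a sketch for clarity, not part of the project itself), the snippet below uses the DuckDB Python API to join a CSV file with a Parquet file directly, without first ingesting either into a warehouse or converting its format; the file names and columns are hypothetical.

```python
import duckdb

# Query heterogeneous files in place: each file keeps its original
# model and format, and no loading or schema-conversion step is needed.
# (File names and columns below are hypothetical placeholders.)
result = duckdb.sql("""
    SELECT c.customer_id, c.region, SUM(o.amount) AS total
    FROM read_csv_auto('customers.csv') AS c
    JOIN read_parquet('orders.parquet') AS o
      ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.region
    ORDER BY total DESC
""").fetchall()

print(result)
```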
SmartDataLake aims to design, develop and evaluate novel approaches, techniques and tools for extreme-scale analytics over Big Data Lakes. It tackles the challenges of reducing costs and extracting value from Big Data Lakes by providing solutions for virtualized and adaptive data access; automated and adaptive data storage tiering; smart data discovery, exploration and mining; monitoring and assessing the impact of changes; and empowering the data scientist in the loop through scalable and interactive data visualizations.
The results of the project are evaluated in real-world use cases from the Business Intelligence domain, including scenarios for portfolio recommendation, production planning and pricing, and investment decision making.