Sustainable Data Lakes for Extreme-Scale Analytics


In the era of Big Data, decision-making processes are becoming increasingly data-driven and data-intensive. The Data Lake approach assembles large amounts of diverse data from a multitude of sources, retains their original model and format, and allows users to query and analyze them in situ. It thus promises to enable ad hoc, self-service analytics and to reduce the time from data to insights.
SmartDataLake aims at designing, developing and evaluating novel approaches, techniques and tools for extreme-scale analytics over Big Data Lakes. It tackles the challenges of reducing costs and extracting value from Big Data Lakes by providing solutions for virtualized and adaptive data access; automated and adaptive data storage tiering; smart data discovery, exploration and mining; monitoring and assessing the impact of changes; and empowering the data scientist in the loop through scalable and interactive data visualizations.
The results of the project are evaluated in real-world use cases from the Business Intelligence domain, including scenarios for portfolio recommendation, production planning and pricing, and investment decision making.
Start Date
End Date
Dimitrios Skoutas
Athena Research and Innovation Center in Information, Communication and Knowledge Technologies
École Polytechnique Fédérale de Lausanne
Technische Universiteit Eindhoven
Universität Konstanz
SpazioDati SRL
Spring Techno GmbH & Co. KG
Synyo GmbH