Data Lake
From Clinfowiki
A data lake is a central repository that stores both structured and unstructured data from many sources. The name comes from the analogy of a lake: multiple streams (the data sources) fill a reservoir, where the data is stored as is until it flows out to the various applications within an organization.
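The "store as is" idea can be illustrated with a minimal sketch. It assumes a local folder stands in for the lake. The `ingest` and `read_catalog` functions, the `raw/<source>/` layout, the `catalog.jsonl` file, and the sample lab and clinic data are all hypothetical names chosen for illustration, not part of any particular data lake product.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def ingest(lake_root: Path, source: str, name: str, payload: bytes) -> Path:
    """Store a payload unchanged under raw/<source>/ and record it in a catalog."""
    target_dir = lake_root / "raw" / source
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / name
    target.write_bytes(payload)  # stored as is: no parsing, no schema applied

    # Append one catalog line so the stored file can be found again later.
    entry = {
        "source": source,
        "path": target.relative_to(lake_root).as_posix(),
        "bytes": len(payload),
        "sha256": hashlib.sha256(payload).hexdigest(),
    }
    with (lake_root / "catalog.jsonl").open("a", encoding="utf-8") as catalog:
        catalog.write(json.dumps(entry) + "\n")
    return target


def read_catalog(lake_root: Path) -> list[dict]:
    """Return all catalog entries in the order they were ingested."""
    catalog = lake_root / "catalog.jsonl"
    if not catalog.exists():
        return []
    lines = catalog.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line]


def demo() -> list[dict]:
    """Ingest one structured and one unstructured file (hypothetical data)."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        ingest(lake, "lab_system", "results.csv",
               b"patient_id,test,value\n1,glucose,5.4\n")
        ingest(lake, "clinic_notes", "note_001.txt",
               b"Patient reports mild headache.")
        return read_catalog(lake)


if __name__ == "__main__":
    for entry in demo():
        print(entry["source"], entry["path"], entry["bytes"])
```

Both the CSV file (structured) and the free-text note (unstructured) land in the same repository without transformation. Any interpretation of their contents is left to the applications that later read them.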
Contents
Functions of a Data Lake
Data Ingestion
- Tools
Data Storage and Retention
- Tools
Data Processing
- Tools
Data Access
- Tools
Difference from Data Warehouse
Data Swamp
A data swamp is what a data lake turns into when it becomes unruly, for example when data accumulates without enough organization for users to find or trust it.
References
https://aws.amazon.com/big-data/datalakes-and-analytics/what-is-a-data-lake/
Submitted by Tom Nahass