Data Lake

From Clinfowiki
Revision as of 19:11, 26 October 2020 by Nahata5

A data lake is a central repository that stores structured and unstructured data from many sources. The concept is akin to a lake fed by multiple streams: data from each source fills the reservoir and is stored as is, before it flows out to various applications within an organization.
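The idea of storing data as is and interpreting it only when an application reads it can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the lake is a temporary local directory rather than a real storage service, and the names ingest, read_structured, read_unstructured and demo, as well as the sample vitals and note, are hypothetical.

```python
from __future__ import annotations

import csv
import tempfile
from pathlib import Path


def ingest(lake_root: Path, source: str, filename: str, payload: bytes) -> Path:
    """Write the payload into the lake unchanged, grouped by its source."""
    target_dir = lake_root / source
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_bytes(payload)  # no parsing or transformation on the way in
    return target


def read_structured(path: Path) -> list[dict]:
    """Interpret a raw CSV file as rows only at read time."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def read_unstructured(path: Path) -> str:
    """Return free text exactly as it was stored."""
    return path.read_text(encoding="utf-8")


def demo() -> dict:
    """Fill a toy lake from two hypothetical sources, then read from it."""
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)

        # Hypothetical sample data: one structured and one unstructured source.
        csv_bytes = b"patient_id,heart_rate\n1,72\n2,88\n"
        note_bytes = "Patient reports mild headache.".encode("utf-8")

        csv_path = ingest(lake, "monitoring", "vitals.csv", csv_bytes)
        note_path = ingest(lake, "clinic", "note.txt", note_bytes)

        return {
            # The stored bytes are identical to what the source sent.
            "stored_as_is": csv_path.read_bytes() == csv_bytes,
            # Each consuming application applies its own interpretation.
            "rows": read_structured(csv_path),
            "note": read_unstructured(note_path),
        }


if __name__ == "__main__":
    print(demo())
```

In this sketch the structure of the CSV file is applied only by the reader, so a different application could read the same stored bytes in a different way without the lake changing.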

Functions of a Data Lake

Data Ingestion

  • Tools

Data Storage and Retention

  • Tools

Data Processing

  • Tools

Data Access

  • Tools


Difference from Data Warehouse

Data Swamp

A data lake that becomes unruly, so that its contents are hard to find or trust, is referred to as a data swamp.


References

https://aws.amazon.com/big-data/datalakes-and-analytics/what-is-a-data-lake/

Submitted by Tom Nahass