This article is an excerpt from "Architecting Data Lakes: Second Edition" by Ben Sharma. Get the full ebook.
Most companies are only beginning to understand how to optimize their data storage and analytics platforms. An estimated 70% of the market ignores big data today, and because these companies rely on data warehouses, it is difficult for them to accommodate business changes quickly. Approximately 20 to 25% of the market stores some of its data in data lakes, using scale-out architectures such as Hadoop and Amazon Simple Storage Service (Amazon S3) to manage big data more cost-effectively. However, most of these implementations have turned into data swamps. Data swamps are essentially unmanaged data lakes, so although they are still more cost-effective than data warehouses, they are really only useful for some ad hoc exploratory use cases. Finally, 5 to 10% of the market is using managed, governed data lakes, which allow for energized business insights via a scalable, modern data architecture.
As the mainstream adopters and laggards play catch-up with big data, today’s innovators are looking at automation, machine learning, and intelligent data remediation to construct more usable, optimized data lakes. Companies such as Zaloni are working to make this a reality.
As the data lake becomes an important part of next-generation data architectures, we see multiple trends emerging based on different vertical use cases that indicate what the future of data lakes will look like.
Get your copy of "Architecting Data Lakes: Second Edition" today and continue reading about logical data lakes, federated queries, enterprise data marketplaces, machine learning and intelligent data lakes, the Internet of Things, and more.