As companies modernize their data architectures and move from traditional data warehouses to data lakes, many are looking at cloud-based data lakes for their scalability and cost-effectiveness. A leading pharmaceutical company wanted to create a next-generation data architecture by migrating their current data warehouse to the cloud and augmenting it with a cloud-based data lake on AWS. A key requirement was centralized management and governance of data across both the cloud-based data warehouse and the data lake.
Challenge: The company wanted to migrate their current data warehouse to the cloud and build a new data lake there to create their next-generation data platform, but ran into trouble building the platform internally because of the complexity of the big data ecosystem. They needed a unified solution that could manage both their cloud data warehouse and their data lake.
Solution: Zaloni helped the company create their next-generation data platform by building a data lake with Amazon EMR and using Amazon Redshift as their data warehouse. With the Zaloni Data Platform (ZDP), the company was able to ingest, manage, and govern data across the platform while providing role-based access to business users through an enterprise-wide data catalog. The following diagram outlines the reference architecture the company used for a cloud-based data lake. Note how data flows through the AWS components and the Zaloni Data Platform, creating a cloud-based data platform that is governed and scalable.
Results: The company built and deployed a governed, enterprise-wide, next-generation data platform that was flexible and responsive while keeping operational costs low. Through the self-service data catalog, the company enabled a diverse group of stakeholders within the organization to use data assets for actionable insights.