Waste Not: Offload DW Storage and ETL to Hadoop

February 4, 2016 Ben Sharma

The data warehouse (DW) is still an effective tool for complex data analytics and it isn’t going anywhere. Not soon, anyway. But DWs are expensive. So why are you clogging your DW’s analytics bandwidth with less-valuable storage and ETL processing? 

Migrating storage and large-scale or batch processing to Hadoop lets both the DW and Hadoop do what they do best. Hadoop’s scalable, cost-efficient parallel-processing platform allows enterprises to save on storage and processing costs. The DW, now with more available processing power, can focus on business intelligence (BI) activities. A bonus: for savvy enterprises with an eye to the future, migrating to Hadoop sets them up to exploit original raw data of all types for data exploration and new use cases.

Here are four good reasons for a DW offload to Hadoop:

1. Save millions in storage costs

Hadoop can store raw data in any format at a fraction of the cost of the DW. In fact, we helped one client achieve 20 times the storage capacity of their DW at 50% of the cost of a previously planned DW upgrade. Another client achieved a 100x cost reduction per terabyte of stored data.
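To make the per-terabyte comparison concrete, here is a minimal sketch of the arithmetic. The dollar amounts and capacities are hypothetical placeholders, not client figures, and the function names are invented for illustration. The sketch assumes a planned DW upgrade that would hold 50 TB and a Hadoop cluster that holds 20 times as much at half the total cost.

```python
def cost_per_tb(total_cost: float, capacity_tb: float) -> float:
    """Return the cost of storing one terabyte."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return total_cost / capacity_tb


def reduction_factor(baseline_cost_per_tb: float, new_cost_per_tb: float) -> float:
    """Return how many times cheaper the new platform is per terabyte."""
    if new_cost_per_tb <= 0:
        raise ValueError("new_cost_per_tb must be positive")
    return baseline_cost_per_tb / new_cost_per_tb


# Hypothetical inputs (assumptions, not figures from the article):
dw_upgrade_cost = 1_000_000      # planned DW upgrade budget
dw_capacity_tb = 50              # capacity the DW would offer
hadoop_cost = 500_000            # 50% of the planned upgrade budget
hadoop_capacity_tb = 1_000       # 20 times the DW capacity

dw_rate = cost_per_tb(dw_upgrade_cost, dw_capacity_tb)
hadoop_rate = cost_per_tb(hadoop_cost, hadoop_capacity_tb)
factor = reduction_factor(dw_rate, hadoop_rate)

print(f"DW: ${dw_rate:,.0f}/TB, Hadoop: ${hadoop_rate:,.0f}/TB, {factor:.0f}x cheaper per TB")
```

With your own budget and capacity figures in place of the placeholders, the same two functions give a quick first estimate of what an offload could save on storage alone.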

2. Significantly speed up processing

Hadoop’s flexible architecture enables faster loading of data and parallel processing, resulting in faster time to insight. For example, one of our clients quadrupled the throughput of their system after migrating processing to Hadoop. Hadoop is also much more effective than the DW for processing the increasing amount of unstructured and semi-structured data that’s important for analytics today.

3. Maximize DW for BI

Costly DW resources shouldn’t be wasted on low-value activities such as data transformation. One of our clients realized that 90% of their DW platform was being used for ETL processes, leaving little processing power available for high-value analytics and business intelligence activities. A DW offload to Hadoop made it possible for the enterprise to use its assets more strategically.

4. Extract more value from all data

Lower cost means enterprises can store more data in an accessible format, in an “active archive” rather than on tape. Extending retention periods for historical data and eliminating time-consuming backup processes support more in-depth trend analyses that can lead to further business insights and more effective business strategies.

The big picture: Hadoop beyond the DW offload

Offloading your DW’s storage and processing to Hadoop is a good first step toward where big data architecture is headed, and an offload can produce immediate and significant ROI. Thinking more broadly, a DW offload can also serve as a launch pad for considering other ways a hybrid architecture could benefit your business. With cloud solutions you can further separate your compute and storage needs, allowing on-demand analytics that use compute power only when required. Could Hadoop become a strategic core component of an enterprise data hub? What infrastructure would you need to put in place to make it a reality? How would you manage metadata and data governance across multiple platforms? It’s exciting stuff, and we work with clients every day to make it happen. Please contact us if you’d like to know more.

About the Author

Ben Sharma

Ben Sharma is CEO and co-founder of Zaloni. He is a passionate technologist with experience in business development, solutions architecture, and service delivery of big data, analytics and enterprise infrastructure solutions. Having previously worked in management positions for NetApp, Fujitsu and others, Ben’s expertise ranges from business development to production deployment in a wide array of technologies including Hadoop, HBase, databases, virtualization and storage. Ben is the co-author of Java in Telecommunications and holds two patents. He received his MS in Computer Science from the University of Texas at Dallas.
