New York, NY – Strata + Hadoop World, Booth #434 – September 30, 2015 – Zaloni, a leading provider of Hadoop-based enterprise data management solutions, today announced version 1.0 of its new Mica Platform at Strata + Hadoop World. The Mica Platform is a self-service data preparation platform that enables data cataloging, data discovery, data curation, interactive and visual data preparation, and collaboration for customers who want to democratize access to the Hadoop Data Lake.
According to Gartner, “Data preparation is one of the most difficult and most time-consuming challenges facing users of BI and data discovery tools, as well as data scientists. The unprecedented growth in multi-structured data has contributed to greater effort required in preparing the data to support decision-making processes.”¹ Zaloni’s introduction of Mica addresses this gap.
Mica 1.0 provides data scientists and business analysts with the tools they need for interactive and visual data preparation. With Mica, business analysts, data scientists, and IT teams can collaborate and effectively govern the end-to-end data management process, using a self-service model, while remaining confident in the quality and integrity of the data.
The Mica 1.0 Platform sits on top of Zaloni’s Bedrock Data Management Platform. Bedrock is the only platform built exclusively for Hadoop that offers a fully integrated, single-software solution for managed data ingestion, organization, enrichment, and extraction. Bedrock is distribution-agnostic and delivers a scalable, manageable, cost-effective, and rapidly deployable solution for building and managing a Hadoop Data Lake.
Mica leverages Bedrock for data organization, metadata management, and execution of transformation logic on the Hadoop cluster. Once rules have been established within Bedrock, those same rules apply to how data are governed and managed within Mica. Business users work within an easy-to-use interface to bring new data into the Bedrock Data Management Platform, or to shape and refine data that already exist within Bedrock. As a result, business users gain greater access to data, and because they spend less time on data preparation tasks, they reach data analytics and actionable insight much sooner.
The Mica 1.0 Platform includes the following functionality:
- Enterprise-wide data catalog – easy-to-use exploration, search, and preview: a detailed view of business and technical information about each data set; free-form text search across a variety of dimensions; and the ability to export metadata results. The catalog is built from data sets that may reside in multiple clusters on-premises, in the cloud, or in a hybrid architecture.
- Data discovery – data profiling and tagging give users an opportunity to interact with data to detect patterns and identify outliers.
- Interactive data preparation and transformation – visual, data-driven refinements that are applied iteratively to sample data sets. Transformation supports data sorting and cell-level editing, such as splitting, clustering, and data transposing. The transformations developed on the sample can be operationalized across the entire data set in Bedrock, and the results saved as TSV, CSV, HTML table, Excel, ODF, delimited files, or tabular format.
- Execution and scheduling – data transformations designed in Mica can be executed and scheduled on the complete data set using Bedrock, with proper lineage captured and workload isolation based on YARN.
- Data catalog curation – allows customers to contribute valuable business information; to search entities reliably; and to add tags quickly and easily to entities.
- Workspaces and user collaboration – adds the ability to create workspaces for organizing and sharing data assets and enrichments with peers via a filtered group of entities.
What’s Next for Mica
The next release of Mica will include data profiling, enhanced and intelligent data preparation, and deeper integration with Bedrock.
Zaloni: “The need for self-service data integration and data preparation is driven in large part by the small number of data scientists relative to the demands on them by business use cases across the enterprise,” said Ben Sharma, CEO, Zaloni. “A self-service platform, like Mica, provides a greater degree of automation, autonomy, and a much faster way to integrate data between disparate sources across a variety of locations. The level of empowerment a business analyst achieves is significant, and results in faster time to data analytics and actionable insights, which ultimately leads to more rapid decision-making and increased revenue. Self-service also brings a renewed emphasis on governance, which reduces common risks involved with a self-service model.”
MapR: “As the enterprise moves beyond Hadoop POCs and into production, new challenges arise, including challenges around data governance and data security. When you introduce self-service into the equation, the risks become even greater,” said John Schroeder, CEO, MapR. “MapR, together with Zaloni and their new Mica platform, are helping customers deploy self-service solutions with the highest levels of governance and security, while providing a solution that lets business users access, manipulate, prepare and analyze the data they need, when they need it.”
Ovum: “Organizations have always been challenged understanding, cleansing, and correlating the data from ‘familiar’ internal sources such as disparate transaction systems. The challenge is multiplied with Big Data,” said Tony Baer, Principal Analyst with Ovum. “Business users are going to have to take the initiative in curating data if they are to gain value and insight from Big Data. Zaloni has put together a solution that provides a self-service approach for users to take charge over their data destiny.”
¹ Gartner, Market Guide for Self-Service Data Preparation for Analytics, Lakshmi Randall et al., March 5, 2015