Data Lake Maturity Model

training at all, but this 46% will end up as a drag on their efforts to become data driven. The executives think that only 10% of their staff will require more than a year of training, which indicates that the company is going to be competing fiercely to hire new people with the skills they need to do analytics.

The late-stage challenge is to avoid devolving into the data swamp described earlier. Measures to take include the following:

• Making sure your experiments are aimed at clear business value
• Finding sponsors among business users as well as a big data champion from executive leadership
• Starting to define and document processes at the corporate level for setting up the zones of a data lake, preserving relevant provenance, defining ownership and sharing responsibilities, and auditing
• Calling on managers to use data to evaluate the results of every initiative
• Getting the rules and resources to do training and certification for a wide range of staff
• Making sure your technology and data infrastructure can scale

Success will put you at the next level of the maturity model: Govern.

Level 3: Govern

At this level, your people, processes, and technologies begin to become coordinated across the enterprise. The company is running at least one scalable production data lake, and is making sure that new datasets are linked into this data lake instead of being stored in silos in individual departments. Departments are working together to standardize and centralize their data, so a governance organization is present.

Sometimes a single data lake is too difficult to manage, so there might be more than one spread across the enterprise. But teams are working together to make sure each team has access to all the data to which it is entitled. To facilitate access, the company might have a data catalog, although perhaps it is not as powerful and rich as it should be.

