Data Lake Maturity Model

Issue link: https://resources.zaloni.com/i/1078782

might not use it. Security also depends on assigning ownership of each dataset to a particular team or division, which then has the responsibility of approving or denying access to outsiders.

Self-Service and the Data-Driven Organization

As mentioned at the beginning of this report, business users need data faster than ever. Waiting weeks or months for a programmer to code up a report was until recently the norm. But this is no longer acceptable.

Business users have also become more technically savvy, so they have a better idea than in previous generations what kinds of decisions they need data to make and what data they want. Many have become used to writing mini-programs such as Microsoft Excel macros to derive insights from their data. They now want to design queries for the big data their organizations collect.

Giving each team and user control over queries can speed up decisions by orders of magnitude, a speed that every organization needs in order to respond to changes in their environments today. As just one example, continual changes in US tariff regulations over the past year have strained the planning capacities of many companies. You can wake up in the morning and find yourself in a totally different business environment.

Self-service requires work at many levels as well as new architectures for data storage:

• Potential users need to be able to find data. It's not enough to collect the data; the organization must tag and categorize it and create a data catalog that allows searches and queries. It's also important to have a comprehensive online taxonomy, which is a list of terms and their relationships. In retail sales, for instance, grills might be a subset of appliances and gas grills are a subset of grills. Formalizing relationships like that can help people search for useful datasets. In another retail company, grills might be classified as lawn furniture instead of appliances; that will affect searches.
• A process must be in place for giving the users access to the data they want. This might include having a data owner vet the access, copying the data to a new repository, anonymizing or masking sensitive parts of the data, and checking later to make sure the user adheres to the contract provided with the data.

User Needs and Data Lake Architectures | 13
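
The taxonomy point in the first bullet can be made concrete with a minimal Python sketch. It is illustrative only and not taken from the report: the `TAXONOMY` mapping, the `Dataset` class, the `search` function, and the dataset names are all hypothetical. The sketch assumes each term has at most one parent and that datasets are tagged with taxonomy terms. A search for a broad term then also returns datasets tagged with narrower terms.

```python
from dataclasses import dataclass

# Hypothetical taxonomy: each term maps to its parent (None for a top-level term).
TAXONOMY = {
    "appliances": None,
    "grills": "appliances",
    "gas grills": "grills",
    "lawn furniture": None,
}


@dataclass
class Dataset:
    name: str
    tags: set  # taxonomy terms attached to this dataset


def descendants(term, taxonomy):
    """Return the term plus every term below it in the taxonomy."""
    found = {term}
    changed = True
    while changed:  # repeat until no new narrower terms are discovered
        changed = False
        for child, parent in taxonomy.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def search(catalog, term, taxonomy):
    """Return names of datasets tagged with the term or any narrower term."""
    wanted = descendants(term, taxonomy)
    return sorted(d.name for d in catalog if d.tags & wanted)


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    Dataset("gas_grill_sales_2023", {"gas grills"}),
    Dataset("patio_chair_returns", {"lawn furniture"}),
    Dataset("appliance_warranty_claims", {"appliances"}),
]

# Company A: grills are appliances, so an "appliances" search finds grill data.
print(search(CATALOG, "appliances", TAXONOMY))

# Company B: grills are lawn furniture, so the same search no longer finds it.
ALT_TAXONOMY = dict(TAXONOMY, grills="lawn furniture")
print(search(CATALOG, "appliances", ALT_TAXONOMY))
```

With the first taxonomy, the "appliances" search returns both the warranty dataset and the gas grill dataset. With the second, the gas grill dataset appears under "lawn furniture" instead, which is the effect on searches that the bullet describes.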
