Using governance to spur, not stall, data access for analytics

Data governance has been a major bottleneck in analytics. Although it is important to manage data to ensure compliance with regulations and policies, governance can also make it difficult for users to locate and access the data they need. The situation can be even more complicated for businesses that manage data at scale, in real time, and in the cloud. If governance processes halt the use of real-time data streams, what good is that?

Effective governance should help employees quickly find and use data, enabling them to collaborate and create business value from the organization’s data assets. How can data governance accelerate this process rather than stall it? Some organizations claim they have found a way. Companies can combine the best aspects of the two main data governance models, centralized and decentralized, to build governance into the larger analytics framework.

In this way, governance is planned and executed to create competitive advantage, addressing policy compliance, security, accessibility, and usability in a frictionless and comprehensive manner. This increases the availability of data and makes it more accessible to distributed team members. It also keeps risks under control. Although common data governance practices can present challenges for businesses, this blending may be able to overcome those obstacles.

Both data governance models pose challenges

Companies are struggling to manage data at scale and in the cloud. Three quarters (75 percent) of respondents to a recent Forrester Research poll said they don’t yet manage all of their data in the cloud. Some 80 percent say they have difficulty governing data at scale. A whopping 82 percent cite forecasting and controlling costs as a challenge in their data ecosystem, and 82 percent say confusing data governance policies are a difficulty.

Meanwhile, companies are managing a growing volume of data, and users are demanding more access. Patrick Barch, Capital One Software’s senior director of product management, says that more data is coming in from more sources and being stored in more places.

Organizations want to make this data available to more business teams to enable new insights and business value. However, many struggle to choose between two models. Centralized governance of data in the cloud ensures complete governance but can limit data access. A decentralized model gives business lines more control over data and analytics, but it has its own disadvantages. Different teams might not agree on governance policies. Data can become locked in silos that are not accessible to everyone. Machine learning engineers might not have access to the data they need to create advanced analytics tools.

“Your teams want instant access to the data and tools they choose,” Barch says. “You can’t manage everything centrally without becoming a huge bottleneck or hiring an army of data engineers, and you can’t completely decentralize the management responsibility without incurring significant data risk.”

Best of both worlds

There is a way, however, to combine centralized and decentralized approaches into a new model of data governance: federated data management. This allows businesses to reap the benefits of both approaches without suffering the drawbacks of either.

Capital One adopted this model when it shut down its data centers and moved operations to the public cloud. Although the company created a cloud data warehouse to make data available to business teams, it realized that it had to be careful about data governance.

“Without good governance controls you not only run the risk of poor policy management but also risk spending much more money than you intended, much faster,” Barch says. “We knew that maximizing the value of our data, especially as the quantity and variety of that data scales, was going to require creating integrated experiences with built-in governance that enabled the various stakeholders involved in activities like publishing data, consuming data, governing data and managing the underlying infrastructure, to all seamlessly work together.”

What does this blended approach to data governance look like? For Capital One, it is what Barch calls “sloped governance.” Under this model, access and security controls increase with each level of data. Private user spaces, which do not contain shared data, may have very limited data governance requirements. As data moves closer to production, the controls become more stringent and take longer to implement.

Capital One’s solution includes a central shared-services platform that applies governance to different types of data using machine learning automation, which is then validated by humans. The built-in centralized governance rules allow data to flow freely and in a decentralized manner, allowing for quick data access for all lines of business. “Not all data is the same; not all data requires the same amount of attention,” Barch says. “This solution changes governance from an all-or-nothing approach to one that applies the right level of governance to the right scenarios, based on the level of risk.”
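To make the idea of sloped governance concrete, here is a minimal sketch in Python. It is not Capital One’s implementation; the tier names, the control names, and the functions are all hypothetical and chosen only for illustration. It shows the core idea described above: each level of data inherits the controls of the levels below it and adds its own, so a private space needs very little while production data needs the most.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical governance tiers, ordered from least to most controlled."""
    PRIVATE_SANDBOX = 1   # private user space, no shared data
    TEAM_SHARED = 2       # shared within one line of business
    PRODUCTION = 3        # published for use across the organization


# Assumed control names; each tier adds to the controls of the tiers below it.
CONTROLS_ADDED = {
    Tier.PRIVATE_SANDBOX: {"owner_recorded"},
    Tier.TEAM_SHARED: {"access_policy", "basic_metadata"},
    Tier.PRODUCTION: {"sensitivity_review", "human_validation", "service_level"},
}


def required_controls(tier: Tier) -> set[str]:
    """Return every control required at `tier`, including lower tiers' controls."""
    required: set[str] = set()
    for level in Tier:
        if level <= tier:
            required |= CONTROLS_ADDED[level]
    return required


@dataclass
class Dataset:
    name: str
    tier: Tier
    controls_met: set[str]


def missing_controls(dataset: Dataset) -> set[str]:
    """Controls the dataset still needs to be compliant at its current tier."""
    return required_controls(dataset.tier) - dataset.controls_met


def can_promote(dataset: Dataset, target: Tier) -> bool:
    """A dataset may move to `target` only if it meets all of that tier's controls."""
    return required_controls(target) <= dataset.controls_met


scratch = Dataset("churn_experiment", Tier.PRIVATE_SANDBOX, {"owner_recorded"})
print(missing_controls(scratch))               # set()
print(can_promote(scratch, Tier.PRODUCTION))   # False
```

In this sketch, an analyst’s scratch dataset is compliant with almost no overhead, but it cannot be promoted to production until the stricter controls are satisfied. The slope comes from the cumulative control sets, not from a single all-or-nothing check.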

Blended governance approaches deliver results

This blended governance approach provides several benefits. First, it makes data discovery faster and more efficient. Different data can be categorized in a centralized governance framework, and certain governance levels may require additional metadata fields or a higher level of service. Barch says that this organization and categorization “helps analysts locate the information faster, which speeds their time to insight and time to value.”

A blended approach also allows for more collaboration in design as well as production. Traditional corporate engineering governance standards slow down the process of bringing analytics tools to production. Blended governance models can speed it up because they apply just enough governance rather than a full-court press that discourages creativity. Barch refers to a software development model that allows developers to safely collaborate on a shared repository. You don’t want the red tape and standards to prevent your data analysts or scientists from getting their code operationalized.

Finally, the blended model can reduce governance overhead. Effective data governance practices help businesses know where their data is and what their privacy and security policies are. It’s easier for businesses to meet their ever-changing privacy and security policy management goals if governance is part of their analytics framework. Capital One’s implementation includes configurable policies that can easily be modified to meet the needs of different industries.

Organizations can adopt a blended approach to governance without restricting or slowing down data use. They can encourage collaboration and alignment among teams while also controlling costs and reducing overhead. Barch says, “An analytics platform that has this type of governance means your people can trust the data is being well managed while also enabling teams the ability to operate at the speed of business.”

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by the editorial staff of MIT Technology Review.
