Sunday, May 26, 2024

Data Governance Models of Tech Giants, What You Need to Learn


Insights into how the giants use automated controls, AI integration, and iterative approaches to ensure data quality, compliance, and strategic advantage.

With so many data breaches and privacy issues, organizations need a solid and effective data governance policy. Data governance practices are essential to ensure data is optimized for use. There’s a mandate among global tech behemoths to put into practice a data governance policy that restricts access to data and governs which information should be made available and to whom. Organizations that aim to be like IBM or Microsoft need to add data governance to their strategy so they can move forward while managing risk at an acceptable level.

The goal should be to keep data flowing so that it can drive value. Whether the aim is innovation for new products, insight into processes, or reduced cycle time, it’s all about keeping that flow going so that an organization attains its targets while minimizing risk. Communication and prioritization are also essential, no matter the size of your team. Companies must also understand the sensitivity of the data, how it’s protected and managed, and why it’s collected.

A thorough understanding of data can help organizations prioritize specific data collection, make better decisions, scale efficiently, and save money. 

Data governance best practices are evolving rapidly, and only by keeping your finger on the pulse of the data industry can you prepare your governance strategy to succeed. 

Here, we examine how big tech companies introduce scalable, automated controls and leverage modern foundations to transform data governance. 


Microsoft embraces and promotes a data culture mindset. Rather than viewing data governance as a blocking function, Microsoft sees data governance modernization as a way to democratize data responsibly to power the broader digital transformation. Microsoft is building its data governance controls into the centralized analytics infrastructure and analytics processes. “We are transforming how we provide data governance to introduce scalable, automated controls for data architecture, lifecycle health, and advancing its appropriate use,” Microsoft wrote in its blog post.

Its data governance strategy is developed with five goals in mind:

  • Reduce data duplication and sprawl by building a single Enterprise Data Lake (EDL) for high-quality, secure, and trusted data.
  • Connect data from disparate silos in a way that creates opportunities to use that data in ways not possible in a siloed approach.
  • Power responsible data democratization across Microsoft.
  • Drive efficiency gains in Microsoft’s processes to gather, manage, access, and use data.
  • Meet or exceed compliance and regulatory requirements without compromising Microsoft’s ability to create exceptional products.

Its approach to modern data governance has two key components. First, Microsoft embedded clear data standards and built them into its application development process. This move helps it automate and proactively manage data governance issues and policy compliance. Second, it leverages the EDL platform to centralize and systemically scan and monitor the data.
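As a minimal sketch of what building data standards into the development process could look like, the Python below checks a dataset's metadata against a few rules before the dataset is published. The metadata fields (`owner`, `classification`, `retention_days`) and the rules themselves are hypothetical illustrations, not Microsoft's actual standards.

```python
from dataclasses import dataclass

# Hypothetical standards; real ones would come from the governance team.
ALLOWED_CLASSIFICATIONS = {"public", "internal", "confidential"}
MAX_RETENTION_DAYS = 365 * 7


@dataclass
class DatasetMetadata:
    name: str
    owner: str
    classification: str
    retention_days: int


def check_standards(meta: DatasetMetadata) -> list[str]:
    """Return a list of violations; an empty list means the dataset passes."""
    violations = []
    if not meta.owner:
        violations.append("missing owner")
    if meta.classification not in ALLOWED_CLASSIFICATIONS:
        violations.append(f"unknown classification: {meta.classification}")
    if not 0 < meta.retention_days <= MAX_RETENTION_DAYS:
        violations.append(f"retention out of range: {meta.retention_days}")
    return violations


# Example: one compliant dataset and one that breaks every rule.
good = DatasetMetadata("sales", "team-a", "internal", 90)
bad = DatasetMetadata("logs", "", "secret", 0)
print(check_standards(good))
print(check_standards(bad))
```

A check like this could run as a step in a build or deployment pipeline, so that a dataset failing the standards is flagged before it reaches the shared platform rather than after.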

Microsoft’s data governance framework helps organizations better understand data protocols, align data strategy with business goals and outcomes, and secure data as it moves rapidly into the cloud.


To keep pace, organizations need to make privacy decisions in real time. That takes automation. IBM’s strategy, an end-to-end flow supported intensively by automated processes and artificial intelligence, has enabled it to scale rapidly to address new regulations. Essentially, privacy regulation requires organizations to safeguard the personally identifiable (PI) data they collect.

IBM, which has robust data governance models and is one of the biggest vendors of data governance solutions, infuses AI into every business process. In addition, it has a central data & AI platform across the company. It started leveraging that platform for privacy, building a governance framework to deliver actionable information in real time while ensuring regulatory compliance.

As a quality control discipline that brings rigor to managing, using, improving, and protecting organizational data, the IBM data governance model can significantly improve the quality and integrity of the company’s data through inter-organizational collaboration and policy-making.

According to the IBM model, the core disciplines are Data Quality Management, Information Security and Privacy, and Information Life-Cycle Management; the supporting disciplines are Data Architecture, Audit Information and Logging, and Classification and Metadata. The key enablers are organizational structure, awareness, policy, and stewardship.

The other critical technical aspect that IBM prioritizes is its hybrid cloud framework.

A hybrid cloud environment gives increased flexibility concerning regulatory compliance through a mix of predefined policies and run-time automation coupled with AI.


At a privacy conference, Apple CEO Tim Cook said that achieving great data governance standards is “not only a possibility, it is also a responsibility… Technology’s potential is, and always must be, rooted in people’s faith.”

After being implicated in several data privacy-related scandals, from spying on customers via the Siri app to massive data breaches, Apple reframed its approach to data privacy and made data privacy and trust a key selling point. 

Apple has been limiting how much data apps can collect and use for several years. First, Apple allowed its users to turn off location tracking; then, it mandated that iOS app providers spell out what data they collect through “nutrition labels” in the App Store. In 2021, it took things a step further by rolling out its App Tracking Transparency (ATT) feature, which requires apps to request permission from users before tracking them across other apps and websites.

Apple has a cross-functional approach to privacy governance, covering all company areas and including customer and employee data. The Legal Team has a Senior Director for Privacy and Law Enforcement Compliance. It also has a Privacy Engineering Team that partners with the Privacy Legal Team and dedicated Product Counsel to design products from the ground up to protect customer privacy.

Apple also has a Privacy Steering Committee that sets privacy standards for teams across Apple and acts as an escalation point for privacy compliance issues that need a decision or further escalation. It also oversees instances where data for which Apple is responsible is managed or hosted by a third party on Apple’s behalf and reviews those third parties through audits and documentation.

Further, employees with access to Apple customer data and personal information must undergo an additional Privacy and Security Training course bi-annually or in response to updated laws such as the GDPR. 


Oracle states that data governance “does not come together all at once,” and an iterative approach is needed. Oracle’s data quality solutions are based on optimizing and leveraging information as an enterprise asset. The computing giant’s Enterprise Data Governance solution helps identify, secure, manage and even discover sensitive data in the database. 

Key data governance capabilities are enabled by Oracle Enterprise Metadata Manager (OEMM) and Oracle Enterprise Data Quality (EDQ).

From visualizations to sensitive database discovery results to automatic metadata discovery jobs, Oracle’s data governance functions improve the quality, accessibility, and security of the core enterprise asset. The company’s data governance policy covers establishing enterprise data strategies, identifying the right stakeholders, assigning accountabilities, and reporting the status of data-focused initiatives.

The tech behemoth is now at the optimized level, where data governance is core to business processes and projects. Decisions are informed by data that provide quantifiable benefit/cost/risk analysis, and processes and policies are firmly established, adopted, and continually revised to reflect business goals and objectives.

Oracle pushes for adopting an ongoing program and a continuous improvement process. OEMM harvests metadata from Data Marts and Data Warehouses, Extract Transform Load, Data Integration, Business Intelligence, and Big Data/Hadoop tools. This allows easy high-level visualization in metadata analysis and fast, straightforward data flow and lineage analysis.

To help prevent unintended data use within the organization, Oracle has integrated Data Governance and SOA Governance or Application Services Governance within the Oracle API Platform Cloud Service. Process owners provide the subject matter expertise required to understand the meaning of data within the context of their processes. In contrast, data owners bring an understanding of the processes and metrics using their data sets.


Google indexes the internet, and that means collecting huge amounts of data. Google has published a lot of research on data governance in the academic community, including the Goods whitepaper, which describes Google’s internal data catalog (made available to the world as GCP Data Catalog), and the whitepaper that describes BigQuery, a commercial product that allows organizations to do big data analytics.

Because Google aggregates a lot of data, it follows core privacy principles: don’t collect what you don’t need, and eliminate personal data that is irrelevant.
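The "don’t collect what you don’t need" principle can be illustrated with a minimal Python sketch that keeps only an allowlisted set of fields from an incoming record. The field names and the allowlist are hypothetical and are not drawn from any Google system.

```python
# Hypothetical allowlist: the only fields this (imaginary) feature needs.
REQUIRED_FIELDS = {"user_id", "country", "signup_date"}


def minimize(record: dict) -> dict:
    """Keep only the fields on the allowlist and drop everything else."""
    return {key: value for key, value in record.items() if key in REQUIRED_FIELDS}


raw = {
    "user_id": 42,
    "country": "DE",
    "signup_date": "2024-05-26",
    "phone": "555-0100",
    "birthday": "1990-01-01",
}
print(minimize(raw))
```

Using an allowlist rather than a blocklist means that any new field is dropped by default until someone decides it is needed.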

Google’s core competency in data privacy and data governance is expressed in the tools it builds and brings to the public in Google Cloud Platform (GCP).

As an entity that makes money from Android, YouTube, ads, and GCP, Google faces challenges: it is heavily scrutinized and must adhere to the regulations of every country in which it operates. However, Google produces products that any enterprise could use and brings those lessons into tools such as BigQuery, Data Catalog, and the other big data capabilities it provides to the public.

Data Catalog is integrated into GCP so that as soon as you create a new data set, it appears in Data Catalog without any interaction or registration. That allows setting up alerts and monitoring all new data additions without building complicated machine learning modules and utilities to detect and classify that data.
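To illustrate the idea of alerting on new data additions, here is a minimal Python sketch that compares two snapshots of catalog entry names and reports what is new. It does not call the Data Catalog API; the snapshots, dataset names, and function names are all hypothetical placeholders for whatever listing a real catalog would return.

```python
def new_entries(previous: set[str], current: set[str]) -> list[str]:
    """Return entries present in the current snapshot but not the previous one."""
    return sorted(current - previous)


def build_alerts(previous: set[str], current: set[str]) -> list[str]:
    """Turn each newly seen entry into a human-readable alert message."""
    return [f"New dataset registered: {name}" for name in new_entries(previous, current)]


# Hypothetical snapshots of catalog entry names taken a day apart.
yesterday = {"sales.orders", "sales.customers"}
today = {"sales.orders", "sales.customers", "marketing.leads"}

for alert in build_alerts(yesterday, today):
    print(alert)
```

Because the catalog already knows about every data set, monitoring reduces to a set difference over its listings rather than a separate discovery system.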

Learnings: Things to consider when planning your data governance strategy

  • Build standards into your existing process and implement them as engineering solutions. By approaching data governance during the design phase of the larger Enterprise Data strategy, you will be able to institutionalize “governance by design” into the engineering DNA and apply it to data at every touchpoint.
  • Consider implementing a modern data foundation with integrated toolsets.
  • People and processes are just as important as tools and infrastructure.

Data privacy is evolving from a regulatory/compliance issue into a strategic one. Establishing a robust but adaptable data governance practice that positions data privacy as an asset will likely elevate your data strategy while paving the way for future ethical data monetization efforts and AI development. Data privacy and innovation are not necessarily at odds. Instead, when taken together, they can serve as guideposts, lighting the way toward future growth and technological advancement.
