As a rule, data context is not formally documented and exists only as “tribal knowledge.” It is a technical debt that leads to quality issues as well as protracted implementation times for new data sources and consumers. Data lineage captures how data travels through diverse processes and facilitates backward error tracing to their sources. Data analysts use this capability to replay specific portions of the dataflow for debugging or regeneration. Missing data context and lineage account for a high percentage of difficulties encountered downstream.
Data governance covers all aspects of data quality and access control. It is a system of access rights and accountability for managing enterprise data assets. The BigRio data practice incorporates processes, roles, standards, and metrics to maximize data quality at the root source, safeguarding rules for duplication, completeness, freshness, etc., and ensures compliance and security. We recommend that senior management implements and sponsors change to pave the way for effective enterprise-wide data governance policies with appropriate delegation and accountability.