Exploring the World of Data Governance

This article was written as an expansion of our white paper “Choosing Sustainability Management Software for your Business” published in July 2011.  Enjoy:

Now that you’ve captured all this sustainability-related data, you want to measure and report on it.  To ensure that you measure and report effectively, you’ll want to consider putting some data governance policies in place.  The goal isn’t to introduce bureaucracy for the sake of shuffling papers from one reporting cycle to the next.  It’s to protect the significant investment in time and money that you’ve made to acquire and implement your sustainability software system, and to maintain the integrity of all the data that you’ve captured.  Your data is an important company asset and you should treat it as such.

The field of data governance and data management covers a wide swath of the IT world – more than we could cover in a whole collection of white papers and blog articles.  In this article, we’ll use “data governance” as an umbrella term for several related disciplines.  Our intent is to introduce you to these disciplines (assuming you aren’t already an expert) so you can better figure out which policies, processes, and procedures you may want to consider implementing.

Data Governance...

...is the overarching framework to ensure that you define, collect, transfer, report, calculate, and otherwise use your data in a consistent manner across your entire business and user base.  It ensures that different people and groups use common data definitions, and it designates a decision maker – sometimes called a data owner or data steward – for each of your data elements.  Part of the data governance process is inherently collaborative – you need to get everyone on board with the direction you are going and get their input into the results.  Once the framework is in place and agreed to, however, you need to give your data governance leadership the authority to enforce the standard.  Without the power of enforcement, it will be difficult to maintain your data’s quality and the overall integrity of your reporting solution.
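
As a rough illustration of the "decision maker for each data element" idea, here is a minimal Python sketch that flags data elements with no assigned steward.  The element names and team names are hypothetical and are not taken from any particular software system.

```python
# Hypothetical register of data elements and the team that acts as steward.
# None means nobody has been designated yet.
DATA_STEWARDS = {
    "natural_gas_cost": "facilities",
    "electricity_kwh": "facilities",
    "ghg_emissions": None,
}


def elements_without_steward(stewards: dict) -> list:
    """Return the names of data elements that have no steward, sorted by name."""
    return sorted(name for name, owner in stewards.items() if not owner)


unowned = elements_without_steward(DATA_STEWARDS)
print(unowned)
```

A check like this could run before each reporting cycle so that gaps in ownership surface early instead of during an audit.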

Data Quality Management... 

...encompasses your efforts to make sure that you get good data into your system up front and that the results returned by your system are consistent and accurate.  (Keep the phrase “garbage in, garbage out” in mind.)  This may range from ensuring that you’ve got the proper checks and “edits” in place on the data entry screen (e.g. users select the state or country from a defined list instead of entering it in a free-form text field) to building logic that automatically corrects common mistakes or otherwise normalizes the data (e.g. determining that NY, N.Y., and New York are all equivalent).  A further focus of data quality management is ensuring that you have, and maintain, a high degree of data completeness, as you can make better decisions when you see the whole picture instead of just a piece of the puzzle.  Getting good data in doesn’t guarantee that your reports will be accurate, but it’s an important start.
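
To make the NY / N.Y. / New York example concrete, here is a minimal Python sketch of one way normalization logic might look.  The lookup table and the function name `normalize_state` are assumptions for illustration; a real system would cover every state or country it accepts.

```python
import re

# Hypothetical lookup table mapping cleaned-up entries to a canonical code.
CANONICAL_STATES = {
    "ny": "NY",
    "new york": "NY",
    "ks": "KS",
    "kansas": "KS",
}


def normalize_state(raw: str) -> str:
    """Map a free-form state entry to its canonical code, or raise ValueError."""
    # Drop periods, collapse runs of whitespace, trim, and lowercase,
    # so that "N.Y." and " new  york " both find a match.
    key = re.sub(r"\s+", " ", raw.replace(".", "")).strip().lower()
    if key not in CANONICAL_STATES:
        # Rejecting unknown values is the "check" half of data quality.
        raise ValueError(f"Unrecognized state: {raw!r}")
    return CANONICAL_STATES[key]
```

Raising an error on unknown input, instead of guessing, keeps bad values from slipping quietly into your reports.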

Data Definitions... 

...will help ensure that your inputs and outputs are understood properly by all parties.  These may include simple things like an agreed-upon definition of an acronym like “GHG” (greenhouse gas) or “CO2e” (CO2 equivalent, which includes CO2, methane, etc. – you need to spell out the etc.!).  They may also cover the units and calculations behind the metrics that you report – e.g. is it tonnes of CO2e, pounds of CO2e, or kilograms of CO2e? – as you’ll want the folks at each of your locations to be consistent.  Getting your data defined will help everyone responsible for creating and reading your reports get on the same page instead of jumping to their own conclusions.
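
One way to keep units consistent across locations is to convert everything to a single agreed unit at the point of entry.  The Python sketch below assumes tonnes of CO2e as the reporting unit; the unit labels and the function name `to_tonnes_co2e` are hypothetical.

```python
# Kilograms per unit. One pound is exactly 0.45359237 kg,
# and one tonne (metric ton) is 1000 kg.
KG_PER_UNIT = {
    "kg": 1.0,
    "lb": 0.45359237,
    "tonne": 1000.0,
}


def to_tonnes_co2e(amount: float, unit: str) -> float:
    """Convert a CO2e quantity in kg, lb, or tonne to tonnes."""
    if unit not in KG_PER_UNIT:
        raise ValueError(f"Unknown unit: {unit!r}")
    return amount * KG_PER_UNIT[unit] / 1000.0


# Two locations reporting in different units end up on the same scale.
total = to_tonnes_co2e(1500, "kg") + to_tonnes_co2e(500, "lb")
```

Whichever unit you choose matters less than writing the choice down and applying it everywhere.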

Meta-Data... 

...is “data about data”.  This may sound rather odd when you first hear it.  What it really means is that you define, track, and maintain certain attributes that tell you more about the data in your system.  Meta-data will tell you the size of the data field, the format of the data, the age of the data, the origin of the data, etc.  For instance, you might determine from meta-data that a certain utility bill was entered by Jim Jones on May 9th, 2011 from his work laptop at 10AM Central time and included only the US$ cost of natural gas supplied by Kansas Gas Service for the period of April 1, 2011 through April 30, 2011.  Your summary report may only show that you used $127.52 worth of natural gas in 2Q11, but if you wanted to figure out where that number came from, cross-check it against an e-bill, or otherwise audit what your team is entering, the meta-data would come in extremely handy.
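
The utility bill example could be stored along the lines of the Python sketch below, where the meta-data fields sit next to the data itself.  The class name, field names, and the bill amount are assumptions for illustration (the article gives only a quarterly total, not the amount of this one bill).

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone

# Fixed UTC-5 offset standing in for US Central (daylight) time in May.
CENTRAL = timezone(timedelta(hours=-5))


@dataclass(frozen=True)
class UtilityBillEntry:
    # The data itself
    amount_usd: float
    commodity: str
    supplier: str
    period_start: date
    period_end: date
    # Meta-data: who entered it, when, and from where
    entered_by: str
    entered_at: datetime
    entered_from: str


def describe_origin(entry: UtilityBillEntry) -> str:
    """Summarize where an entry came from, for audits and cross-checks."""
    return (
        f"{entry.commodity} bill from {entry.supplier} "
        f"({entry.period_start} to {entry.period_end}) entered by {entry.entered_by} "
        f"at {entry.entered_at:%Y-%m-%d %H:%M} from {entry.entered_from}"
    )


entry = UtilityBillEntry(
    amount_usd=40.00,  # hypothetical amount for a single bill
    commodity="natural gas",
    supplier="Kansas Gas Service",
    period_start=date(2011, 4, 1),
    period_end=date(2011, 4, 30),
    entered_by="Jim Jones",
    entered_at=datetime(2011, 5, 9, 10, 0, tzinfo=CENTRAL),
    entered_from="work laptop",
)
print(describe_origin(entry))
```

Keeping the meta-data on the same record means an auditor can trace a summary number back to the individual entries behind it.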

Data Retention... 

...deals with how long you expect or need to keep your data.  Do you need to keep it around for 2 years, 5 years, 10 years, or some other time horizon?  The longer you keep it, the more your data storage costs go up.  You might be able to get around that by moving older data to an archival storage mechanism (e.g. a DVD) that is cheaper than a hard drive, but that may only work if you can wait a week to retrieve it and load it back into your system.  Figuring out the retention time and the amount of time you are willing to wait to recover your data will go a long way toward helping you identify the right solution and figure out how much it’s going to cost.
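
Once you have settled on a retention period, a simple rule can decide which records are due for archiving.  The Python sketch below is one possible approach; the function name and the choice to measure retention in whole years are assumptions.

```python
from datetime import date


def is_past_retention(record_date: date, today: date, retention_years: int) -> bool:
    """Return True if the record is older than the retention period."""
    try:
        cutoff = today.replace(year=today.year - retention_years)
    except ValueError:
        # today is February 29 and the target year is not a leap year
        cutoff = today.replace(year=today.year - retention_years, day=28)
    return record_date < cutoff


# With a 5-year policy, a bill dated April 30, 2011 is due for archiving by 2017.
due = is_past_retention(date(2011, 4, 30), date(2017, 1, 1), retention_years=5)
```

Records flagged this way could be moved to cheaper archival storage instead of being deleted, depending on how quickly you need them back.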

Data Privacy... 

...addresses the question of who has permission to see various elements of your data.  For example, you may want to have your data auditor or your system administrator see that Jim Jones entered some data on May 9th.  However, that same level of detail would more likely be kept out of an aggregated report that you publish to your external customers.  You might also want to be able to restrict certain system users from accessing various personally identifiable data that might be stored in the system.  It may even be the case that certain information wouldn’t get stored at all depending on your Data Privacy guidelines.  Depending on your industry, there may be standards that you need to consider, such as PCI DSS (Payment Card Industry Data Security Standard) or, in the telecommunications industry, CPNI (Customer Proprietary Network Information).
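
The auditor versus external customer example can be sketched as a per-role list of visible fields.  The role names and field names in this Python sketch are hypothetical, and a real system would likely enforce this in its access-control layer instead of in report code.

```python
# Hypothetical mapping of roles to the fields each role may see.
VISIBLE_FIELDS = {
    "auditor": {"amount_usd", "commodity", "period", "entered_by", "entered_at"},
    "external": {"amount_usd", "commodity", "period"},
}


def redact(record: dict, role: str) -> dict:
    """Return a copy of the record with only the fields the role may see."""
    allowed = VISIBLE_FIELDS.get(role, set())  # unknown roles see nothing
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "amount_usd": 127.52,
    "commodity": "natural gas",
    "period": "2Q11",
    "entered_by": "Jim Jones",
    "entered_at": "2011-05-09",
}
public_view = redact(record, "external")
```

Defaulting unknown roles to an empty view is a deliberate choice here: it is safer to show too little than too much.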
