Data Management Concepts for Sustainability - Part 4

This article was written as an expansion of our white paper “Choosing Sustainability Management Software for your Business” published in July 2011.  If you’re looking for information on how to make your software selection, check out the full white paper.  If you just want to make sense of this particular topic, keep reading.  Whether you like this article or not, we want to hear from YOU so that we can continue to provide the best insight for YOU, our readers.

Our series on Sustainability Software continues with “Data Management Concepts for Sustainability”.  In this article (Part 4 of 4), we’ll complete the introduction and definition of key Data Management terms (read Part 3 here).  Our end goal with this series is to enable YOU, as the Business Leader, to feel more comfortable in a technical discussion about the various areas of Data Management, especially the care and feeding of Sustainability Software packages.  Being able to “talk the talk” is the best defense in the technology wilderness.  Just remember, at the root of any technical term is a common sense business notion, and staying grounded in this notion will help keep your conversations from going astray.

Data Integration

Data Integration is one of the most difficult activities covered in this series because it requires most of the other activities to work in concert.  For example, it is implicit in any Data Movement between systems whose Data models differ.  Suppose we have data in our Accounting system that will be used in a cost calculation algorithm (method) in our Sustainability Software.  To use it, we need to copy the Accounting data, reshape it to conform to the load utilities for our package, and then proceed with the load.  This setup entails numerous subtleties, including cross-referencing the source data model in the Accounting system with the format expected by the import utility.  This is called Field Mapping, and it’s not just an easy matching exercise where you can get the first few right and guess the rest.  Examples will help us here.

  • Suppose the source holds shipment quantities and the target model is asking for unit prices and volumes.  We might need to deduce the carbon content per gallon from the available carbon content per fifty-five gallon barrel; in other words, divide by 55.
  • A more complex example involves converting from the English System to the Metric System (raise your hand if you can do this without a calculator).
  • Another example would be the rules for handling rounding: an error that is negligible on a single unit can become significant once it is multiplied across large quantities.
  • A final classic example is how to deal with Asian names (commonly listed with the surname first) being transferred into a system with a European paradigm (where the surname is listed last).
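To make these examples concrete, here is a minimal Field Mapping sketch in Python.  Everything in it is an assumption made for illustration: the field names (carbon_kg_per_barrel, barrels_shipped, contact_name, surname_first), the target format, and the choice of US gallons for the “English” side are hypothetical and do not come from any particular Accounting system or Sustainability Software package.  It simply shows the barrel-to-gallon division, a conversion to liters, rounding applied only at the end, and a name reordering.

```python
from decimal import Decimal, ROUND_HALF_UP

GALLONS_PER_BARREL = Decimal(55)                 # the fifty-five gallon barrel
LITERS_PER_US_GALLON = Decimal("3.785411784")    # assumes US gallons, not imperial
CENT = Decimal("0.01")


def map_shipment(source: dict) -> dict:
    """Reshape one hypothetical source record into a hypothetical import format."""
    # Per barrel -> per gallon (divide by 55) -> per liter.
    kg_per_gallon = Decimal(source["carbon_kg_per_barrel"]) / GALLONS_PER_BARREL
    kg_per_liter = kg_per_gallon / LITERS_PER_US_GALLON

    # Barrels shipped -> liters shipped.
    liters = (Decimal(source["barrels_shipped"])
              * GALLONS_PER_BARREL * LITERS_PER_US_GALLON)

    # Multiply the unrounded values; rounding happens once, on output.
    total_kg = kg_per_liter * liters

    # Naive name handling: assumes exactly two parts and a flag in the
    # source that says which part comes first. Real names need more care.
    first, second = source["contact_name"].split(" ", 1)
    if source["surname_first"]:
        family, given = first, second
    else:
        given, family = first, second

    return {
        "volume_liters": liters.quantize(CENT, rounding=ROUND_HALF_UP),
        "total_carbon_kg": total_kg.quantize(CENT, rounding=ROUND_HALF_UP),
        "given_name": given,
        "family_name": family,
    }


def total_with_early_rounding(source: dict) -> Decimal:
    """The pitfall: round the per-liter rate first, then multiply."""
    kg_per_liter = (Decimal(source["carbon_kg_per_barrel"])
                    / GALLONS_PER_BARREL / LITERS_PER_US_GALLON)
    rounded_rate = kg_per_liter.quantize(CENT, rounding=ROUND_HALF_UP)
    liters = (Decimal(source["barrels_shipped"])
              * GALLONS_PER_BARREL * LITERS_PER_US_GALLON)
    return (rounded_rate * liters).quantize(CENT, rounding=ROUND_HALF_UP)


sample = {
    "carbon_kg_per_barrel": "110",
    "barrels_shipped": "1000",
    "contact_name": "Tanaka Hiroshi",
    "surname_first": True,
}
mapped = map_shipment(sample)
print(mapped)
print("drift from early rounding:",
      total_with_early_rounding(sample) - mapped["total_carbon_kg"])
```

Running the sample shows why the rounding rules matter: rounding the per-liter rate to two decimals before multiplying shifts the total by several hundred kilograms on a shipment of this size, while rounding once at the end does not.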

Data Integration is expensive to build and more expensive to operate.  SaaS is a way to avoid the Integration Tax to the extent possible, since the integration has already been built into many of the downstream systems you’ll be using.

Data Mining

Data Mining is the last major topic to be introduced.  It also involves smatterings of the others, but at its core it has a uniquely ad hoc character.

Suppose we have a database that describes product production events in a manufacturing setting.  Suppose also that we wish to learn more about the root causes of some recurring problem that has escaped previous attempts to solve it, and we choose to “look at all the occurrences at once”.  Someone who is expert in the data itself, as well as in all the business processes it describes, could attempt to construct queries that reveal common conditions that led to the problem occurrences.  For example, the analyst might notice that the occurrences all tend to fall in the first half hour of their respective production runs.  Further drill-down might reveal that they all involve late shipments from the same raw material vendor, while production runs with timely shipments from the same vendor seem to go without mishap.  This would lead us to suspect potential spoilage or lack of maturity in the late-arriving material.  Data Mining is a spiral learning discipline.  One spirals in to a common cause, or spirals out to learn the nature of the Cosmos.
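The drill-down described above can be sketched in a few lines of Python.  The records, field names (run, vendor, late_shipment, defect_minute) and the helper defect_rate_by are all made up for illustration; a real investigation would run similar queries against the actual production database, and a pattern in a handful of rows is a lead to follow, not proof of a cause.

```python
from collections import defaultdict

# Hypothetical production runs. defect_minute is how many minutes into
# the run the problem appeared, or None if the run had no problem.
events = [
    {"run": 1, "vendor": "A", "late_shipment": True,  "defect_minute": 12},
    {"run": 2, "vendor": "A", "late_shipment": False, "defect_minute": None},
    {"run": 3, "vendor": "B", "late_shipment": False, "defect_minute": None},
    {"run": 4, "vendor": "A", "late_shipment": True,  "defect_minute": 25},
    {"run": 5, "vendor": "B", "late_shipment": True,  "defect_minute": None},
    {"run": 6, "vendor": "A", "late_shipment": False, "defect_minute": None},
    {"run": 7, "vendor": "A", "late_shipment": True,  "defect_minute": 8},
]


def defect_rate_by(rows, *keys):
    """Return {group: (runs_with_problem, total_runs)} for the given keys."""
    counts = defaultdict(lambda: [0, 0])
    for row in rows:
        group = tuple(row[k] for k in keys)
        counts[group][1] += 1
        if row["defect_minute"] is not None:
            counts[group][0] += 1
    return {group: tuple(c) for group, c in counts.items()}


# First pass: when in the run do the problems occur?
defects = [e for e in events if e["defect_minute"] is not None]
early = [e for e in defects if e["defect_minute"] <= 30]
print(f"{len(early)} of {len(defects)} problems fall in the first half hour")

# Spiral in: split by vendor, then by vendor and shipment timeliness.
by_vendor = defect_rate_by(events, "vendor")
by_vendor_and_lateness = defect_rate_by(events, "vendor", "late_shipment")
for group, (bad, total) in sorted(by_vendor_and_lateness.items()):
    print(group, f"{bad} of {total} runs had the problem")
```

Each query narrows the question raised by the previous one, which is the “spiraling in” the text describes: first timing, then vendor, then vendor combined with shipment timeliness.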

Conclusion

We hope that as a result of this information, albeit somewhat high-level, you’ll find a greater degree of ease in approaching Data Management problems and their solutions with any Sustainability Software package that you may be considering.  As with the rest of this series, our goal is to guide you through each of these complex topics toward a solution that gives you robust and accurate data, along with data management practices that will last for years to come.

Now that you’ve read this article, tell us what you think!  And be sure to check out the full white paper.