Why Does Clean Data Matter?
5 Key Reasons

We’ve all heard the familiar saying, “Garbage in, garbage out.” In construction, we could rephrase it as, “Bad input, bad outcomes.” While there’s no shortage of data being produced at any given moment during a capital project, what matters even more is its quality.

But for all the discussions around the importance and usefulness of construction data, those things won’t matter if the data itself isn’t “clean” from the beginning.

This means data must be clean at the point of entry: not only at the start of the project, but every time it is keyed in or imported into a software program throughout construction.

So why, exactly, does having clean data matter? The five key reasons outlined here go far beyond the expected administrative benefits of reducing the need for labor-intensive data cleaning.


Dirty data is still more prevalent and costly than we may realize

Despite the industry increasingly embracing construction project data, not all data is usable. And therein lies the rub. To be considered clean (and therefore usable), data must be correctly input, accurate, current, accounted for, error-free, structurally consistent and complete. Otherwise, it’s considered dirty or bad data.

The potential consequences of using dirty data are very real: misinformed decisions, delays, compromised structural and personal safety, and rework, for example. And, of course, there are the associated incurred costs.

So how does this all translate monetarily? One study estimates that bad data cost the global construction sector more than $1.8 trillion in 2020, contributing to the $625 billion being lost to rework.[1] Those kinds of figures sound almost unreal.

It’s sobering to realize that many of the astronomical losses due to bad data are avoidable. And yet, bad data is more widespread than you might think. Given how substantial and lengthy capital projects can be, there are plenty of opportunities for it to creep in.

Consider the different kinds of data a project produces and relies on — timesheets, change orders, schedules, material price fluctuations, inventory, vendor data, safety data, test and inspection feedback, daily reports, etc. They can be found in paper documentation in a jobsite trailer or back-office file cabinet, computer spreadsheets, email inboxes, computer hard drives and smartphones.

With so much varied data being produced and used at any given moment, it may be impossible to know what data is duplicated, inaccurate or outdated. It’s a conundrum that can be further exacerbated as more disconnected point solutions are added to a tech stack; siloed data could be deemed questionable at best, leaving stakeholders wondering what’s accurate and what’s not.
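To make the idea concrete, here is a minimal sketch of how a few of the criteria named earlier in this section (complete, structurally consistent, current, not duplicated) could be checked automatically at the moment data is entered. It is an illustration only: the timesheet field names, the 30-day freshness window and the `find_issues` and `audit` helpers are all assumptions invented for this example, not part of any particular product.

```python
from datetime import date, timedelta

# Hypothetical timesheet fields, invented for this illustration.
REQUIRED_FIELDS = ("worker_id", "work_date", "hours", "cost_code")


def find_issues(record, today, max_age_days=30):
    """Return a list of data-quality problems found in one record."""
    issues = []

    # Complete: every required field is present and non-empty.
    missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if missing:
        issues.append("incomplete: missing " + ", ".join(missing))

    # Structurally consistent and plausible: hours is a number from 0 to 24.
    hours = record.get("hours")
    if hours not in (None, "") and not isinstance(hours, (int, float)):
        issues.append("inconsistent: hours is not a number")
    elif isinstance(hours, (int, float)) and not 0 < hours <= 24:
        issues.append("inaccurate: hours outside 0-24")

    # Current: the entry is a real date and not older than the allowed window.
    work_date = record.get("work_date")
    if work_date not in (None, "") and not isinstance(work_date, date):
        issues.append("inconsistent: work_date is not a date")
    elif isinstance(work_date, date) and today - work_date > timedelta(days=max_age_days):
        issues.append("outdated: older than %d days" % max_age_days)

    return issues


def audit(records, today):
    """Split records into clean and flagged, marking repeated entries."""
    seen = set()
    clean, dirty = [], []
    for record in records:
        issues = find_issues(record, today)
        # Assumed duplicate rule: same worker, day and cost code.
        key = (record.get("worker_id"), record.get("work_date"), record.get("cost_code"))
        if key in seen:
            issues.append("duplicate entry")
        seen.add(key)
        if issues:
            dirty.append((record, issues))
        else:
            clean.append(record)
    return clean, dirty


# Made-up sample data: one good entry, one duplicate, one malformed entry.
today = date(2023, 6, 30)
records = [
    {"worker_id": "W-101", "work_date": date(2023, 6, 29), "hours": 8, "cost_code": "03-300"},
    {"worker_id": "W-101", "work_date": date(2023, 6, 29), "hours": 8, "cost_code": "03-300"},
    {"worker_id": "W-102", "work_date": date(2023, 6, 29), "hours": "eight", "cost_code": ""},
]
clean, dirty = audit(records, today)
print(len(clean), "clean;", len(dirty), "flagged")
```

With the sample entries above, one record passes and two are flagged. The rules themselves would differ from company to company; what the sketch shows is that checks like these run when data is keyed in or imported, rather than during cleanup later.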


Data collection, analytics and insights are increasingly essential to growth

According to the InEight Global Capital Projects Outlook published in June 2023, 49% of owners and 47% of contractors who took part in the report’s survey ranked “data collection, analytics and insights” among the top five opportunities for growth in the next 12 months. On the flip side, and related to the point above, having poor data collection, analytics and insights is cited by 41% of respondents as being a top risk, up 7% over last year.

Clean data supports better analytics, providing more valuable insights into project performance and where to address inefficiencies. In fact, the accuracy and reliability of such data lessen the likelihood of mistakes and biases (including decisions inspired by bad data) that could lead to less-than-optimal project outcomes.

So it should come as no surprise that data from past projects, industry benchmarks and real-time inputs continues to show promise as a driver behind the increase in capital projects achieving cost and schedule outcomes, improving project certainty.


Data-fed industry and internal benchmarks are needed to achieve project predictability

One of the best ways to predict the future of capital projects is to look at what happened in the past. And when it comes to winning and better managing projects, contractors achieve this with realism built into their bids.

Tech-savvy project owners are appreciating this more surprise-free approach and increasingly expecting it. That realism must come from actual data. The cleaner it is, the more realistic the bids are and the more predictive the project outcomes can be.

When sourced from clean, real-time data, performance metrics from similar projects or industry benchmarks can give contractors and owners a real-world perspective on what to expect. How much did a task cost? How long did it take? What risk factors played out, and how did schedule or cost metrics respond to contingency plans? What mistakes were made, and how well did the fixes work? Were there surprises to account for in cost or schedule performance for individual tasks and the overall project?

Conversely, relying on flawed data gives no basis for improvement and is a fast track to off-the-mark outcomes.


Clean data is a necessity for effective facilities management at turnover

Managing a built asset takes a lot more than it did years ago. Completed structures now incorporate designs, sustainable materials and digital technologies that are more sophisticated than ever, so monitoring and operating them requires more and better data.

That data should begin accumulating at the project’s start (not as the completion date approaches), telling a comprehensive story of the evolution of the build and all its facets. Delivering clean data at turnover — from performance data to digital twins — assures owners their requirements were met and that any modifications or repairs during construction steered the project back toward meeting those requirements. Just as importantly, all that accurate, reliable data serves as the basis for better-informed decision making and, therefore, more efficient asset management. Facilities teams can then continue adding to it in real time, creating an ongoing asset management narrative, so the built asset enjoys a long, usable life.


There’s a very real potential for undermining your bottom line without it

What is the financial or reputational damage of using dirty data?

This goes beyond the global cost mentioned earlier. If contractors and owners calculated the losses their own companies incur from using bad data, the results would not bode well for contractors. Those losses compromise their bottom line and their ability to secure future work while weakening owner confidence and trust.

This puts a sharper focus on the benefits of, and need for, clean, actionable data. Plenty of construction companies are forging ahead with connected, cloud-based technology that makes it easier to gather and process clean, streamlined data, which helps them achieve the project outcomes that make owners happy. After all, clean data is trusted data.


Keeping it clean

Despite its slow adoption rate in the industry, construction technology is alleviating and fixing many data quality issues plaguing projects. Software programs designed to collect, store, automatically update, calculate and analyze your data in real time can help deliver the clean data you rely on to manage projects and business growth.

But it’s not about loading up on several programs and thinking you’re covered. You’ll get the full value and benefit from all that clean data if those programs are connected or, even better, if they are part of a single integrated platform.

Talking through your specific situation and goals can steer the technology process in the right direction. We have brief conversations with other companies in the industry that are curious about the same things you are, including how to get and maintain clean data to better support their projects. Let us know if you’d like to schedule a time to talk with us.




[1] Harnessing The Data Advantage In Engineering And Construction, FMI, 2021.
