Preparing for AI: Check Whether Your Manufacturing Data Is Ready

Preparing for AI does not start with a model, a dashboard, or a vendor demo. It starts with a simple question: do your data sources, names, definitions, and responsibilities describe production the way it really works?

AI, MES, and reporting will not fix messy data. They will show it faster.

If the same machine has three names in ERP, Excel, and on the shop floor, reports will start to disagree. If operators enter downtime as “failure,” “stop,” “other,” or leave the field blank, AI will not know whether the issue was a technical failure, missing material, or poor data entry. If production orders are not linked to machine data, MES will collect numbers without context.

That is why preparing for AI should not begin with tool selection. The first step is cleaning up your company’s data: sources, definitions, identifiers, ownership, and quality rules.

Gartner reports that poor data quality costs organizations at least $12.9 million per year on average. In a July 29, 2024 release, Gartner projected that by the end of 2025 at least 30% of GenAI projects would be abandoned after proof of concept, for reasons that include poor data quality, rising costs, and unclear business value. IDC, as cited in Oracle material, estimates that about 80% of the time spent working with data can go to finding, preparing, and protecting it, while only 20% goes to analysis.

How preparing for AI should start

Start by deciding which decisions your data should support.

Then work through these eight steps:

1. Map your data sources: ERP, SCADA, PLC, Excel, paper, operator forms, and quality systems.
2. Create a shared data dictionary: downtime, failure, changeover, shortage, scrap, and cycle.
3. Clean up identifiers for machines, products, orders, batches, and workstations.
4. Check data quality: missing fields, duplicates, delays, and incorrect values.
5. Define the source of truth for each type of data.
6. Set the data scope for MES, AI, or reporting.
7. Assign data owners.
8. Check whether the data fits real shop floor work.

Only after that should you choose reports, AI models, operator screens, or integrations.

Preparing for AI - 8 steps to the right tool (a photo of a production robot)

Preparing for AI starts with data

Too many AI projects start with tools and models. But without clean, consistent data, even a strong model will produce results that are hard to defend in production, quality, or maintenance.

Preparing for AI means checking whether you have historical data, shared definitions, consistent identifiers, and clear sources of truth. Only then can you decide whether AI has enough to learn from and whether its recommendations will make sense on the shop floor.

What data cleanup means in a manufacturing company

Data cleanup means deciding how your plant describes production with numbers. In manufacturing, that includes:

one definition of downtime, failure, shortage, scrap, and changeover,
shared names for machines, lines, products, operations, and orders,
machine data connected with production context,
clear rules for changing data,
error detection,
consistent reports for operators, team leaders, managers, and executives.

Without that, an OEE report may show something different from a shift manager’s report. MES may work, but some data will still need manual correction. AI may find patterns that look useful but do not match what is happening on the shop floor.
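
To make "machine data connected with production context" concrete, here is a minimal Python sketch that attaches each machine cycle timestamp to the production order running at that moment. The order numbers, field names, and timestamps are invented for illustration, and the half-open time window is an assumption, not a rule from any particular MES.

```python
from datetime import datetime

# Hypothetical order intervals, e.g. exported from ERP or MES.
orders = [
    {"order": "ORD-1", "product": "A",
     "start": datetime(2026, 1, 5, 6, 0), "end": datetime(2026, 1, 5, 10, 0)},
    {"order": "ORD-2", "product": "B",
     "start": datetime(2026, 1, 5, 10, 30), "end": datetime(2026, 1, 5, 14, 0)},
]

# Hypothetical cycle timestamps, e.g. read from a PLC counter.
cycles = [
    datetime(2026, 1, 5, 6, 15),
    datetime(2026, 1, 5, 10, 10),
    datetime(2026, 1, 5, 11, 0),
]


def order_for(timestamp, orders):
    """Return the order active at `timestamp`, or None if no order covers it.

    Assumption: an order covers start <= t < end, so a cycle on the exact
    boundary between two orders belongs to the later one.
    """
    for order in orders:
        if order["start"] <= timestamp < order["end"]:
            return order
    return None


linked = [(ts, order_for(ts, orders)) for ts in cycles]

# Cycles with no matching order are numbers without context.
orphans = [ts for ts, order in linked if order is None]
```

In this sample, the cycle at 10:10 falls between the two orders and ends up in `orphans`. A list like that is worth reviewing before any OEE or AI work starts.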

Start with decisions, not tables

A common mistake is starting with the data you already have. A better starting point is the decision you want to make faster and with more confidence.

Decision	Data you need
Where do we lose time during the shift?	Downtime, cycle times, stop reasons
Which products create the most defects?	Product, batch, process parameters, quality results
Is the production plan realistic?	Machine capacity, orders, changeovers
Can AI predict a quality issue?	OK/NOK history, process parameters, batches
Can MES replace manual reports?	Operation statuses, machine data, standards

If nobody knows which decision a data field supports, teams quickly start collecting data “just in case.” That raises project cost and makes analysis harder.

Data map, dictionary, and identifiers

Manufacturing data is often spread across many places. Some data sits in ERP or SCADA. Some comes from machine panels, Excel files, or paper forms.

A data source map should show:

where the data comes from,
who enters it or generates it,
how often it is updated,
who uses it,
where errors appear.

Based on that map, you can build a data dictionary.

Without a dictionary, teams quickly start arguing about basic terms. Should changeover count as planned downtime? Should stops shorter than two minutes be recorded? Should waiting for material be charged to production, logistics, or planning?

Area	What needs to be defined
Production	Order, operation, cycle, standard, changeover
Maintenance	Failure, intervention, planned stop
Quality	OK, NOK, defect, scrap
Logistics	Batch, location, material usage
People	Workstation, role, crew, permissions
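
A dictionary entry becomes useful once it is written down as a rule that software can apply the same way every time. The sketch below shows one possible way to do that in Python. The two-minute threshold and the set of planned reasons are example answers to the questions above, not recommendations. Your plant has to agree on its own values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """One data dictionary entry (field names are illustrative)."""
    name: str
    definition: str
    owner: str


DOWNTIME = Term(
    name="downtime",
    definition="Any stop of 120 seconds or longer while an order is active.",
    owner="Production",
)

# Assumptions for this example only:
MIN_RECORDED_STOP_S = 120                      # shorter stops count as micro-stops
PLANNED_REASONS = {"changeover", "planned maintenance"}


def classify_stop(reason: str, duration_s: int) -> str:
    """Apply the agreed definitions to one stop event."""
    if duration_s < MIN_RECORDED_STOP_S:
        return "micro-stop"
    if reason in PLANNED_REASONS:
        return "planned downtime"
    return "unplanned downtime"
```

With these example rules, `classify_stop("changeover", 600)` returns "planned downtime" and `classify_stop("jam", 90)` returns "micro-stop". Change the constants and every report changes with them, which is the point of having one definition.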

Identifiers matter just as much.

The same machine may appear as LINE_01, L1_PACK, Packer 1, and “old line.” To an employee, this is one asset. To a system, it may look like four different objects. In many cases, a simple mapping table is enough to fix the problem.
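
A mapping table of this kind can be very small. Here is a minimal Python sketch using the four names from the example above. The choice of LINE_01 as the canonical identifier and the normalization rules (ignore case and extra spaces) are assumptions for illustration.

```python
from typing import Optional

# Every known spelling maps to one canonical machine ID.
# Assumption: LINE_01 is the identifier the plant agreed on.
MACHINE_ALIASES = {
    "LINE_01": "LINE_01",
    "L1_PACK": "LINE_01",
    "Packer 1": "LINE_01",
    "old line": "LINE_01",
}


def _key(name: str) -> str:
    """Normalize case and whitespace so 'packer 1 ' matches 'Packer 1'."""
    return " ".join(name.split()).casefold()


_LOOKUP = {_key(alias): canonical for alias, canonical in MACHINE_ALIASES.items()}


def canonical_machine_id(raw_name: str) -> Optional[str]:
    """Return the canonical ID, or None so unknown names are reviewed, not guessed."""
    return _LOOKUP.get(_key(raw_name))


records = ["L1_PACK", "packer 1 ", "Old Line", "LINE_02"]
resolved = [canonical_machine_id(r) for r in records]
unmapped = [r for r, c in zip(records, resolved) if c is None]
```

Returning None for an unknown name is a deliberate choice here: a new spelling should land on someone's review list rather than silently create a fifth machine.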

How to check data quality

You do not need to audit the whole company at once. Start with one line, one product family, or one process that creates a lot of losses.

Test	Question	Example issue
Completeness	Are required fields filled in?	35% of downtime events have no reason
Consistency	Do systems show the same thing?	ERP shows a different order than the operator panel
Timeliness	Does data arrive on time?	Quality report is available the next day
Uniqueness	Are records duplicated?	The same batch is entered twice
Accuracy	Are values realistic?	Cycle time equals 0 seconds or 999 minutes

A short test like this will quickly show where the project may fail.
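
Three of the five tests in the table (completeness, uniqueness, accuracy) can be run on a simple export with a few lines of Python. The sketch below uses invented downtime events, and the plausibility limit of 480 minutes is an assumption based on an 8-hour shift. Consistency and timeliness need data from more than one system, so they are left out here.

```python
from collections import Counter

# Hypothetical downtime events exported from one line.
events = [
    {"event_id": 1, "machine": "LINE_01", "reason": "changeover", "minutes": 12},
    {"event_id": 2, "machine": "LINE_01", "reason": "", "minutes": 4},
    {"event_id": 3, "machine": "LINE_01", "reason": None, "minutes": 999},
    {"event_id": 3, "machine": "LINE_01", "reason": None, "minutes": 999},
    {"event_id": 4, "machine": "LINE_01", "reason": "no material", "minutes": 0},
]

MAX_PLAUSIBLE_MINUTES = 480  # assumption: no single stop is longer than one shift


def completeness(rows, field):
    """Share of rows where `field` is filled in (not None and not empty)."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)


def duplicate_ids(rows, key):
    """Values of `key` that appear on more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


def implausible(rows, field, low, high):
    """Rows whose value is not inside the range low < value <= high."""
    return [row for row in rows if not (low < row[field] <= high)]


reason_completeness = completeness(events, "reason")
repeated_events = duplicate_ids(events, "event_id")
bad_durations = implausible(events, "minutes", 0, MAX_PLAUSIBLE_MINUTES)
```

On this sample, only two of five events have a reason, event 3 is entered twice, and three rows have a duration of 0 or 999 minutes.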

Preparing for AI and MES data

An MES does not need every piece of data on day one. It needs the data that best describes how production runs.

For the first stage, prepare:

plant structure,
product and operation lists,
production orders,
operation statuses,
standards and cycle times,
changeover types,
downtime dictionary,
OK/NOK quality data,
material batches,
users and roles.

You also need to define the source of truth.

ERP often works well for orders, item indexes, and production plans. SCADA and PLC systems provide machine signals. MES connects shop floor data with context: product, operation, order, operator, quality, and status.

Data	Source of truth
Production plan	ERP or APS
Number of cycles	Machine, PLC, SCADA
Downtime reason	MES, operator panel
Quality inspection result	MES or QMS
Operation status	MES
Product index	ERP
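
A source-of-truth table like the one above can also be kept as a small lookup that reports and integrations share. The Python sketch below is one possible shape. The key names are invented, and the order inside each tuple (which system wins when two trusted systems both report a value) is an assumption the table itself does not make.

```python
# Trusted systems per data type, highest priority first.
# Assumption: priority order is illustrative; agree on your own.
SOURCE_OF_TRUTH = {
    "production_plan": ("ERP", "APS"),
    "cycle_count": ("PLC", "SCADA"),
    "downtime_reason": ("MES",),
    "quality_result": ("MES", "QMS"),
    "operation_status": ("MES",),
    "product_index": ("ERP",),
}


def resolve(data_type, readings):
    """Pick the value from the highest-priority trusted system.

    `readings` maps system name -> reported value. Values from systems that
    are not listed for this data type (for example Excel) are ignored.
    """
    for system in SOURCE_OF_TRUTH[data_type]:
        if system in readings:
            return system, readings[system]
    raise LookupError(f"no trusted source reported {data_type!r}")


# Excel and SCADA disagree about the cycle count; SCADA is the trusted source.
system, value = resolve("cycle_count", {"Excel": 1180, "SCADA": 1204})
```

Raising an error when no trusted system reported a value is intentional in this sketch: a missing number is easier to fix than a number quietly taken from the wrong place.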

explitia offers the Production Portal as a platform for manufacturing companies that want to implement MES and tools for planning, monitoring, and improving production. It can collect information from machines, sensors, PLCs, ERP systems, and operator forms in one place.

Some context is known only by people: the reason for a stop, a material issue, an unusual changeover situation, or a quality decision. That is why operator panels and simple forms can be very useful.

Preparing for AI: what data does AI need?

AI needs historical data, well-described cases, and labels. If it should predict defects, it needs OK/NOK history. If it should predict failures, it needs well-described service events.

AI goal	Data you need
Predicting machine failure	Machine work history, alarms, maintenance interventions
Predicting defects	OK/NOK data, process parameters, material, batch
Supporting production planning	Orders, capacity, people availability, changeovers
Downtime analysis	Times, reasons, machine, product, shift
Operator recommendations	Current parameters, instructions, similar cases

Preparing for AI means removing guesswork from your data. AI learns from what was recorded, how it was described, and whether the same event means the same thing across systems.
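
Before any model is trained, it helps to count how many historical cases are usable at all. The Python sketch below joins OK/NOK inspection results with process parameters by batch and lists the cases that cannot be used. All batch numbers, field names, and values are invented, and "usable" here simply means "has a valid label and has parameters", which is an assumption for this example.

```python
# Hypothetical inspection results, one per batch.
inspections = [
    {"batch": "B1", "result": "OK"},
    {"batch": "B2", "result": "NOK"},
    {"batch": "B3", "result": "OK"},
    {"batch": "B4", "result": ""},
]

# Hypothetical process parameters recorded per batch.
parameters = {
    "B1": {"temp_c": 181.0},
    "B2": {"temp_c": 195.5},
    "B4": {"temp_c": 183.2},
}

VALID_LABELS = {"OK", "NOK"}


def training_rows(inspections, parameters):
    """Split history into usable labeled rows and rejected cases with a reason."""
    rows, rejected = [], []
    for inspection in inspections:
        batch = inspection["batch"]
        label = inspection["result"]
        params = parameters.get(batch)
        if label not in VALID_LABELS:
            rejected.append((batch, "missing label"))
        elif params is None:
            rejected.append((batch, "no process parameters"))
        else:
            rows.append({"batch": batch, **params, "label": label})
    return rows, rejected


rows, rejected = training_rows(inspections, parameters)
```

Here only two of four batches can be used. The `rejected` list shows why the other two cannot, which is often more useful at this stage than the model itself.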

Do not move a bad process into software

Data cleanup often shows that the problem is not only in the data. It may be in the way people work.

Example: the downtime reason list has 47 options. Several sound almost the same. Operators have little time, so they choose “other.” After a few months, 60% of stops fall into one category. The report exists, but it does not show what to improve.

A better option:

shorten the list of reasons,
separate the cause from the effect,
show the most common reasons for each machine,
require comments only for selected categories,
remove duplicates and unused options.
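
Two of these points are easy to check in your own data: how large the "other" category really is, and which reasons are most common on each machine. A minimal Python sketch with invented stop records:

```python
from collections import Counter, defaultdict

# Hypothetical stop records: (machine, reason chosen by the operator).
stops = [
    ("LINE_01", "other"), ("LINE_01", "other"), ("LINE_01", "no material"),
    ("LINE_01", "other"), ("LINE_01", "changeover"),
    ("LINE_02", "other"), ("LINE_02", "jam"), ("LINE_02", "other"),
    ("LINE_02", "jam"), ("LINE_02", "other"),
]


def share_of(reason, stops):
    """Fraction of all stops recorded with the given reason."""
    if not stops:
        return 0.0
    return sum(1 for _, r in stops if r == reason) / len(stops)


def top_reasons(stops, n=3, ignore=("other",)):
    """Most common specific reasons per machine, skipping catch-all options."""
    per_machine = defaultdict(Counter)
    for machine, reason in stops:
        if reason not in ignore:
            per_machine[machine][reason] += 1
    return {m: [r for r, _ in c.most_common(n)] for m, c in per_machine.items()}


other_share = share_of("other", stops)
shortlist = top_reasons(stops)
```

In this sample, "other" covers 6 of 10 stops. The per-machine shortlist is a possible starting point for the few options an operator panel shows first.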

Data must fit production work. If entering data slows operators down, they will fill it in later or choose the easiest option, not always the right one.

Who owns the data?

Data without an owner gets worse almost invisibly. Someone must decide who can add a new downtime reason, change a cycle standard, or create a new product index.

Data area	Business owner
Orders and product indexes	Planning
Machines and signals	Maintenance / automation
Downtime	Production
Quality	Quality department
Reports	Process owner
Permissions	Area managers

This kind of ownership protects you from a situation where everyone corrects data according to their own rules.

Preparing for AI: readiness test

Choose one production area and answer these questions:

Does every machine have one identifier?
Is every order connected to an operation and product?
Do you know the real cycle times?
Do downtime events have specific reasons?
Is quality data connected with batch and operation?
Do ERP, shop floor, and Excel reports show the same numbers?
Is it clear who corrects incorrect data?
Do management and production trust the same reports?

If the answer is “no” to most of these questions, start with one line and one problem.
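
If you want to repeat the test for several areas, the eight questions fit in a short script. This is only a sketch: the answers below are invented, and "most" is read as "more than half".

```python
# Hypothetical answers for one production area (True = yes, False = no).
answers = {
    "every machine has one identifier": False,
    "every order is linked to an operation and product": True,
    "real cycle times are known": False,
    "downtime events have specific reasons": False,
    "quality data is linked to batch and operation": True,
    "ERP, shop floor, and Excel reports agree": False,
    "it is clear who corrects incorrect data": False,
    "management and production trust the same reports": True,
}


def open_gaps(answers):
    """Questions answered 'no', in the order they were asked."""
    return [question for question, ok in answers.items() if not ok]


gaps = open_gaps(answers)

# Assumption: 'most' means more than half of the questions.
start_with_one_line = len(gaps) > len(answers) / 2
```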

Common mistakes when preparing for AI, MES, and reporting

1. Collecting everything without a goal

More data does not mean better decisions.

2. Missing shared definitions

If production, quality, and maintenance define downtime differently, the report becomes the start of an argument.

3. Leaving operators out

Operators know which data is hard to enter and where the system does not match shop floor work.

4. Believing AI will fix the data

AI can help with analysis, but it will not clean up years of inconsistent entries without rules, context, and labels.

Preparing for AI - preventing the common mistakes (two managers checking data on a tablet)

Before you implement AI, MES, or reporting

Good data does not guarantee project success, but poor data will weaken it quickly.

If you want to implement AI, MES, or reporting, start with one question:

Does our data describe production the way it really works?

If the answer is uncertain, start with a source map, a data dictionary, identifiers, and a data quality test in one area. You will see which data is reliable, which data is missing, and which AI, MES, or reporting scope makes the most sense for the first stage.

FAQ: preparing for AI, MES, and reporting

Do you need perfect data before implementing MES?

No. You need data that is good enough for the first implementation scope. It is usually best to start with one line, one process, or one problem.

What data is needed for MES implementation?

Most MES projects need data about machines, workstations, products, operations, orders, cycle times, downtime, quality, material batches, users, and production statuses.

How is MES data preparation different from AI data preparation?

MES needs data for day-to-day production management. AI needs historical data, well-described cases, and labels such as OK/NOK, failure type, or downtime reason.
