
Preparing for AI does not start with a model, a dashboard, or a vendor demo. It starts with a simple question: do your data sources, names, definitions, and responsibilities describe production the way it really works?
If the same machine has three names in ERP, Excel, and on the shop floor, reports will start to disagree. If operators enter downtime as “failure,” “stop,” “other,” or leave the field blank, AI will not know whether the issue was a technical failure, missing material, or poor data entry. If production orders are not linked to machine data, MES will collect numbers without context.
That is why preparing for AI should not begin with tool selection. The first step is cleaning up your company’s data: sources, definitions, identifiers, ownership, and quality rules.
Gartner reports that poor data quality costs organizations at least $12.9 million per year on average. In a July 29, 2024 release, Gartner projected that by the end of 2025 at least 30% of GenAI projects would be abandoned after proof of concept, for reasons such as poor data quality, rising costs, or unclear business value. IDC, cited in Oracle material, says 80% of the time spent working with data may go to finding, preparing, and protecting it, while only 20% goes to analysis.
Start by deciding which decisions your data should support.
Then work through these eight steps:
Only after that should you choose reports, AI models, operator screens, or integrations.

Too many AI projects start with tools and models. But without clean, consistent data, even a strong model will produce results that are hard to defend in production, quality, or maintenance.
Preparing for AI means checking whether you have historical data, shared definitions, consistent identifiers, and clear sources of truth. Only then can you decide whether AI has enough to learn from and whether its recommendations will make sense on the shop floor.
Data cleanup means deciding how your plant describes production with numbers. In manufacturing, that includes:
Without that, an OEE report may show something different from a shift manager’s report. MES may work, but some data will still need manual correction. AI may find patterns that look useful but do not match what is happening on the shop floor.
A common mistake is starting with the data you already have. A better starting point is the decision you want to make faster and with more confidence.
| Decision | Data you need |
|---|---|
| Where do we lose time during the shift? | Downtime, cycle times, stop reasons |
| Which products create the most defects? | Product, batch, process parameters, quality results |
| Is the production plan realistic? | Machine capacity, orders, changeovers |
| Can AI predict a quality issue? | OK/NOK history, process parameters, batches |
| Can MES replace manual reports? | Operation statuses, machine data, standards |
If nobody knows which decision a data field supports, teams quickly start collecting data “just in case.” That raises project cost and makes analysis harder.
Manufacturing data is often spread across many places. Some data sits in ERP or SCADA. Some comes from machine panels, Excel files, or paper forms.
A data source map should show:
Based on that map, you can build a data dictionary.
Without a dictionary, teams quickly start arguing about basic terms. Should changeover count as planned downtime? Should stops shorter than two minutes be recorded? Should waiting for material be attributed to production, logistics, or planning?
| Area | What needs to be defined |
|---|---|
| Production | Order, operation, cycle, standard, changeover |
| Maintenance | Failure, intervention, planned stop |
| Quality | OK, NOK, defect, scrap |
| Logistics | Batch, location, material usage |
| People | Workstation, role, crew, permissions |
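
Once the definitions are agreed, they can be written down as explicit rules instead of staying in people's heads. The sketch below is only an illustration: the two-minute limit, the choice to treat changeover as planned downtime, and all names (`DowntimeDefinition`, `classify_stop`, the reason strings) are assumptions, not rules taken from any specific plant or system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DowntimeDefinition:
    """Hypothetical dictionary entry; both values are assumptions to agree on."""
    micro_stop_limit_s: int = 120        # stops shorter than this are micro-stops
    changeover_is_planned: bool = True   # does changeover count as planned downtime?


def classify_stop(duration_s: float, reason: str,
                  definition: DowntimeDefinition = DowntimeDefinition()) -> str:
    """Classify one stop according to the agreed definition."""
    if duration_s < definition.micro_stop_limit_s:
        return "micro_stop"
    if reason == "changeover":
        return "planned" if definition.changeover_is_planned else "unplanned"
    if reason == "planned_stop":
        return "planned"
    return "unplanned"
```

The point of a structure like this is that changing a definition becomes a visible decision in one place, not a silent difference between two reports.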
Identifiers matter just as much.
The same machine may appear as LINE_01, L1_PACK, Packer 1, and “old line.” To an employee, this is one asset. To a system, it may look like four different objects. In many cases, a simple mapping table is enough to fix the problem.
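A mapping table like the one described above can be as simple as a lookup from every known alias to one canonical identifier. This is a minimal sketch: choosing `LINE_01` as the canonical ID and the function names are assumptions made for illustration.

```python
from typing import List, Optional

# Hypothetical alias table: every name seen in ERP, Excel, or on the shop floor
# points to one canonical machine ID. Keys are stored lowercase.
MACHINE_ALIASES = {
    "line_01": "LINE_01",
    "l1_pack": "LINE_01",
    "packer 1": "LINE_01",
    "old line": "LINE_01",
}


def canonical_machine_id(raw_name: str) -> Optional[str]:
    """Return the canonical ID for a raw machine name, or None if unmapped."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(raw_name.strip().lower().split())
    return MACHINE_ALIASES.get(key)


def unmapped_names(raw_names: List[str]) -> List[str]:
    """List distinct names that still need a decision from the data owner."""
    return sorted({n for n in raw_names if canonical_machine_id(n) is None})
```

Running `unmapped_names` over the machine column of an export gives a short list of names nobody has assigned yet, which is usually a better starting point than renaming assets in every system.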
You do not need to audit the whole company at once. Start with one line, one product family, or one process that creates a lot of losses.
| Test | Question | Example issue |
|---|---|---|
| Completeness | Are required fields filled in? | 35% of downtime events have no reason |
| Consistency | Do systems show the same thing? | ERP shows a different order than the operator panel |
| Timeliness | Does data arrive on time? | Quality report is available the next day |
| Uniqueness | Are records duplicated? | The same batch is entered twice |
| Accuracy | Are values realistic? | Cycle time equals 0 seconds or 999 minutes |
A short test like this will quickly show where the project may fail.
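Three of the tests above (completeness, uniqueness, accuracy) can be run on an exported list of records with a few lines of code. The sketch below assumes hypothetical field names (`event_id`, `reason`, `cycle_time_s`) and illustrative limits for a realistic cycle time; consistency and timeliness need data from more than one system and are not covered here.

```python
from collections import Counter

# Illustrative records; field names and values are assumptions for this sketch.
RECORDS = [
    {"event_id": "E1", "reason": "failure", "cycle_time_s": 42.0},
    {"event_id": "E2", "reason": "", "cycle_time_s": 0.0},
    {"event_id": "E2", "reason": "", "cycle_time_s": 0.0},        # duplicate entry
    {"event_id": "E3", "reason": None, "cycle_time_s": 59940.0},  # 999 minutes
]


def completeness(records, field):
    """Share of records where the given field is filled in."""
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records) if records else 0.0


def duplicate_ids(records, key="event_id"):
    """IDs that appear more than once."""
    counts = Counter(r[key] for r in records)
    return sorted(k for k, c in counts.items() if c > 1)


def unrealistic_cycle_times(records, min_s=1.0, max_s=3600.0):
    """IDs of records whose cycle time falls outside the assumed limits."""
    return [r["event_id"] for r in records
            if not (min_s <= r["cycle_time_s"] <= max_s)]
```

The limits passed to `unrealistic_cycle_times` should come from the people who know the process, not from the script.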
An MES does not need every piece of data on day one. It needs the data that best describes how production runs.
For the first stage, prepare:
You also need to define the source of truth.
ERP often works well for orders, item indexes, and production plans. SCADA and PLC systems provide machine signals. MES connects shop floor data with context: product, operation, order, operator, quality, and status.
| Data | Source of truth |
|---|---|
| Production plan | ERP or APS |
| Number of cycles | Machine, PLC, SCADA |
| Downtime reason | MES, operator panel |
| Quality inspection result | MES or QMS |
| Operation status | MES |
| Product index | ERP |
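
The table above can also be kept as a small configuration that integrations consult when two systems report different values. This is a sketch under assumptions: the field keys, the system labels, and reading the listed order as a priority order (for example, ERP before APS) are illustrative choices, not part of any MES product.

```python
# Hypothetical source-of-truth configuration derived from the table above.
# The order inside each list is an assumed priority: first match wins.
SOURCE_OF_TRUTH = {
    "production_plan": ["ERP", "APS"],
    "cycle_count": ["PLC", "SCADA"],
    "downtime_reason": ["MES", "operator_panel"],
    "quality_result": ["MES", "QMS"],
    "operation_status": ["MES"],
    "product_index": ["ERP"],
}


def resolve(field, values_by_system):
    """Pick the value from the highest-priority system that reported one.

    Returns a (system, value) tuple so reports can show where a number came from.
    """
    for system in SOURCE_OF_TRUTH[field]:
        if system in values_by_system:
            return system, values_by_system[system]
    raise LookupError(f"no trusted source reported a value for {field!r}")
```

For example, `resolve("cycle_count", {"ERP": 980, "SCADA": 1000})` ignores the ERP figure, because ERP is not listed as a trusted source for cycle counts.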
explitia offers the Production Portal as a platform for manufacturing companies that want to implement MES and tools for planning, monitoring, and improving production. It can collect information from machines, sensors, PLCs, ERP systems, and operator forms in one place.
Some context is known only by people: the reason for a stop, a material issue, an unusual changeover situation, or a quality decision. That is why operator panels and simple forms can be very useful.
AI needs historical data, well-described cases, and labels. If it should predict defects, it needs OK/NOK history. If it should predict failures, it needs well-described service events.
| AI goal | Data you need |
|---|---|
| Predicting machine failure | Machine work history, alarms, maintenance interventions |
| Predicting defects | OK/NOK data, process parameters, material, batch |
| Supporting production planning | Orders, capacity, people availability, changeovers |
| Downtime analysis | Times, reasons, machine, product, shift |
| Operator recommendations | Current parameters, instructions, similar cases |
Preparing for AI means removing guesswork from your data. AI learns from what was recorded, how it was described, and whether the same event means the same thing across systems.
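For defect prediction, a quick way to check whether there is enough to learn from is to join process parameters with OK/NOK results and count how many rows end up with a label. The sketch below assumes both data sets share a `batch` field; that key and the other field names are hypothetical.

```python
def build_labelled_rows(process_params, quality_results):
    """Join process parameters with OK/NOK labels by batch ID.

    Returns (labelled_rows, unlabelled_batches). If a batch has several
    quality results, the last one in the list is used.
    """
    labels = {q["batch"]: q["result"] for q in quality_results}
    labelled, unlabelled = [], []
    for row in process_params:
        label = labels.get(row["batch"])
        if label in ("OK", "NOK"):
            labelled.append({**row, "label": label})
        else:
            # No result, or a value outside the agreed OK/NOK vocabulary.
            unlabelled.append(row["batch"])
    return labelled, unlabelled


# Illustrative data only.
params = [
    {"batch": "B1", "temp_c": 181.0},
    {"batch": "B2", "temp_c": 195.5},
    {"batch": "B3", "temp_c": 179.2},
]
results = [
    {"batch": "B1", "result": "OK"},
    {"batch": "B2", "result": "NOK"},
    {"batch": "B3", "result": "maybe"},
]
labelled, unlabelled = build_labelled_rows(params, results)
```

A long `unlabelled` list is an early signal that the history exists but cannot yet be used for training.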
Data cleanup often shows that the problem is not only in the data. It may be in the way people work.
Example: the downtime reason list has 47 options. Several sound almost the same. Operators have little time, so they choose “other.” After a few months, 60% of stops fall into one category. The report exists, but it does not show what to improve.
A better option:
Data must fit production work. If entering data slows operators down, they will fill it in later or choose the easiest option, not always the right one.
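A simple check for the "other" problem from the example above is to compute each reason's share of all stops and flag any category that dominates. The 30% threshold and the function names below are assumptions for illustration; the right threshold depends on the plant.

```python
from collections import Counter


def reason_shares(reasons):
    """Return (reason, share) pairs for all stops, largest share first."""
    # Treat empty or missing entries as their own "blank" category.
    cleaned = [(r or "").strip().lower() or "blank" for r in reasons]
    counts = Counter(cleaned)
    total = len(cleaned)
    return [(reason, count / total) for reason, count in counts.most_common()]


def overused_reasons(reasons, threshold=0.3):
    """Reasons whose share of stops exceeds the assumed threshold."""
    return [reason for reason, share in reason_shares(reasons) if share > threshold]
```

If a catch-all category such as "other" or "blank" shows up in the result of `overused_reasons`, the reason list or the data entry step probably needs work before the report does.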
Data without an owner gets worse almost invisibly. Someone must decide who can add a new downtime reason, change a cycle standard, or create a new product index.
| Data area | Business owner |
|---|---|
| Orders and product indexes | Planning |
| Machines and signals | Maintenance / automation |
| Downtime | Production |
| Quality | Quality department |
| Reports | Process owner |
| Permissions | Area managers |
This kind of ownership protects you from a situation where everyone corrects data according to their own rules.
Choose one production area and answer these questions:
If the answer is “no” to most of these questions, start with one line and one problem.
More data does not mean better decisions.
If production, quality, and maintenance define downtime differently, the report becomes the start of an argument.
Operators know which data is hard to enter and where the system does not match shop floor work.
AI can help with analysis, but it will not clean up years of inconsistent entries without rules, context, and labels.

Good data does not guarantee project success, but poor data will weaken it quickly.
If you want to implement AI, MES, or reporting, start with one question:
Does our data describe production the way it really works?
If the answer is uncertain, start with a source map, a data dictionary, identifiers, and a data quality test in one area. You will see which data is reliable, which data is missing, and which AI, MES, or reporting scope makes the most sense for the first stage.
No. You need data that is good enough for the first implementation scope. It is usually best to start with one line, one process, or one problem.
Most MES projects need data about machines, workstations, products, operations, orders, cycle times, downtime, quality, material batches, users, and production statuses.
MES needs data for day-to-day production management. AI needs historical data, well-described cases, and labels such as OK/NOK, failure type, or downtime reason.