Prompt-based AI is starting to hit the wall that manufacturing always hits: physics, latency, and risk.

That is the practical warning embedded in a recent Robotics & Automation News opinion by Massimiliano Moruzzi, founder and CEO of Xaba.ai, titled “Why industrial AI must be trained on physics, not prompts.” Moruzzi’s central claim is blunt: the same prompt-centric approach that works tolerably well in chat interfaces does not map cleanly onto a factory where the cost of a wrong answer is not a bad paragraph but a damaged tool, a stopped line, or a safety incident.

What changes now is not that AI has suddenly become less capable. It is that industrial buyers are asking it to operate in conditions where variability is the norm, not the exception. A prompt can describe intent. It cannot reliably encode the hidden state of a robot cell, the wear on a gripper, the drift in a sensor, or the way a tolerance stack-up changes the legality of a motion plan. Once those variables matter, prompt-only workflows start to look less like automation and more like guesswork with a polished interface.

Why prompts fail on the factory floor

The appeal of prompt-based industrial AI is obvious. It lowers the barrier to entry, lets non-specialists ask for help, and can accelerate tasks like documentation, troubleshooting, and high-level planning. In a digital context, a wrong answer is usually recoverable. In a physical system, recovery is expensive, slow, and sometimes dangerous.

That distinction is the heart of the problem. Prompt-driven systems generally assume a stable context and a sufficiently complete description of the situation. Factory environments rarely offer either. Parts arrive out of spec. Fixtures loosen. Cycle times drift. Sensors saturate. Material behavior changes with temperature, humidity, or lot variation. A system that reasons only from textual instructions has no intrinsic understanding of torque limits, collision envelopes, backlash, friction, compliance, or the sequence dependencies that make one motion safe and another catastrophic.

Moruzzi’s argument in the Robotics & Automation News piece is that industrial AI must be trained on physics, not prompts, precisely because prompt models can misreason when the real world stops matching the clean assumptions inside the model’s conversational layer. That is not a subtle failure mode. It is the difference between a helpful suggestion and an unsafe action.

This is why the most serious industrial teams are not asking, “Can the model explain the next step?” They are asking, “Can it infer the state of the machine, understand constraints, and choose an action that remains valid under uncertainty?” A factory-floor system must be robust to missing data, sensor noise, process drift, and unmodeled interactions. Prompting alone does not provide that robustness.

From prompts to intent: the physics-informed shift

The better alternative is not to abandon AI, but to change what the system is learning.

Instead of asking a model to generate instructions from language alone, physics-informed and intent-driven systems encode goals, constraints, and known physical relationships directly into the control stack. In practice, that means a system is not merely told what to do in prose; it is trained to understand what outcome is desired, what constraints cannot be violated, and how the underlying process behaves.

That distinction matters. Intent-driven systems are built to translate goals into actions while accounting for state, uncertainty, and feasibility. Physics-informed methods can incorporate models of motion, force, energy, kinematics, dynamics, and process envelopes. Model-based control adds a layer of planning that checks whether a path is physically valid before the machine executes it. Uncertainty quantification tells operators not just what the system proposes, but how confident it is and where the risk lies.
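
To make that concrete, here is a minimal sketch of a pre-execution feasibility check, assuming a hypothetical single-axis move request. The names and limits (MoveRequest, check_feasible, a 170-degree envelope) are illustrative, not any vendor's API; the point is the structure, in which rejection happens before anything reaches the actuator.

    from dataclasses import dataclass

    @dataclass
    class MoveRequest:
        target_deg: float      # commanded joint angle
        speed_dps: float       # degrees per second
        payload_kg: float      # estimated payload mass

    @dataclass
    class Limits:
        max_angle_deg: float = 170.0
        max_speed_dps: float = 90.0
        max_payload_kg: float = 5.0

    def check_feasible(req: MoveRequest, lim: Limits) -> list[str]:
        """Return a list of violated constraints; empty means feasible."""
        violations = []
        if abs(req.target_deg) > lim.max_angle_deg:
            violations.append("joint angle outside envelope")
        if req.speed_dps > lim.max_speed_dps:
            violations.append("speed exceeds rated limit")
        if req.payload_kg > lim.max_payload_kg:
            violations.append("payload exceeds rated capacity")
        return violations

    req = MoveRequest(target_deg=175.0, speed_dps=60.0, payload_kg=2.0)
    problems = check_feasible(req, Limits())
    if problems:
        print("REJECT:", "; ".join(problems))   # never reaches the actuator
    else:
        print("plan accepted for execution")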

This is the architectural break from prompt-led pilots. A prompt-based workflow often sits on top of the operation, helping humans interpret or summarize. A physics-informed system sits inside the operation, shaping decisions before the machine acts. That is why it can adapt more reliably to variability: it is not trying to improvise a physical process from language alone; it is optimizing within the rules of the process itself.

That does not mean every factory needs a bespoke scientific model for every task. It does mean vendors should stop treating natural-language prompting as the primary interface for control-critical automation. Language can still be useful for operator interaction, exception handling, and explainability. But the execution layer needs a stronger foundation than text generation.

Rollout playbook: what operators and vendors must do next

The most useful question now is not whether physics-informed AI is elegant. It is how to deploy it without creating another layer of untested complexity.

For operators and vendors, the rollout should start with a narrow, measurable use case where the downside of failure is understandable and contained. Pick a cell, a process window, or a maintenance task where baseline performance is already instrumented. Then compare a prompt-based approach with a physics-informed baseline under the same operating conditions. If the new system cannot outperform or at least match the baseline on reliability and safety, it does not belong in production.

The evaluation criteria should be concrete (a minimal scoring sketch follows the list):

  • Downtime rate: does the system reduce unplanned stoppages, or does it introduce new failure modes?
  • Fault recovery time: when the process deviates, how quickly can the system detect and recover?
  • Safety margin adherence: does every recommended action stay within defined motion, force, temperature, and clearance limits?
  • Constraint violation frequency: how often does the system attempt an infeasible or noncompliant action?
  • Uncertainty calibration: when the model is unsure, does it say so accurately, and do humans get a chance to intervene?
  • Drift sensitivity: does performance degrade gracefully as parts, tools, or conditions change?
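
To make those criteria operational, here is a minimal sketch of scoring pilot trial logs against a few of them. The record format and the trial data are assumptions for illustration, not a standard schema; the same scoring would run against both the prompt-based and physics-informed pilots for a like-for-like comparison.

    trials = [
        # each record: did the run stop unplanned, seconds to recover from
        # a fault (None if no fault), and whether any constraint was violated
        {"unplanned_stop": False, "recovery_s": None, "violated": False},
        {"unplanned_stop": True,  "recovery_s": 42.0, "violated": False},
        {"unplanned_stop": False, "recovery_s": 8.5,  "violated": True},
    ]

    n = len(trials)
    downtime_rate = sum(t["unplanned_stop"] for t in trials) / n
    violation_rate = sum(t["violated"] for t in trials) / n
    recoveries = [t["recovery_s"] for t in trials if t["recovery_s"] is not None]
    mean_recovery = sum(recoveries) / len(recoveries) if recoveries else 0.0

    print(f"downtime rate:       {downtime_rate:.1%}")
    print(f"violation frequency: {violation_rate:.1%}")
    print(f"mean recovery time:  {mean_recovery:.1f}s")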

Those metrics should be paired with deployment guardrails. Keep a human in the loop for the first production phases, but make the human oversight meaningful: the operator should be reviewing constrained action plans, not just clicking through alerts after the fact. Instrument the pilot with high-fidelity logging so the team can trace why the system chose a path, what state it believed the machine was in, and where its confidence dropped.
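
As one illustration of what that high-fidelity logging could capture, here is a minimal sketch of a per-decision record. The field names are assumptions, not an established schema; what matters is that state belief, chosen action, confidence, and verified constraints are all traceable after the fact.

    import json
    import time

    def log_decision(state_estimate, plan, confidence, constraints_checked):
        record = {
            "ts": time.time(),                           # when the decision was made
            "state_estimate": state_estimate,            # what the system believed
            "plan": plan,                                # the action it chose
            "confidence": confidence,                    # how sure it was
            "constraints_checked": constraints_checked,  # what it verified
        }
        with open("decision_log.jsonl", "a") as f:
            f.write(json.dumps(record) + "\n")

    log_decision(
        state_estimate={"gripper_wear": "moderate", "part_in_spec": True},
        plan={"action": "pick", "speed_dps": 45.0},
        confidence=0.87,
        constraints_checked=["clearance", "torque", "speed"],
    )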

Vendors, meanwhile, need to rethink product roadmaps. The priority should be hybrid architectures that combine learned intent with physics-aware planning and validation. If the model cannot represent constraints explicitly, it should at least route decisions through a verifier that can. If the system cannot quantify uncertainty, it should not be allowed to issue autonomous commands in high-consequence contexts.
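
A minimal sketch of that routing pattern, with a stub standing in for the learned proposer and illustrative thresholds: every proposal passes through explicit checks the learned component cannot override, and low confidence escalates to a human rather than becoming a command.

    def propose_action(goal):
        # stand-in for a learned policy or language-driven planner
        return {"action": "move", "target_deg": 160.0, "confidence": 0.62}

    def verify(action, min_confidence=0.8, max_angle_deg=170.0):
        """Explicit checks the learned component cannot override."""
        if action["confidence"] < min_confidence:
            return False, "confidence below autonomy threshold"
        if abs(action["target_deg"]) > max_angle_deg:
            return False, "target outside motion envelope"
        return True, "ok"

    action = propose_action("place part in fixture")
    ok, reason = verify(action)
    if ok:
        print("execute:", action)
    else:
        print("escalate to operator:", reason)   # no autonomous command issued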

That also changes validation methods. Industrial AI should be tested against perturbations, not just happy-path demos. Feed it broken parts, sensor dropouts, timing delays, and process drift. Measure whether it maintains safe behavior when the environment is partially observable. A system that looks impressive in a controlled lab but fails under noisy, real-world conditions is not ready for the plant floor.
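
A minimal sketch of that kind of perturbation harness, with a stub controller and a synthetic trace standing in for real process data: nominal readings are replayed with injected dropouts and drift, and the test asserts the commanded output never leaves its safety limit.

    import random

    def controller(reading):
        # stand-in: command proportional to the reading, clamped at the limit
        if reading is None:               # sensor dropout: hold a safe state
            return 0.0
        return max(-90.0, min(90.0, reading * 0.5))

    def perturb(trace, dropout_p=0.1, drift_per_step=0.2):
        """Replay a trace with random dropouts and accumulating drift."""
        drift = 0.0
        for x in trace:
            drift += drift_per_step
            yield None if random.random() < dropout_p else x + drift

    random.seed(0)
    nominal = [10.0 * i for i in range(20)]
    commands = [controller(r) for r in perturb(nominal)]
    assert all(abs(c) <= 90.0 for c in commands), "safety limit exceeded"
    print("survived dropouts and drift; max command:", max(commands))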

The broader lesson from the Robotics & Automation News opinion is not that prompts are useless. It is that prompts are the wrong primary abstraction for physical automation. In factories, intelligence has to survive contact with matter. That means learning goals, constraints, and dynamics—not just instructions.

The next wave of industrial AI will be judged less by how fluently it talks and more by how safely it behaves when the process stops being convenient.