A machine stops. The operator calls maintenance. The technician walks over, looks at the machine, and starts troubleshooting from scratch. Thirty minutes pass. Then an hour. The fix turns out to be a tripped thermal overload that took 2 minutes to reset, but finding it took 58 minutes because nobody followed a structured process.
This scenario plays out in factories every day. The difference between a 15-minute repair and a 90-minute repair is rarely the complexity of the fix. It is the process used to get there. Plants that follow a structured breakdown response consistently achieve 25-40% lower Mean Time to Repair (MTTR) than plants that rely on individual technician judgment alone.
This guide lays out a six-step process that works for any equipment breakdown, from a jammed conveyor to a failed servo drive.
The First 5 Minutes Matter
Maintenance benchmarking studies suggest that roughly 40% of total repair time goes to diagnosis, not to the actual repair. Diagnosis is often the largest single block of downtime, and technicians can easily spend more time figuring out what is wrong than fixing it.
The first 5 minutes after a breakdown determine how the next hour will go. A technician who arrives with no information starts from zero. A technician who arrives knowing the machine's recent history, the exact symptoms, and the most common failure modes for that equipment can skip past the guesswork and get to the fix faster.
The six steps below are designed to compress that diagnostic window and make every minute count.
Step 1: Safety First
Before touching anything, make the area safe. This is non-negotiable.
- Apply lockout/tagout (LOTO) if you will be working on or near moving parts, electrical circuits, or pressurized systems. No exceptions.
- Check for immediate hazards: spilled fluids, exposed wiring, unusual smells (burning, chemical), steam or hot surfaces, moving parts that have not fully stopped.
- Clear the area of unnecessary personnel. Operators should step back unless they are needed for information.
- Wear appropriate PPE. This sounds obvious, but in the rush to get the line running, technicians skip gloves, safety glasses, or hearing protection. Do not be that person.
This step takes 1-2 minutes. Skipping it can cost a life. Every year, roughly 150 workers in the U.S. die from contact with equipment that should have been locked out.
Step 2: Capture Symptoms
Before you start disassembling anything, gather information. The symptoms you collect in the first few minutes are the foundation for everything that follows.
Talk to the operator. They were there when it happened. Ask these specific questions:
- What exactly happened? What did you see, hear, or smell?
- When did it start? Was it sudden or gradual?
- Were you running a different product or speed than normal?
- Did anything change recently? New material batch, settings adjustment, maintenance work done?
- Has this happened before?
Then look at the machine yourself:
- Error codes: Read and write down any fault codes on the HMI or controller. Do not clear them yet.
- Visual inspection: Broken parts, loose connections, leaks, discoloration, wear marks.
- Sound and smell: Grinding, squealing, burning smell, chemical odor.
- Temperature: Use an IR thermometer on motors, bearings, electrical panels. A motor running at 180°F when it should be at 140°F tells you something.
Write these symptoms down, even if it is just on your phone. You will need them for diagnosis, and they become critical data if the failure recurs.
Step 3: Check Known Fixes
Before diving into deep diagnosis, check if this is a known problem with a known fix. This single step can cut your repair time in half.
Check these sources in order:
- Your own memory. Have you seen this exact set of symptoms on this machine before?
- Work order history. Pull the last 10-20 work orders for this machine. Look for matching symptoms or fault codes.
- OEM troubleshooting guide. Most equipment manuals have a fault code reference table. Match the error code to the recommended diagnostic steps.
- Colleagues. Ask the other technicians on shift. "Machine 12 is throwing an E-47 fault with high motor temp. Anyone seen this before?"
In many plants, 60-70% of breakdowns are repeat failures: the same machine, the same failure mode, sometimes even the same part. If you can match the current failure to a previous one, you already know the likely fix. Skip ahead to Step 5 to apply it and verify that it works.
If the failure is new or the known fix does not apply, move to Step 4.
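The work order search can be sketched in a few lines of Python. This is a minimal illustration, not a real CMMS API: the `WorkOrder` record, the `rank_known_fixes` function, the weight of 3 for an exact fault code match, and the example fixes are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class WorkOrder:
    """Hypothetical, simplified work order record."""
    machine_id: str
    fault_code: str
    symptoms: Set[str]
    fix: str


def rank_known_fixes(history: List[WorkOrder], machine_id: str,
                     fault_code: str, symptoms: Set[str]) -> List[WorkOrder]:
    """Return past work orders for this machine, best match first."""
    scored = []
    for wo in history:
        if wo.machine_id != machine_id:
            continue                           # only this machine's history
        score = len(wo.symptoms & symptoms)    # one point per shared symptom
        if wo.fault_code == fault_code:
            score += 3                         # assumed weight: exact code match counts most
        if score > 0:
            scored.append((score, wo))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [wo for _, wo in scored]


# Example data is invented for illustration.
history = [
    WorkOrder("M12", "E-12", {"high motor temp", "vibration"}, "Replaced drive-end bearing"),
    WorkOrder("M12", "E-47", {"high motor temp"}, "Cleaned blocked cooling fan"),
    WorkOrder("M07", "E-47", {"high motor temp"}, "Reset thermal overload"),
]
for wo in rank_known_fixes(history, "M12", "E-47", {"high motor temp"}):
    print(wo.fault_code, "->", wo.fix)
```

The point of the ranking is simple: an exact fault code match on the same machine goes to the top, and partial symptom matches follow as weaker leads.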
Step 4: Diagnose
Systematic diagnosis means testing one thing at a time and eliminating possibilities. Resist the urge to start swapping parts randomly. That approach feels fast but usually wastes time and spare parts.
The Half-Split Method
For complex systems (electrical circuits, fluid systems, multi-stage processes), use the half-split method:
- Divide the system in half at its midpoint.
- Test whether the problem is in the first half or the second half.
- Take the faulty half and divide it again.
- Repeat until you isolate the failed component.
This approach narrows a 20-component system to the faulty component in 4-5 tests, instead of up to 20 if you check components one at a time. It is the same logic a doctor uses: "Does it hurt above the waist or below?"
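Here is a minimal sketch of the half-split logic in Python. It assumes a linear chain with a single fault, where every test point upstream of the fault reads good and every point from the fault onward reads bad. The names `half_split` and `signal_ok_after` are illustrative; in practice the "test" is you with a meter or pressure gauge at that point in the system.

```python
from typing import Callable, Sequence, Tuple


def half_split(stages: Sequence[str],
               signal_ok_after: Callable[[int], bool]) -> Tuple[str, int]:
    """Find the faulty stage in a linear chain by repeated halving.

    Returns the faulty stage name and the number of tests performed.
    """
    lo, hi = 0, len(stages) - 1      # the fault lies somewhere in stages[lo..hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1
        if signal_ok_after(mid):
            lo = mid + 1             # good reading after mid: fault is downstream
        else:
            hi = mid                 # bad reading after mid: fault is at mid or upstream
    return stages[lo], tests


# Simulated 20-component system with the fault at component_13.
stages = [f"component_{i}" for i in range(20)]
fault_index = 13
print(half_split(stages, lambda i: i < fault_index))
# → ('component_13', 4)
```

For 20 components the loop always finishes in 4 or 5 tests, whichever component has failed.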
Common Diagnostic Checks
| Symptom | Check First | Check Second |
|---|---|---|
| Motor will not start | Thermal overload, breaker | Contactor, wiring, motor windings |
| Excessive vibration | Loose mounting bolts, coupling | Bearings, shaft alignment, imbalance |
| Hydraulic pressure low | Fluid level, filter condition | Pump wear, valve leaks, cylinder seals |
| Sensor fault / false reading | Wiring, connector, sensor face (clean it) | Sensor replacement, PLC input card |
| Overheating | Airflow blockage, cooling system | Load condition, bearing friction, electrical fault |
| PLC fault / communication error | Communication cables, power supply | I/O module, program check, processor |
If diagnosis takes longer than 30 minutes, stop and reassess. Call in a second set of eyes. A fresh perspective catches things you have been staring past for half an hour.
Step 5: Repair and Verify
Once you have identified the failed component or condition, make the repair. Then verify it actually worked before handing the machine back to production.
Repair
- Replace or repair the failed component.
- Check adjacent components. If a bearing failed because of misalignment, replacing the bearing without correcting the alignment means you will be back in a few weeks.
- Use the correct parts. "Close enough" replacement parts cause 15-20% of repeat failures. Confirm part numbers match the BOM.
Verify
- Run the machine unloaded first. Check for normal sounds, vibration levels, and temperatures.
- Run a short production trial (10-20 parts). Check output quality. Compare to pre-failure specifications.
- Monitor for 15-30 minutes. Some intermittent failures only show up under sustained load. Stay and observe until you are confident the repair holds.
- Clear all fault codes and confirm the HMI shows normal operating status.
Do not rush this step. A repair that fails 2 hours later costs more than the extra 15 minutes you spent verifying it the first time.
Step 6: Document the Fix
This is the step most teams skip, and it is the one that pays the biggest long-term dividends. Every undocumented repair is knowledge lost. The next time this failure occurs, someone will start from zero again.
A good repair record takes 5 minutes to write and should include:
- Machine ID and location
- Date, time, and shift
- Symptoms (what the operator reported, what you observed)
- Fault codes (exact codes displayed)
- Root cause (what actually failed and why)
- Repair performed (specific parts replaced, adjustments made)
- Parts used (part numbers and quantities, for inventory tracking)
- Time to repair (total downtime from failure to production restart)
- Follow-up needed? (Does this need a root cause analysis? A PM schedule change? A design modification?)
If your documentation system is a paper work order that takes 20 minutes to fill out, people will skip it. Make it easy. A mobile form with dropdown fields and a photo upload works better than a 3-page form.
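If you are building or configuring that form, the fields above map naturally onto a structured record. The sketch below is one possible shape, assuming hypothetical field names and a made-up example; downtime is computed from the failure and restart timestamps so nobody has to do the math by hand.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class RepairRecord:
    """Hypothetical repair record mirroring the checklist fields."""
    machine_id: str
    location: str
    shift: str
    failed_at: datetime              # when production stopped
    restarted_at: datetime           # when production resumed
    operator_report: str             # what the operator reported
    observations: str                # what the technician observed
    fault_codes: List[str]           # exact codes displayed
    root_cause: str
    repair_performed: str
    parts_used: Dict[str, int] = field(default_factory=dict)  # part number -> quantity
    follow_up: str = ""              # RCA, PM change, design modification

    @property
    def downtime_minutes(self) -> float:
        """Total downtime from failure to production restart."""
        return (self.restarted_at - self.failed_at).total_seconds() / 60


# Example values are invented for illustration.
record = RepairRecord(
    machine_id="M12",
    location="Line 3",
    shift="Day",
    failed_at=datetime(2024, 5, 6, 8, 0),
    restarted_at=datetime(2024, 5, 6, 9, 0),
    operator_report="Machine stopped suddenly, no unusual noise",
    observations="Motor starter thermal overload tripped",
    fault_codes=["E-47"],
    root_cause="Thermal overload trip; cause of overheating not yet confirmed",
    repair_performed="Reset thermal overload, checked motor current",
    follow_up="Schedule RCA on motor overheating",
)
print(record.downtime_minutes)
# → 60.0
```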
Reducing MTTR with a Structured Process
Following these six steps consistently reduces your MTTR in measurable ways:
| MTTR Component | Without Process | With Process | Improvement |
|---|---|---|---|
| Detection to response | 10-15 min | 5-8 min | ~50% |
| Diagnosis time | 25-45 min | 10-20 min | ~55% |
| Repair time | 20-40 min | 15-30 min | ~25% |
| Verification | 5-10 min (often skipped) | 10-15 min | Fewer rework calls |
The biggest time savings come from diagnosis. Capturing symptoms properly (Step 2) and checking known fixes (Step 3) eliminates the "wandering around guessing" phase that eats up most repair time.
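Adding up the table's ranges gives a rough picture of total MTTR. The sketch below simply sums the low and high ends of each row; treat the result as illustrative, since the ranges themselves are estimates.

```python
# (without process, with process) ranges in minutes, taken from the table above
components = {
    "Detection to response": ((10, 15), (5, 8)),
    "Diagnosis time": ((25, 45), (10, 20)),
    "Repair time": ((20, 40), (15, 30)),
    "Verification": ((5, 10), (10, 15)),
}


def total_range(column: int) -> tuple:
    """Sum the low ends and the high ends of one column across all rows."""
    low = sum(ranges[column][0] for ranges in components.values())
    high = sum(ranges[column][1] for ranges in components.values())
    return low, high


without_process = total_range(0)
with_process = total_range(1)
print(without_process)   # → (60, 110)
print(with_process)      # → (40, 73)

# Relative reduction at the low and high ends of the range
for before, after in zip(without_process, with_process):
    print(f"{(before - after) / before:.0%}")
# → 33%
# → 34%
```

Even with verification taking longer, total MTTR drops by about a third, which sits inside the 25-40% range quoted at the start of this guide.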
A plant with 200 breakdowns per month that reduces average MTTR by 20 minutes per event saves 4,000 minutes of downtime per month. At a typical production value of $50-200 per minute, that is $200,000 to $800,000 in recovered production per month, or $2.4 million to $9.6 million annually.
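To rerun this estimate with your own plant's numbers, a few lines of Python are enough. The function name and parameters below are illustrative; inputs are per month, and the function returns minutes saved per month plus the monthly and annualized value.

```python
def recovered_production(breakdowns_per_month: int,
                         minutes_saved_per_event: float,
                         value_per_minute: float) -> tuple:
    """Return (minutes saved per month, monthly value, annual value)."""
    minutes_per_month = breakdowns_per_month * minutes_saved_per_event
    monthly_value = minutes_per_month * value_per_minute
    annual_value = monthly_value * 12
    return minutes_per_month, monthly_value, annual_value


# Figures from the example above: 200 breakdowns, 20 minutes saved, $50-200 per minute
for value_per_minute in (50, 200):
    minutes, monthly, annual = recovered_production(200, 20, value_per_minute)
    print(f"${value_per_minute}/min: {minutes:,} min/month, "
          f"${monthly:,.0f}/month, ${annual:,.0f}/year")
```

Plug in your own breakdown count and production value per minute. The result is only as good as those two inputs, so use figures your finance team would sign off on.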
Where Dovient Fits
Dovient is built around this exact workflow. Here is how it supports each step:
- Step 2 (Capture Symptoms): Technicians log symptoms through a mobile interface with structured fields. No writing paragraphs on paper. Tap the machine, select the symptom type, add a photo. Done in 60 seconds.
- Step 3 (Check Known Fixes): This is where Dovient saves the most time. When you enter symptoms, Dovient's AI-powered diagnostic engine immediately searches your plant's entire repair history for matching failures. It shows you what worked last time, ranked by relevance. Technicians report cutting diagnosis time by 40-60% in the first month.
- Step 6 (Document): Repair documentation happens as a natural part of closing the work order. The symptoms, diagnosis, and fix are already entered from earlier steps. The technician adds the resolution, parts used, and any follow-up notes. This takes 2-3 minutes because most of the data was captured during the repair.
- Pattern recognition: Dovient tracks every breakdown across your plant. When the same failure mode appears on multiple machines, or when a machine's breakdown frequency increases, it flags the pattern. This feeds directly into your RCA process by showing you which failures deserve deeper investigation.
If your team spends too much time diagnosing and not enough time repairing, schedule a conversation with our team to see how Dovient shortens the path from breakdown to fix.