A reliability engineer needs to find every instance of bearing failure on the cooling tower fan motors over the past three years. They open the CMMS and type "bearing failure cooling tower fan" into the search bar. The system returns 4 results. The engineer knows there have been at least 12 bearing replacements, because they personally worked on 3 of them. The rest are buried under descriptions like "CT-3 fan motor rebuilt," "replaced worn components on CT fan," and "annual PM - cooling tower - replaced consumables." None of those work orders mention the word "bearing" even though that is exactly what was replaced.
This is the fundamental problem with keyword search in maintenance. The information exists in your system. You just cannot find it, because the people who entered it used different words than the people searching for it. Natural language search fixes this. Instead of matching keywords, it understands meaning. This article explains how it works, why it matters, and what it looks like in practice.
Why Keyword Search Fails for Maintenance
Keyword search works by matching the exact words you type against the exact words stored in the database. If the words match, you get a result. If they do not, you get nothing. This works well for structured data like part numbers or equipment tags. It fails badly for unstructured text like work order descriptions, repair notes, and operator logs.
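To make the failure mode concrete, here is a minimal sketch of how a keyword search bar behaves. The work order text is invented for illustration, and the substring matching shown here is a simplification of what any given CMMS actually does:

```python
# Minimal sketch of keyword search: plain substring matching, the way
# many CMMS search bars behave. Work order text is hypothetical.
work_orders = [
    "Replaced failed bearing on CT-1 fan motor",
    "CT-3 fan motor rebuilt",                      # bearing replaced, never mentioned
    "Rplcd brg on cooling tower fan, realigned",   # abbreviation hides the match
    "Annual PM - cooling tower - replaced consumables",
]

def keyword_search(query, records):
    """Return records containing every query word as a literal substring."""
    words = query.lower().split()
    return [r for r in records if all(w in r.lower() for w in words)]

hits = keyword_search("bearing fan", work_orders)
print(hits)  # only the first record matches; the other bearing jobs are invisible
```

Three of the four records describe bearing work, but only one survives the literal match. This is the gap the rest of the article is about.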
The reasons are specific to maintenance:
- Inconsistent terminology. One technician writes "bearing," another writes "brg," a third writes "roller element." One says "replaced," another says "changed out," a third says "renewed." The same repair gets described 10 different ways depending on who writes the work order.
- Abbreviations and shorthand. "Rplcd mech seal on P-107, instld new impeller, checked alignment" is a perfectly clear note to another technician. A keyword search for "mechanical seal replacement" will never find it.
- Context-dependent meaning. "Pump running hot" could mean the motor is overheating, the bearings are failing, or the process fluid temperature is elevated. A keyword search returns all three, and the user has to sort through them. A semantic search understands from context which meaning applies.
- Implicit information. A work order that says "rebuilt gearbox per SOP-GB-012" does not list the individual components replaced. But SOP-GB-012 specifies replacing the bearings, seals, and gear set. A keyword search does not follow that chain. A semantic search can.
- Multilingual descriptions. In plants with multilingual workforces, work orders may contain a mix of English, Spanish, Portuguese, or other languages. Keyword search treats these as separate universes. Semantic search understands meaning across languages.
The result is that most plants can only find 30-50% of their relevant maintenance records through keyword search. The rest exist but are invisible. This means that failure pattern analysis is incomplete, repeat repairs go undetected, and tribal knowledge captured in text form is effectively lost. For more on how to prevent that knowledge loss, see our guide on building a maintenance knowledge base.
How Semantic Search Works
Semantic search replaces keyword matching with meaning matching. Instead of asking "do these words appear in that document?", it asks "does this document talk about the same thing the user is asking about?" The technical mechanism involves three concepts: embeddings, vector similarity, and retrieval.
Embeddings: Turning Text into Meaning
An embedding is a mathematical representation of what a piece of text means. Think of it as coordinates on a map of meaning. The word "bearing" and the phrase "roller element" are different strings of characters, but they end up at nearly the same location on the meaning map because they refer to the same thing.
When your maintenance documents are loaded into a semantic search system, every paragraph, work order, SOP section, and repair note gets converted into an embedding. This happens once, up front. The system reads the text and generates a set of numbers (typically 768 or 1,536 of them) that represent what that text is about. Two pieces of text that are about similar topics will have similar numbers.
This is the key difference from keyword search: the system understands that "pump mechanical seal replacement" and "changed out the mech seal on the centrifugal" are about the same thing, even though they share almost no words.
Vector Similarity: Finding the Closest Match
When a user types a query, that query also gets converted into an embedding. Then the system compares the query's embedding against every document embedding in the database, looking for the closest matches. "Closest" here means the documents whose meaning is most similar to the query's meaning.
The math behind this is straightforward: it is typically cosine similarity, which measures the cosine of the angle between two vectors, i.e. how closely they point in the same direction. But you do not need to understand the math to understand the result. If you search for "bearing failure on cooling tower fans," the system returns documents about bearing problems on cooling tower fans, even if those documents use words like "brg replacement," "roller element degradation," or "CT fan motor rebuild."
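For readers who want to see the comparison step, here is a small sketch. The 4-dimensional vectors are invented for illustration; a real embedding model produces hundreds of dimensions, but the ranking logic is identical:

```python
import math

# Toy embeddings, invented for illustration. A real model produces
# 768+ dimensions; the comparison logic is the same.
doc_embeddings = {
    "Rplcd brg on CT fan motor":        [0.9, 0.8, 0.1, 0.0],
    "Roller element degradation noted": [0.8, 0.9, 0.2, 0.1],
    "Annual PM - replaced filters":     [0.1, 0.0, 0.9, 0.8],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query = [0.85, 0.85, 0.1, 0.05]  # pretend embedding of "bearing failure on fans"
ranked = sorted(doc_embeddings.items(),
                key=lambda item: cosine_similarity(query, item[1]),
                reverse=True)
for text, _ in ranked:
    print(text)  # the two bearing-related notes rank above the filter PM
```

Note that neither bearing note shares a single word with the query; they rank highest purely because their vectors point in nearly the same direction as the query's.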
Retrieval: Getting the Right Results
Finding similar documents is only the first step. A good semantic search system also needs to rank results by relevance, filter by metadata (date range, equipment type, work order status), and present the results in a way that is useful. If you search for bearing failures on cooling tower fans, you want the results sorted by relevance, not by date. You want to see the most relevant paragraph from each document, not just the document title. And you may want to filter to only corrective work orders, excluding PMs.
The combination of semantic understanding and metadata filtering is what makes the system practical for maintenance. You get the meaning-based search that finds everything, plus the structured filtering that narrows it down to what you actually need.
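A sketch of how that combination might look. The records, similarity scores, and field names below are all invented for illustration; they stand in for whatever a real vector store would return alongside your CMMS metadata:

```python
from datetime import date

# Hypothetical search results: each record pairs a precomputed semantic
# similarity score with structured CMMS metadata.
results = [
    {"text": "Rplcd brg on CT-2 fan",  "score": 0.91,
     "type": "corrective", "date": date(2025, 3, 4)},
    {"text": "Annual PM - CT fans",    "score": 0.72,
     "type": "pm",         "date": date(2025, 6, 1)},
    {"text": "CT-3 fan motor rebuilt", "score": 0.88,
     "type": "corrective", "date": date(2023, 1, 15)},
]

def retrieve(results, wo_type=None, after=None, top_k=10):
    """Filter by structured metadata first, then rank survivors by similarity."""
    hits = [r for r in results
            if (wo_type is None or r["type"] == wo_type)
            and (after is None or r["date"] >= after)]
    return sorted(hits, key=lambda r: r["score"], reverse=True)[:top_k]

# Corrective work only, recent history: the PM and the old rebuild drop out.
for r in retrieve(results, wo_type="corrective", after=date(2024, 1, 1)):
    print(r["text"], r["score"])
```

The order of operations matters in practice: the metadata filter narrows the candidate set cheaply, and the similarity ranking then does the meaning-based work on what remains.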
Real Query Examples
Abstract explanations only go so far. Here are real queries that maintenance teams run, with the results from keyword search versus semantic search.
Query 1: Finding a past repair procedure
User types: "How did we fix the heat exchanger tube leak last summer?"
Keyword search result: 0 results. The search engine looks for the exact phrase "heat exchanger tube leak" and the word "summer." No work order contains that exact combination.
Semantic search result: 3 results, ranked by relevance. The top result is a work order from July 2025: "Plugged 2 tubes on HX-301 shell side, hydrotest confirmed no further leaks." The second result is a SOP for tube plugging procedures. The third is a repair log from a different heat exchanger with a similar issue. The system understood that "tube leak" and "plugged tubes" are related, and that "last summer" means approximately June-August 2025.
Query 2: Finding failure patterns across equipment
User types: "All mechanical seal failures on centrifugal pumps in the past 2 years"
Keyword search result: 8 results. Only finds work orders that contain both "mechanical seal" and "centrifugal pump."
Semantic search result: 23 results. Finds the same 8, plus 15 more that use terms like "mech seal," "shaft seal," "seal replacement," "pump rebuild" (where seal replacement was part of the rebuild), and work orders on pumps that are centrifugal but whose description just says "pump" without specifying the type (the system knows from the equipment hierarchy that P-107 is a centrifugal pump).
Query 3: Safety information
User types: "What are the lockout procedures for the ammonia compressor?"
Keyword search result: 1 result. The LOTO procedure document that mentions "ammonia compressor" in the title.
Semantic search result: 4 results. The LOTO procedure, plus a safety bulletin about ammonia handling during compressor maintenance, a work permit template specific to ammonia systems, and a past incident report where the isolation procedure was updated after a near-miss. All of these are relevant to safely working on the ammonia compressor, but only one contained the exact keywords.
Search vs Browse: When Each Matters
Semantic search is not a replacement for structured browsing. Both have a place in maintenance information management.
Use search when: You know what you are looking for but not where it is. "What was the torque spec for the reactor agitator coupling bolts?" "Has this pump ever had cavitation damage?" "What is the isolation procedure for the steam header?" These are questions where you know the topic but need to find the specific document or data point.
Use browse when: You want to see everything in a category. "Show me all PMs due this week." "Show me all open work orders on the boiler house." "Show me all SOPs for the packaging department." These are navigation tasks, not search tasks. A well-organized folder structure or category system serves these better than search.
The best systems offer both. You browse to the boiler house equipment list, then search within that context: "Any vibration issues on the feedwater pumps in the last 6 months?" The browsing narrows the scope, and the search finds the specific information within that scope.
What Changes in Daily Work
Semantic search changes how several roles interact with maintenance information:
For the Technician
Instead of opening five PDFs and scrolling through 400-page manuals, the technician asks a question and gets an answer. "What is the clearance spec for the wear rings on P-107?" returns the exact specification from the OEM manual, plus a note from a past repair that the last technician found the OEM spec too tight and used 0.002" additional clearance successfully. This is the kind of practical detail that lives in repair logs but nobody ever finds through keyword search. For a look at how AI takes this a step further with actual diagnosis, see our article on AI-powered repair diagnostics.
For the Reliability Engineer
Failure pattern analysis becomes comprehensive instead of partial. When the engineer searches for all bearing failures on cooling tower fans, they get all 12 instances instead of 4. This means the MTBF calculation is accurate, the failure mode distribution is correct, and the PM frequency recommendation is based on complete data. Partial data leads to wrong conclusions. Complete data leads to correct ones. If you are building formal failure analysis, our guide on AI-generated FMEA covers how AI uses this complete data set to produce thorough failure mode analyses.
For the Planner
Searching for similar past work orders to estimate job duration, parts needed, and craft requirements becomes fast and accurate. "Show me similar jobs to this valve repacking" returns the 10 most similar past valve repacking jobs, with their actual durations, parts used, and complications encountered. The planner can build a realistic job plan based on actual data instead of guesswork.
For the Manager
Answering questions like "How much did we spend on pump seal failures last year?" no longer requires a manual trawl through the CMMS. The semantic search finds all seal-related work orders (regardless of how they were described), and the manager gets a complete cost picture.
Data Requirements
Semantic search works with whatever data you have. You do not need to clean up your work orders, standardize your terminology, or restructure your documents. The system handles the mess. That said, the quality of results improves with better data:
- Minimum useful data: 6 months of work order history, a set of SOPs, and OEM manuals for your critical equipment. This is enough to get started and see meaningful results.
- Good data: 2-3 years of work order history, complete SOP library, OEM manuals, and some tribal knowledge entries. This gives the system enough patterns to understand your plant-specific terminology and failure modes.
- Excellent data: Everything above plus detailed repair logs, video SOPs transcribed to text, operator round data, and condition monitoring reports. At this level, the search system becomes a comprehensive knowledge tool that can answer almost any maintenance question from your plant's own history.
The important point: you do not need excellent data to start. Start with what you have. The system gets more useful as you add more content, but it provides value from day one with even a modest knowledge base.
Common Objections
"Our CMMS already has search"
Yes, keyword search, which finds only 30-50% of relevant records. Run a test: pick a failure mode you know well, search for it in your CMMS, then manually count how many records the search missed. The gap is usually large enough to make the case by itself.
"Our data is too messy for AI"
Semantic search is specifically designed for messy data. It works with abbreviations, misspellings, inconsistent terminology, and incomplete descriptions. It does not need clean data because it understands meaning, not just words. The messier your data, the bigger the gap between keyword search and semantic search, and the more value you get from switching.
"How long does it take to set up?"
Initial indexing of your documents takes hours to days, depending on volume. Once indexed, the system is live. There is no manual tagging, no taxonomy building, no cleanup required. You point it at your data sources, it indexes everything, and users can start searching immediately. Adding new content (new work orders, new SOPs) happens automatically as the data is created.
Where Dovient Fits
Dovient's MissingDots engine uses semantic search as its foundation. Every question a technician asks the AI Copilot, every failure pattern a reliability engineer investigates, and every knowledge base query runs through the same semantic search infrastructure.
The system indexes your work orders, SOPs, OEM manuals, repair logs, tribal knowledge entries, and any other text-based maintenance documentation. Search results include source attribution so you can verify where the information came from. And the system improves continuously as new documents are added and users provide feedback on result relevance.
To see how semantic search works with your plant's documentation, schedule a conversation with our team.