Semantic Text Search

Searching Semi-Structured FAQs
The Problem with Keyword Search

A great search engine understands a user’s intent and returns the most relevant results. Modern search engines, like Google, Bing, and DuckDuckGo, are really good at this. However, most search technologies fall short because they only look at the words in the query, not the intent behind it. They use “keyword search,” which has been around for more than two decades.

But keyword search has a major flaw: humans use different words to ask the same question. For example, one customer might search "how do I return my shoes," while another types "send back shoes."

The users behind these searches share the same intent, but because their queries have few or no keywords in common, a keyword-based approach will fail to return the most relevant results for both of them.

Keyword-based systems often employ techniques including TF-IDF, synonyms, stemming, and lemmatization to improve results, but these hacks are time-consuming and error-prone, and they still do not get to the intent of a user’s search. The user will likely get a list of links that have at least one keyword match, and they will have to search for the relevant information themselves, if it is even there.
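
To see the limitation concretely, here is a minimal sketch, using scikit-learn's TfidfVectorizer rather than any Yext code, with made-up example queries and FAQs. Two pieces of text with the same intent but no shared terms score zero under TF-IDF similarity, so the relevant FAQ never surfaces:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical FAQ titles standing in for a help-center corpus.
faqs = [
    "How do I return my shoes?",      # the answer the user actually needs
    "What is your shipping policy?",
]
query = "send back sneakers"          # same intent, no overlapping keywords

vectorizer = TfidfVectorizer()
faq_vectors = vectorizer.fit_transform(faqs)
query_vector = vectorizer.transform([query])

# Cosine similarity over TF-IDF vectors is zero when no terms overlap,
# so keyword matching ranks the relevant FAQ no higher than anything else.
print(cosine_similarity(query_vector, faq_vectors))   # [[0. 0.]]
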
How Semantic Text Search Works

We fine-tuned Google’s BERT to better understand what a customer is really looking for. This means that, unlike keyword-based systems, Answers knows that someone searching “send back shoes” is looking to kick off a return process, and that someone asking about a “dislocated shoulder” probably needs an orthopedist.

Answers analyzes the meaning behind the query and finds the FAQs with the most relevant answers. It uses neural networks to understand precisely what the user is looking for and to surface the best matches from the Knowledge Graph.

Yext fine-tuned BERT to better understand search queries. When a user enters a query, Answers encodes it as a vector, a numerical representation of the query’s meaning. Then it looks for the FAQs that are closest to that query in vector space.

Two vectors that are close together in vector space share more meaning than two that are far apart. Semantic Text Search places a query in that space and locates the content closest to it, giving your customers the Answers they might not have realized they needed.
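
The retrieval step can be sketched in a few lines. Yext’s fine-tuned BERT model is not public, so the sketch below uses an open-source sentence encoder from the sentence-transformers library as a stand-in, with the same hypothetical FAQs as before. It embeds the FAQs and the query into one vector space and returns the FAQ whose vector is closest to the query’s:

from sentence_transformers import SentenceTransformer, util

# Off-the-shelf encoder used only to illustrate the idea; it produces
# 768-dimensional vectors, like BERT.
model = SentenceTransformer("all-mpnet-base-v2")

faqs = [
    "How do I return my shoes?",
    "What is your shipping policy?",
    "How do I find an orthopedist near me?",
]

# Encode the FAQs and the query into the same vector space.
faq_vectors = model.encode(faqs, convert_to_tensor=True)
query_vector = model.encode("send back shoes", convert_to_tensor=True)

# Rank FAQs by cosine similarity to the query; the closest vector wins,
# even though the query and the best FAQ share almost no keywords.
scores = util.cos_sim(query_vector, faq_vectors)[0]
best = scores.argmax().item()
print(faqs[best], scores[best].item())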

These vectors have 768 dimensions, which is impossible for us to wrap our heads around, but you can get a feel for the process by projecting the vectors down to two dimensions.
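
For intuition, here is a small sketch that reuses the model, faqs, and query from the previous example and flattens the 768-dimensional embeddings to two dimensions with PCA (just one convenient projection choice) so they can be plotted:

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Embed the FAQs plus the query, then project 768 dimensions down to 2.
texts = faqs + ["send back shoes"]
vectors = model.encode(texts)                     # shape: (len(texts), 768)
points = PCA(n_components=2).fit_transform(vectors)

# Plot each text at its 2D position; semantically similar texts land nearby.
for (x, y), text in zip(points, texts):
    plt.scatter(x, y)
    plt.annotate(text, (x, y))
plt.title("FAQ and query embeddings projected to 2D")
plt.show()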
