Document Search with Extractive QA
Yext's Document Search algorithm searches long-form unstructured documents (like blogs, bios, support articles, and product manuals) and returns results ranked by relevance to the query. It can even deliver a direct answer in the form of a featured snippet! Learn how Yext can help answer your customers' questions, the modern way.
Search Unstructured Data
In search, your goal is to get users the answers to their questions as seamlessly as possible. The problem is that your data exists in different structures. Some data is highly structured in your CMS, some is semi-structured in FAQs, and other data is completely unstructured, like a blog, help article, or web page. Document Search is an algorithm specifically optimized for searching long documents, to help you deliver answers to your customers.
Surface Content Stored in Files
Document Search can index the unstructured content within a file (meaning the words in a PDF, PowerPoint, or document) and surface the relevant results and snippets in response to a query. This offers flexibility if you prefer to store an entire file in Yext Content, instead of extracting and storing the text as its own field on an entity.
Provide Direct Answers Using Extractive QA
Extractive QA, like many other parts of the Search Algorithm, uses BERT, Google's open-source neural network trained to understand language. In Extractive QA, a specially trained version of BERT identifies the excerpts from long documents that best answer the user's question. If it finds a good answer in the text, the algorithm displays it as a featured snippet.
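To make the idea concrete, here is a minimal sketch of how a BERT-style extractive QA model typically turns its output into a snippet. Such models usually assign each token a score for being the start of the answer and a score for being the end, and the best-scoring span becomes the answer. This is not Yext's implementation: the function names, the scores, the confidence threshold, and the span-length limit below are all hypothetical, and a real model would produce the scores.

```python
from typing import List, Optional, Tuple


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 30) -> Tuple[int, int, float]:
    """Return (start, end, score) for the highest-scoring answer span.

    A span's score is start_scores[start] + end_scores[end], with
    start <= end and at most max_len tokens in the span.
    """
    best = (0, 0, float("-inf"))
    for i, start in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            score = start + end_scores[j]
            if score > best[2]:
                best = (i, j, score)
    return best


def featured_snippet(tokens: List[str], start_scores: List[float],
                     end_scores: List[float], threshold: float = 5.0,
                     max_len: int = 30) -> Optional[str]:
    """Return the answer text, or None when no span is confident enough."""
    if not tokens:
        return None
    i, j, score = best_span(start_scores, end_scores, max_len)
    if score < threshold:  # hypothetical cutoff for "a good answer"
        return None
    return " ".join(tokens[i:j + 1])


# Made-up scores for the query "How long do I have to return an item?"
tokens = "Returns are accepted within 30 days of purchase".split()
start_scores = [0.0, 0.0, 0.5, 4.0, 1.0, 0.0, 0.0, 0.0]
end_scores = [0.0, 0.0, 0.0, 0.5, 0.5, 4.0, 0.5, 1.0]

print(featured_snippet(tokens, start_scores, end_scores))  # within 30 days
```

The threshold is what lets a search experience show a featured snippet only when the text contains a good answer: if every span scores low, the function returns nothing and no snippet is displayed.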
In-line Snippets
In addition to providing a direct answer at the top of the page, Extractive QA also highlights relevant context as in-line snippets on each result down the page. These snippets provide a great search experience and help searchers find the best answers to their queries, even faster.
Capture Unstructured Data with Connectors
Unstructured data can be found across a number of different environments or sites. Places like your blog, your support site, or your press release hub may host a ton of great content that just isn't set up for modern search. Connectors like the Yext Crawler or Zendesk Connector can pull long-form unstructured data from apps or sources like your website so that you can build out your CMS and use it to start powering Search!
Explore Related Features