Where Do Google Gemini, ChatGPT, and Other AI Models Get Local Business Information?

Trent Ruffolo

Jun 13, 2024

5 min

AI search is reshaping how users find information online, particularly since the launch of Google's AI Overviews (known as Search Generative Experience, or SGE, during beta). As a result, brands need to adjust how they manage their information. Maintaining a robust and optimized digital presence across all platforms is essential to leveraging the latest advancements in AI search technology.

That's because AI models like Google Gemini and ChatGPT do not rely only on major directories and brand websites for information about local businesses. They draw from a wide range of sources to deliver comprehensive and accurate results. How do we know? We asked each major AI model directly: "Where do you get your local business information from?" Let's dive in.

Bing Copilot

When asked about its sources, Bing Copilot said it gathers local business information from various places, including websites and online directories.

When we followed up to ask which online directories it uses, Bing Copilot named only some of them, so its list should not be read as exhaustive.

It was no surprise that it named major platforms like Google, Yelp, Bing, and Foursquare. However, it also mentioned sites that most people would consider less prominent, such as Yellowbook, Superpages.com.au, and CitySearch. (Interestingly, Yext was listed at number seven as a trusted source, underscoring our reputation for providing reliable local information.)

ChatGPT

ChatGPT listed several sources, including online directories. However, we wanted to know specifically which websites it uses, so we asked for a list.

Its list includes prominent sites like Google, Yelp, Bing, and Facebook, in addition to smaller websites like Hotfrog, Zomato, BBB, and others.

We then asked whether this list was exhaustive, or whether it used other online directories to gather local business information.

In response, ChatGPT listed another 15 websites that it uses, many of which would be considered smaller publishers, such as MerchantCircle, DexKnows, Insider Pages, and Superpages. ChatGPT concluded by saying that this extensive coverage of online directories provides it with detailed information about local businesses, a breadth that reflects the model's effort to provide detailed and accurate business information.

Publishers have verified this independently, too. "We can confirm that ChatGPT is scraping our information," says Bob Gross, founder of Locally. "We know that this has been happening, but it just re-confirms how important it is for brands to have a presence on diverse publishers of all sizes." Several other Yext publisher partners, including Wogibtswas.de, Tupalo, Central Index, and Scoot, also said that ChatGPT is scraping their information and echoed the same sentiment.

Perplexity

Once again, it's no surprise that when asked where it gets its local business information from, Perplexity said it references various sources, including business websites and online directories.

So, we dug deeper and asked how many search engines and online directories it uses. Perplexity provided a direct answer, stating it uses roughly 15 sources and citing some unexpected websites: Yellow Pages, Baidu, Yandex, and DuckDuckGo. As Perplexity stated, "diversifying my sources ensures comprehensive coverage."

Google Gemini

Google Gemini reported using Google My Business (now known as Google Business Profile), business websites, and other online directories. It gave little detail about which directories it uses besides Google, so we asked a question about a specific local business: what time does the McDonald's at 39 Union Sq in NYC open?

Google Gemini's answer came not from Google Business Profile or the McDonald's website but from a Tripadvisor listing. This is notable because this McDonald's location has both a Google listing and a structured location page, yet Google Gemini still drew on Tripadvisor, indicating that it relies on a diverse range of sources for business information.

Meta AI

Meta AI also says that it gets its local business information from a variety of data sources. In addition to business websites, social media platforms, and online directories, Meta AI listed Yext as the only location management provider under Data Partnerships, highlighting our unique position as a trusted data source.

While this is just a handful of examples, we've looked at hundreds of different business data answers from these platforms, and they all name dozens of sources for their data. (This reiterates some of the findings in our recent research report, which illustrates how an expansive digital presence with perfectly synchronized data sends the strongest possible signal to both search and AI systems.)

Conclusion

It's not a hallucination: AI models rely on many data sources, including:

  • Major search engines like Google and Bing

  • Brand websites and local pages

  • Smaller search engines and online publishers

While it may initially seem that AI models rely predominantly on major directories, a closer look shows otherwise. AI seeks breadth, accuracy, and consistency of information, and each model we asked reports drawing on an extensive network of publishers to get it. A broad range of sources is therefore critical for validating and enhancing the data AI models use.

Therefore, to put your brand in the best position to succeed in the AI-driven search landscape, it's important to maintain an optimized digital presence across all online platforms. That means structured location pages on your website, listings on major search engines like Google and Bing, and listings on smaller search engines and online publishers.
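For teams that want a quick way to spot-check consistency across listings, here is a minimal Python sketch. It is an illustration only, not how Yext or any AI model works: the business, the source names (`brand_website`, `directory_a`, `directory_b`), and all field values are made up, and the normalization rules are simple assumptions. The idea is to normalize each field so that formatting differences are ignored, then flag any field where the sources still disagree.

```python
import re
from collections import defaultdict


def normalize_phone(phone: str) -> str:
    """Keep digits only so formatting differences do not count as mismatches."""
    return re.sub(r"\D", "", phone)


def normalize_text(value: str) -> str:
    """Lowercase, then collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", value.lower()).strip()


# Which normalizer to apply to each field (an assumed, simplified schema).
NORMALIZERS = {
    "name": normalize_text,
    "address": normalize_text,
    "phone": normalize_phone,
    "hours": normalize_text,
}


def find_inconsistencies(listings: dict) -> dict:
    """Return {field: {normalized_value: [sources]}} for fields where sources disagree."""
    conflicts = {}
    for field, normalize in NORMALIZERS.items():
        values = defaultdict(list)
        for source, record in listings.items():
            if field in record:
                values[normalize(record[field])].append(source)
        if len(values) > 1:  # more than one distinct value means a mismatch
            conflicts[field] = dict(values)
    return conflicts


# Hypothetical listings for a made-up business; none of this is real data.
listings = {
    "brand_website": {
        "name": "Example Cafe",
        "address": "1 Main St., Springfield",
        "phone": "(555) 010-0000",
        "hours": "Mon-Fri 7am-7pm",
    },
    "directory_a": {
        "name": "Example Cafe",
        "address": "1 Main St, Springfield",
        "phone": "555-010-0000",
        "hours": "Mon-Fri 7am-7pm",
    },
    "directory_b": {
        "name": "Example Cafe",
        "address": "1 Main St, Springfield",
        "phone": "555.010.0000",
        "hours": "Mon-Fri 8am-6pm",
    },
}

conflicts = find_inconsistencies(listings)
for field, values in conflicts.items():
    print(field, values)
```

In this made-up data, the phone numbers and addresses differ only in punctuation, so they are treated as matching. Only the opening hours on `directory_b` are flagged, which is the kind of mismatch that could lead an AI model to repeat outdated information.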

By partnering with more than 200 maps, apps, search engines, and online directories, Yext has the industry's largest publisher network of direct API integrations. This provides businesses with an optimized digital presence on large platforms like Google, Facebook, Bing, Apple Maps, and Yelp, as well as the smaller search engines that AI models utilize to source information about local businesses.

After all, two of the top five AI models (Bing Copilot and Meta AI) listed Yext as a trusted data source without any prompting. They also didn't mention any other listings management providers, highlighting Yext's credibility as the most trusted solution on the market. It's clear that partnering with Yext is the best strategy for enhancing your brand's discoverability in the world of AI-driven search.

Interested in boosting your brand's search visibility with the world's most trusted and sophisticated listings solution? Learn more about how Yext Listings helps your brand show up everywhere customers search.

How does an extended publisher network impact brand discoverability? Click here to see the data.
