OpenAI, the company behind ChatGPT, has been secretly accessing Google’s search data to improve its AI chatbot’s real-time responses, particularly in areas like news, sports, and financial markets. This comes despite Google’s formal refusal to grant OpenAI direct access to its search index.
The practice came to light through independent tests, which showed that ChatGPT retrieves current information from Google search results via SerpApi, a third-party web scraping service that scrapes Google’s results.
How is ChatGPT getting real-time search results despite Google's refusal?
Former Google engineer Abhishek Iyer uncovered the practice by creating dummy web pages that were visible only within Google's search system. When ChatGPT responded with content from those pages, it showed that the AI was reaching Google's search index indirectly.
Others corroborated the finding, including Brian Dean of Backlinko, who ran tests with fabricated search terms indexed on Google. OpenAI does not query Google Search directly; instead, it contracts SerpApi, a Texas-based company, to scrape Google's search results and supply them.
This workaround sidesteps Google's refusal of OpenAI's request for direct access, letting ChatGPT draw on Google's search results without official permission.
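The dummy-page test can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure Iyer or Dean actually used: the function names `make_canary_term` and `leaked_terms`, the `zq` prefix, and the index labels are all hypothetical. The idea is to publish a nonsense term on a page that only one search index can see, ask the chatbot about it, and check whether the term comes back.

```python
import secrets
import string


def make_canary_term(length: int = 12) -> str:
    """Return a random nonsense term that is unlikely to exist anywhere else.

    The "zq" prefix is an arbitrary, hypothetical marker that makes the
    term easy to spot by eye.
    """
    letters = string.ascii_lowercase
    return "zq" + "".join(secrets.choice(letters) for _ in range(length))


def leaked_terms(response_text: str, canaries: dict[str, str]) -> list[str]:
    """Return the labels of every index whose canary term appears in the response.

    `canaries` maps a label for a search index (hypothetical labels such as
    "index_a") to the nonsense term published on a page only that index can see.
    """
    lowered = response_text.lower()
    return [label for label, term in canaries.items() if term in lowered]
```

In use, each canary term would be published on its own test page, and the chatbot's answers to questions about those pages would be passed to `leaked_terms`. A label appearing in the result suggests the chatbot saw content from that index; it does not by itself show how the content got there.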
Did you know?
OpenAI was listed publicly as a customer of SerpApi, a Google search scraping service, until May 2024 when the reference was quietly removed.
What are the implications of OpenAI’s use of Google’s search index?
Despite competing directly with Google in search technologies, OpenAI rents servers from Google Cloud to operate ChatGPT, showing a complex relationship between rivalry and reliance.
The arrangement illustrates Google's dominance as a data provider in AI ecosystems. OpenAI’s product lead, Nick Turley, admitted that the company aims to handle most search traffic internally but acknowledged that its current capabilities fall short, making Google's data essential.
Regulatory pressure, especially from the Department of Justice’s antitrust case against Google, might explain why Google hasn’t pursued legal action against SerpApi’s scraping practices.
OpenAI relies on third-party services for current data retrieval
SerpApi's services are not exclusive to OpenAI; other major firms, including Meta, Apple, and AI competitor Perplexity, also use them. OpenAI was openly listed as one of SerpApi’s clients until mid-2024, when the public mention was discreetly removed.
These circumstances highlight the broader tech industry's dependence on Google’s data, even when firms publicly position themselves as competitors.
Google remains a key player in the AI infrastructure landscape
Google’s dual role as competitor and infrastructure provider creates tension but also reflects business pragmatism. Many AI products still rely heavily on Google's data and cloud platforms despite the competition.
As AI platforms like ChatGPT grow, this dynamic raises important ethical and commercial questions about transparency, data use, and competitive fairness in the tech sector.
The future might see regulatory shifts that force more openness in data sharing, leveling the field among AI companies.