ArXiv TLDR

How Generative AI Disrupts Search: An Empirical Study of Google Search, Gemini, and AI Overviews

2604.27790

Riley Grossman, Songjiang Liu, Michael K. Chen, Mike Smith, Cristian Borcea + 1 more

cs.IR, cs.AI, cs.CL, cs.CY, cs.HC

TLDR

This study empirically compares Google Search, AI Overviews, and Gemini, showing how generative AI alters search results, source diversity, and website visibility.

Key contributions

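  • AI Overviews (AIOs) are generated for 51.5% of representative real-user queries and are displayed above the organic results; controversial questions frequently trigger one.
  • The sources retrieved by each search engine differ substantially (<0.2 average Jaccard similarity): traditional Google Search favors popular and institutional (government, education) websites, while generative search favors Google-owned content.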
  • Websites blocking Google's AI crawler are significantly less likely to be retrieved by AI Overviews.
  • AI Overviews are less consistent across two runs of the same query and less robust to minor query edits than traditional search.
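
The crawler-blocking finding above depends on what a site's robots.txt says to a given crawler. The sketch below shows one way to test that with Python's standard `urllib.robotparser`. The summary does not name the crawler or describe the authors' method, so the `Google-Extended` token, the sample robots.txt, the `allows_agent` helper, and the example.com URL are all illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one AI-related token, allows everyone else.
# "Google-Extended" is assumed here; the summary does not name the crawler.
SAMPLE_ROBOTS = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allows_agent(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse from text, no network call
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"  # placeholder URL
    for agent in ("Google-Extended", "Googlebot"):
        print(agent, allows_agent(SAMPLE_ROBOTS, agent, page))
```

A site with rules like these stays reachable by the regular search crawler while opting out for the AI token, which is the kind of configuration the finding concerns.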

Why it matters

This paper documents how generative AI is changing web search, affecting both the information users receive and the visibility websites gain. It argues for new revenue frameworks that keep the ecosystem sustainable for publishers and generative search providers.

Original Abstract

Generative AI is being increasingly integrated into web search for the convenience it provides users. In this work, we aim to understand how generative AI disrupts web search by retrieving and presenting the information and sources differently from traditional search engines. We introduce a public benchmark dataset of 11,500 user queries to support our study and future research of generative search. We compare the search results returned by Google's search engine, the accompanying AI Overview (AIO), and Gemini Flash 2.5 for each query. We have made several key findings. First, we find that for 51.5% of representative, real-user queries, AIOs are generated, and are displayed above the organic search results. Controversial questions frequently result in an AIO. Second, we show that the retrieved sources are substantially different for each search engine (<0.2 average Jaccard similarity). Traditional Google search is significantly more likely to retrieve information from popular or institutional websites in government or education, while generative search engines are significantly more likely to retrieve Google-owned content. Third, we observe that websites that block Google's AI crawler are significantly less likely to be retrieved by AIOs, despite having access to the content. Finally, AIOs are less consistent when processing two runs of the same query, and are less robust to minor query edits. Our findings have important implications for understanding how generative search impacts website visibility, the effectiveness of generative engine optimization techniques, and the information users receive. We call for revenue frameworks to foster a sustainable and mutually beneficial ecosystem for publishers and generative search providers.
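
The Jaccard similarity cited in the abstract is the size of the intersection of two sets divided by the size of their union. A minimal sketch of how it could be computed for two result lists follows. Comparing at the domain level, stripping a leading `www.`, and the example URLs are assumptions made for illustration; the abstract does not say whether the authors compare full URLs or domains.

```python
from urllib.parse import urlparse


def domain(url: str) -> str:
    """Lower-cased host of a URL, with a leading 'www.' removed (assumed normalization)."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def jaccard(a, b) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two iterables treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(set_a & set_b) / len(union)


def source_overlap(urls_a, urls_b) -> float:
    """Jaccard similarity between the domains cited by two result lists."""
    return jaccard(map(domain, urls_a), map(domain, urls_b))


if __name__ == "__main__":
    # Made-up result lists, not data from the paper.
    organic = [
        "https://www.cdc.gov/flu",
        "https://www.mayoclinic.org/flu",
        "https://en.wikipedia.org/wiki/Influenza",
    ]
    overview = [
        "https://www.youtube.com/watch?v=abc",
        "https://www.cdc.gov/flu/about",
        "https://support.google.com/answer",
    ]
    print(source_overlap(organic, overview))  # 1 shared domain out of 5 distinct
```

In this made-up example the two lists share one domain out of five distinct ones, giving 0.2, the level the abstract reports as an upper bound on the average.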
