{"id":664,"date":"2026-05-07T04:25:41","date_gmt":"2026-05-07T04:25:41","guid":{"rendered":"https:\/\/rejupillai.com\/?p=664"},"modified":"2026-05-07T05:28:08","modified_gmt":"2026-05-07T05:28:08","slug":"gemini-with-google-search","status":"publish","type":"post","link":"https:\/\/rejupillai.com\/index.php\/2026\/05\/07\/gemini-with-google-search\/","title":{"rendered":"Where are the Citations? Decoding Gemini w\/ Google Search Grounding"},"content":{"rendered":"\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-1 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-large is-style-default\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"572\" data-id=\"635\" src=\"https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/04\/Gemini_Generated_Image_91iygm91iygm91iy-1024x572.png\" alt=\"\" class=\"wp-image-635\" srcset=\"https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/04\/Gemini_Generated_Image_91iygm91iygm91iy-1024x572.png 1024w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/04\/Gemini_Generated_Image_91iygm91iygm91iy-300x167.png 300w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/04\/Gemini_Generated_Image_91iygm91iygm91iy-768x429.png 768w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/04\/Gemini_Generated_Image_91iygm91iygm91iy-1536x857.png 1536w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/04\/Gemini_Generated_Image_91iygm91iygm91iy-2048x1143.png 2048w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/04\/Gemini_Generated_Image_91iygm91iygm91iy-1170x653.png 1170w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Before native LLM search tools existed, developers building RAG pipelines for the web had to wire up search infrastructure manually. Building a &#8220;Chat with the Web&#8221; application followed a predictable, painful pattern. 
You&#8217;d spin up a Programmable Search Engine (PSE) or hit the Custom Web Search APIs, scrape the resulting URLs, parse the raw HTML, chunk the text, compute embeddings, load them into a vector database, and finally prompt your LLM to generate an answer.<\/p>\n\n\n\n<p>It was brittle, latency-heavy, and hard to maintain. Today, we are witnessing a paradigm shift. Scraping the web and building your own Search-RAG pipeline are becoming a thing of the past. The new standard? Native LLM Web Grounding.<\/p>\n\n\n\n<p>If you\u2019ve used Google\u2019s conversational AI apps like Gemini, Gemini Enterprise, or NotebookLM, you\u2019re likely familiar with their built-in citations. But when building with the <strong>Gemini API<\/strong>, you won&#8217;t find this data in the main text. It&#8217;s tucked away in the response metadata, and you have to extract it and display it correctly yourself. In this post, we\u2019ll explore the mechanics of Gemini\u2019s built-in <a href=\"https:\/\/docs.cloud.google.com\/vertex-ai\/generative-ai\/docs\/grounding\/grounding-with-google-search\">Google Search Grounding<\/a> and dissect how to properly implement byte-indexed inline citations.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Search Grounding solutions in the market<\/h2>\n\n\n\n<p><a href=\"https:\/\/docs.perplexity.ai\/docs\/sonar\/quickstart\">Perplexity Sonar<\/a> is a great solution for the most polished grounded-answer experience with citations. <a href=\"https:\/\/docs.cloud.google.com\/vertex-ai\/generative-ai\/docs\/grounding\/grounding-with-google-search\">Google Gemini Search Grounding<\/a> is best if you are already in the Google Cloud ecosystem, want enterprise-grade search-backed grounding, and of course if you trust the quality of Google web search. 
<a href=\"https:\/\/exa.ai\/docs\/reference\/search\">Exa<\/a> is a strong developer-focused retrieval option for RAG and research pipelines, <a href=\"https:\/\/brave.com\/blog\/ai-grounding\/\">Brave AI Grounding<\/a> is the best low-cost, privacy-friendly choice with straightforward pricing, and <a href=\"https:\/\/www.tavily.com\/\">Tavily<\/a> is a practical, lightweight option for agent-style search workflows where quick integration matters most.<\/p>\n\n\n\n<p>While researching, I discovered that Gemini also <a href=\"https:\/\/docs.cloud.google.com\/gemini-enterprise-agent-platform\/models\/grounding\/grounding-with-exa\">integrates<\/a> with Exa, bringing the quality and control sought after by many developers.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Gemini Grounding w\/ Google Search<\/h2>\n\n\n\n<p>Gemini Grounding with Google Search natively connects the model to real-time web data to reduce hallucinations. It outperforms traditional custom Web Search RAG pipelines by offering built-in dynamic retrieval (only searching when necessary), automated citation mapping, and a fully managed architecture that eliminates the need to build and maintain complex query generation and context-injection workflows.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Real-Time Knowledge: Accesses live web data to anchor responses in current events, bypassing knowledge cutoffs.<\/li>\n\n\n\n<li>Dynamic Retrieval: Uses a configurable threshold to intelligently search only when necessary (e.g., skipping searches for static facts).<\/li>\n\n\n\n<li><span style=\"text-decoration: underline;\">Native Citations: Returns a structured groundingMetadata object that precisely links the model&#8217;s text to source URLs.<\/span><\/li>\n\n\n\n<li>Enterprise Privacy: Supports VPC Service Controls and regional processing, ensuring queries are not logged or used for model training.<\/li>\n\n\n\n<li>Search Entry Points: Natively generates compliant HTML\/CSS Google Search suggestion chips 
for easy UI integration.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">How Google Search Grounding Works<\/h2>\n\n\n\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-2 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"595\" data-id=\"674\" src=\"https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/05\/google-search-tool-overview-1024x595.png\" alt=\"\" class=\"wp-image-674\" srcset=\"https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/05\/google-search-tool-overview-1024x595.png 1024w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/05\/google-search-tool-overview-300x174.png 300w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/05\/google-search-tool-overview-768x446.png 768w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/05\/google-search-tool-overview-1536x893.png 1536w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/05\/google-search-tool-overview-1170x680.png 1170w, https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/05\/google-search-tool-overview.png 1860w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/figure>\n\n\n\n<p><a href=\"https:\/\/ai.google.dev\/gemini-api\/docs\/googlesearch#how_grounding_with_google_search_works\">source<\/a> <\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Implementing Inline Citations w\/ Gemini API<\/h2>\n\n\n\n<p><strong>Step 1 : To make grounding effective, you must instruct the model on how to behave.<\/strong><\/p>\n\n\n\n<script src=\"https:\/\/gist.github.com\/rejupillai\/eafe8cb452bf944f15685f8ead4eb893.js?file=1.py\"><\/script>\n\n\n\n<p>By setting <code>temperature=0.0<\/code> and applying this strict system instruction, we force the model into an analytical, verification-first mindset, drastically reducing the chance of ungrounded hallucinations.<\/p>\n\n\n\n<p><strong>Step 2 : Configure the API w\/ google_search 
tool call.<\/strong><\/p>\n\n\n\n<p>Unlike the legacy Custom Search API, there is no need to configure a search engine ID, manage a separate API key, or parse JSON snippets. You simply pass the tool to the configuration:<\/p>\n\n\n\n<script src=\"https:\/\/gist.github.com\/rejupillai\/eafe8cb452bf944f15685f8ead4eb893.js?file=2.py\"><\/script>\n\n\n\n<p><strong>Step 3 : Decoding the Grounding Metadata<\/strong><\/p>\n\n\n\n<p>When Gemini returns a grounded response, it populates <code>response.candidates[0].grounding_metadata<\/code>. This metadata contains three vital components:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong><code>grounding_chunks<\/code> (The Bibliography):<\/strong> A list of all the web sources (URLs and titles) the model retrieved and read.<\/li>\n\n\n\n<li><strong><code>search_entry_point<\/code> (Compliance):<\/strong> HTML content for a Google Search chip. Displaying this is a strict compliance requirement when using the Google Search grounding tool.<\/li>\n\n\n\n<li><strong><code>grounding_supports<\/code> (The Citations):<\/strong> This is the magic. Through its <code>grounding_chunk_indices<\/code>, each entry tells you exactly which segment of the model&#8217;s text is supported by which chunks in the bibliography.<\/li>\n<\/ol>\n\n\n\n<p><strong>Step 4 : The Complexity of Byte-Index Insertion<\/strong><\/p>\n\n\n\n<p>The most technical part of the provided code is the <code>add_inline_citations<\/code> function.<\/p>\n\n\n\n<p>Each entry in <code>grounding_supports<\/code> carries a segment whose <code>end_index<\/code> marks where a citation should be placed. However, <strong>these indices are calculated using bytes, not characters<\/strong>. 
If a generated response includes emojis or other non-ASCII characters (which take up multiple bytes in UTF-8), standard Python string slicing, which counts characters rather than bytes, will place the citations in the wrong spot, breaking your text.<\/p>\n\n\n\n<p>Here is how the code safely solves this:<\/p>\n\n\n\n<script src=\"https:\/\/gist.github.com\/rejupillai\/eafe8cb452bf944f15685f8ead4eb893.js?file=3.py\"><\/script>\n\n\n\n<p><strong>Why sort in descending order?<\/strong><\/p>\n\n\n\n<p>If we insert citations from the beginning of the text (left-to-right), inserting the first citation will increase the length of the string, shifting all subsequent byte indices and causing the remaining citations to be placed incorrectly. By iterating backward (right-to-left), we can safely mutate the byte array without affecting the index positions of the text that comes before it.<\/p>\n\n\n\n<p><strong>Step 5 : Build the search entry point<\/strong><\/p>\n\n\n\n<script src=\"https:\/\/gist.github.com\/rejupillai\/eafe8cb452bf944f15685f8ead4eb893.js?file=4.py\"><\/script>\n\n\n\n<p><strong>Step 6 : Build the Bibliography of sources<\/strong><\/p>\n\n\n\n<script src=\"https:\/\/gist.github.com\/rejupillai\/eafe8cb452bf944f15685f8ead4eb893.js?file=5.py\"><\/script>\n\n\n\n<p><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Demo App<\/h2>\n\n\n\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-3 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1920\" height=\"940\" data-id=\"670\" src=\"https:\/\/rejupillai.com\/wp-content\/uploads\/2026\/05\/citation-3.gif\" alt=\"\" class=\"wp-image-670\"\/><\/figure>\n<\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">References<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>GitHub <a href=\"https:\/\/github.com\/rejupillai\/gwgs.git\" data-type=\"link\" data-id=\"https:\/\/github.com\/rejupillai\/gwgs\/blob\/main\/README.md\">Link<\/a> to set up 
the Demo code <\/li>\n\n\n\n<li>Google <a href=\"https:\/\/docs.cloud.google.com\/vertex-ai\/generative-ai\/docs\/grounding\/grounding-with-google-search\">documentation<\/a> <\/li>\n\n\n\n<li><a href=\"https:\/\/cloud.google.com\/gemini-enterprise-agent-platform\/generative-ai\/pricing\">Pricing<\/a> <\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Before native LLM search tools existed, developers building RAG pipelines for the web had to wire up search infrastructure manually, building a &#8220;Chat with the Web&#8221; application followed a predictable, painful pattern. You&#8217;d spin up a Programmable Search Engine (PSE) or hit the Custom Web Search APIs, scrape the resulting<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[],"class_list":["post-664","post","type-post","status-publish","format-standard","hentry","category-aiagents","ct-col-2"],"_links":{"self":[{"href":"https:\/\/rejupillai.com\/index.php\/wp-json\/wp\/v2\/posts\/664","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rejupillai.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rejupillai.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rejupillai.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rejupillai.com\/index.php\/wp-json\/wp\/v2\/comments?post=664"}],"version-history":[{"count":13,"href":"https:\/\/rejupillai.com\/index.php\/wp-json\/wp\/v2\/posts\/664\/revisions"}],"predecessor-version":[{"id":682,"href":"https:\/\/rejupillai.com\/index.php\/wp-json\/wp\/v2\/posts\/664\/revisions\/682"}],"wp:attachment":[{"href":"https:\/\/rejupillai.com\/index.php\/wp-json\/wp\/v2\/media?parent=664"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rejupillai
.com\/index.php\/wp-json\/wp\/v2\/categories?post=664"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rejupillai.com\/index.php\/wp-json\/wp\/v2\/tags?post=664"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}