Google On Search Console Noindex Detected Errors


Google’s John Mueller answered a question about a seemingly false “noindex detected in X-Robots-Tag HTTP header” error reported in Google Search Console for pages that do not have that X-Robots-Tag or any other related directive or block. Mueller suggested some possible reasons, and other Redditors provided reasonable explanations and solutions.

Noindex Detected

The person who started the Reddit discussion described a scenario that may be familiar to many. Google Search Console reported that it couldn’t index a page because the page was blocked from indexing (which is different from being blocked from crawling). Inspecting the page revealed no noindex meta element, and there was no robots.txt rule blocking crawling.

Here’s how they described their situation:

  • “GSC shows ‘Noindex detected in X-Robots-Tag HTTP header’ for a large number of my URLs. However:
  • I can’t find any noindex in the HTML source
  • No noindex in robots.txt
  • No noindex visible in the response headers when I test
  • Live Test in GSC shows the page as indexable
  • The site is behind Cloudflare (we’ve checked Page Rules/WAF, etc.)”

They also reported that they tried spoofing Googlebot, tested different IP addresses and request headers, and still found no clue to the source of the X-Robots-Tag.
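That kind of check can be reproduced outside of GSC with a short script that fetches a page and prints any X-Robots-Tag header and robots meta tag it finds. This is a minimal sketch rather than anything from the thread; the URL is a placeholder, and the Googlebot user-agent string is only spoofed, so rules keyed to Google’s IP addresses won’t be triggered.

    import re
    import requests

    # Hypothetical URL standing in for one of the pages flagged in GSC.
    URL = "https://example.com/flagged-page"

    resp = requests.get(
        URL,
        headers={
            # Spoofed Googlebot user agent, as the poster described trying.
            "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
            # Ask intermediate caches to pass the request through.
            "Cache-Control": "no-cache",
        },
        timeout=30,
    )

    # The header GSC claims to have found.
    print("Status:", resp.status_code)
    print("X-Robots-Tag:", resp.headers.get("X-Robots-Tag", "not present"))

    # A robots meta tag in the HTML could also carry a noindex directive.
    meta = re.search(r'<meta[^>]+name=["\']robots["\'][^>]*>', resp.text, re.IGNORECASE)
    print("robots meta tag:", meta.group(0) if meta else "not present")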

Suspecting Cloudflare

One of the Redditors commented in that discussion with suggestions for troubleshooting whether the problem originated with Cloudflare.

They offered comprehensive step-by-step instructions for diagnosing whether Cloudflare or something else is preventing Google from indexing the page:

“First, compare the live test with the crawled page in GSC to check whether Google is seeing an outdated response. Then review Cloudflare’s Transform Rules, Response Header modifications, and Workers. Use curl with a Googlebot user-agent and a cache bypass (Cache-Control: no-cache), and disable any SEO plugins that dynamically insert headers.”

The OP (original poster, the one who started the discussion) replied that they had tried all of those solutions but were unable to test a cached version of the site via GSC, only the live version (served from the actual server, not Cloudflare).
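As a rough illustration of the curl suggestion above, the same comparison can be scripted: request the flagged URL with a normal browser user-agent and with a Googlebot user-agent, bypass the cache, and compare the X-Robots-Tag header in the two responses. The URL and user-agent strings below are placeholders, not details from the thread.

    import requests

    URL = "https://example.com/flagged-page"  # placeholder URL

    USER_AGENTS = {
        "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    }

    for label, ua in USER_AGENTS.items():
        resp = requests.get(
            URL,
            headers={"User-Agent": ua, "Cache-Control": "no-cache"},
            timeout=30,
        )
        # If X-Robots-Tag appears only for the Googlebot request, something at
        # the edge (a Transform Rule, Worker, or plugin) is adding it conditionally.
        tag = resp.headers.get("X-Robots-Tag", "not present")
        print(f"{label:>9}: status={resp.status_code}  X-Robots-Tag={tag}")

Note that a spoofed user agent still originates from your own IP address, so any rule keyed to Google’s IP ranges won’t fire. That is where the next step comes in.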

How to test with the actual Googlebot

Interestingly, the OP stated that they could not test their website using Googlebot, but there is actually a way to do it.

Google’s Rich Results Test uses the Googlebot user agent, and its requests also originate from Google IP addresses. This makes the tool useful for checking what Google actually sees. If a hack is causing the site to serve a cloaked page, the Rich Results Test will reveal exactly what Google is indexing.

Google’s Rich Results Test support page confirms:

“This tool accesses a page as Googlebot (that is, not using your credentials, but as Google).”

401 Error Response?

The following is probably not the solution here, but it’s an interesting bit of technical SEO knowledge.

Another user shared their experience with a server that responded with a 401 error. A 401 response means “unauthorized,” and it occurs when a request for a resource lacks authentication credentials or the credentials provided are not valid. Their solution to the blocked-indexing messages in Google Search Console was to add an entry to robots.txt blocking crawling of those URLs.
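Assuming the affected pages are login-protected URLs under a path such as /account/ (a hypothetical example, not from the thread), a quick scan like the following could identify which reported URLs return 401 and are therefore candidates for a robots.txt Disallow rule.

    import requests

    # Hypothetical URLs standing in for the pages reported as blocked in GSC.
    urls = [
        "https://example.com/account/settings",
        "https://example.com/account/orders",
    ]

    for url in urls:
        resp = requests.get(url, timeout=30)
        if resp.status_code == 401:
            # Pages requiring authentication return 401 to Googlebot. Blocking the
            # path from crawling in robots.txt, for example:
            #   User-agent: *
            #   Disallow: /account/
            # stops Googlebot from requesting them and triggering the GSC report.
            print(f"401 Unauthorized: {url} -> candidate for a robots.txt Disallow")
        else:
            print(f"{resp.status_code}: {url}")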

Google’s John Mueller on GSC Error

John Mueller dropped into the discussion to offer his help in diagnosing the problem. He said he has seen the issue come up in connection with CDNs (content delivery networks). Interestingly, he also said he has seen it happen with very old URLs. He didn’t elaborate on that last point, but it seems to imply some kind of indexing bug associated with old indexed URLs.

Here’s what he said:

“Happy to take a look if you want to ping me some samples. I’ve seen it with CDNs, I’ve seen it with really old crawls (when the issue was there a long time ago, and a site just has a lot of ancient URLs indexed), maybe there’s something new here…”

Key Takeaways: Google Search Console Noindex Detected Errors

  • Google Search Console (GSC) can report “noindex detected in X-Robots-Tag HTTP header” even when that header is not present.
  • CDNs such as Cloudflare can interfere with indexing. Steps were shared for checking whether Cloudflare’s Transform Rules, response headers, or cache affect how Google sees the page.
  • Outdated indexing data on Google’s side can also be a factor.
  • Google’s Rich Results Test can verify what Googlebot sees because it uses Googlebot’s user agent and IP addresses, uncovering discrepancies that may not be visible when only spoofing the user agent.
  • 401 Unauthorized responses can prevent indexing. One user shared that their problem involved login pages that needed to be blocked via robots.txt.
  • John Mueller suggested CDNs and historically crawled URLs as possible causes.


