Understand if your site has a Crawl Budget problem

Once we have determined the site's crawl budget, i.e. how much work Googlebot does on our site and how often it visits us, we can check whether the site actually has a crawl budget problem. To quickly determine this, follow these steps (a short calculation sketch follows the list):

1. Determine the number of pages submitted in the sitemap (for our site, approximately 650 pages).
2. Determine the number of URLs crawled daily (as we saw before, in our case approximately 512 pages/day).
3. Divide the number of pages in the sitemap by the number of pages crawled per day: in our case 650 / 512 ≈ 1.27. This ratio tells us how many days, on average, pass before Googlebot returns to visit the same page.
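
To make the arithmetic concrete, here is a minimal Python sketch. The 650 and 512 figures are the illustrative values from this post (substitute your own sitemap size and daily crawl rate from Search Console), and the thresholds encode the rule of thumb discussed next.

```python
# Minimal sketch: estimate how often Googlebot revisits the average page.
# sitemap_pages and crawled_per_day are the illustrative figures from this
# post - replace them with your own numbers from Search Console crawl stats.

def crawl_ratio(sitemap_pages: int, crawled_per_day: float) -> float:
    """Average number of days for Googlebot to revisit the same page."""
    return sitemap_pages / crawled_per_day

ratio = crawl_ratio(sitemap_pages=650, crawled_per_day=512)
print(f"Average revisit interval: ~{ratio:.2f} days")  # ~1.27

# Rule of thumb from this post: > 5 hints at a crawl problem,
# > 10 means new or changed pages may take ~10 days to be detected.
if ratio > 10:
    print("Crawl budget problem: changes may wait ~10 days to be seen.")
elif ratio > 5:
    print("Possible crawl problem - worth investigating.")
else:
    print("No particular critical issues.")
```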


If the value is high (> 5), you may have a crawl problem. In our case we have seen that the Google crawler is able to crawl the entire site, on average, in a couple of days at most, so no particular critical issues are identified. A value above 10 would be a different matter: it would mean that, on average, each new page (or change) on the site could take up to 10 days to be detected.

Factors that influence the Crawl Budget

The problems that negatively impact site crawling, and therefore the crawl budget, are essentially of the following types:

Hacked pages

It goes without saying that websites containing hacked content and potential threats to user security are penalized: Google does not want to waste time crawling and indexing such pages. We always recommend keeping the CMS and its plugins up to date.

Infinite spaces and proxies

When Googlebot crawls the web, it often encounters what is usually referred to as "infinite space". This is a very large number of links that provide little or no new content for Googlebot to index. If this happens on your site, crawling those URLs may consume unnecessary bandwidth and may prevent Googlebot from fully indexing the real content on your site, wasting resources on low-value pages.

The classic example of "infinite space" is a calendar with links to the following months: Googlebot could follow those "Next Month" links forever, always reaching pages of little value and content.

Another common scenario is filters on e-commerce websites that allow you to view the same products in multiple ways. An online clothing site might let you select and filter items by category, price, color, brand, style, and so on. The number of possible filter combinations grows multiplicatively with each additional facet, so filters that create dynamic pages in this way can produce thousands of URLs, each listing a subset of the items sold. This may be convenient for your users, but it is not useful for Googlebot, which only wants to find everything once. The sketch below makes the arithmetic concrete.
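
To show how quickly faceted filters multiply into crawlable URLs, here is a short sketch; the facet names and counts below are hypothetical, purely for illustration, not taken from any real site.

```python
# Sketch of the combinatorial explosion from faceted filters.
# Facet names and counts are hypothetical illustration values.
from math import prod

facets = {
    "category": 10,
    "price_band": 5,
    "color": 12,
    "brand": 20,
    "style": 6,
}

# Every combination of one value per facet is a distinct crawlable URL.
print(f"{prod(facets.values()):,} filter-combination URLs")  # 72,000

# Counting "facet left unset" as one more choice per facet is worse still.
print(f"{prod(n + 1 for n in facets.values()):,} URLs incl. unset facets")  # 126,126
```

A common mitigation is to keep crawlers out of these parameterized URL spaces, for example with robots.txt Disallow rules for the filter parameters or rel="canonical" links pointing at the unfiltered listing page.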
