Google explains what crawl budget means
Gary Illyes, an analyst in Google's search quality department, has published a detailed post on crawl budget. In it he explains what the term means, which factors affect crawl budget, and what crawl rate limit and crawl demand are.
According to Gary, crawl budget is something most sites simply do not need to worry about. Only large sites need to pay attention to it.
"Prioritizing what to crawl, when, and how much resource the server hosting the site can allocate to crawling is more important for bigger sites, or those that auto-generate pages based on URL parameters, for example," said Gary.
The crawl rate limit exists to ensure that Google does not crawl too many pages too quickly, which avoids unnecessary load on the server.
Crawl demand is the number of pages that Google wants to crawl. It is based on the popularity of the site's pages and on the freshness of their content in the search index.
Crawl budget combines the crawl rate limit and crawl demand. Google defines crawl budget as the number of URLs that Googlebot can and wants to crawl.
Factors influencing crawl budget
Google has found that a large number of low-quality pages can negatively affect a site's crawling and indexing. These pages fall into the following categories (in descending order of importance):
- Faceted navigation and session IDs;
- Pages that return a soft 404 error;
- Hacked pages;
- Low-quality and spam content;
- URLs that create infinite spaces (such as calendars).
More about faceted navigation: https://docs.microsoft.com/en-us/azure/search/search-faceted-navigation
Wasting server resources on such pages reduces crawl activity on pages that are actually valuable. Ultimately, this can mean that the site's quality content is indexed with a delay.
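As a rough illustration of the first category in the list above, the sketch below labels URLs whose query strings carry session IDs or facet filters. It is only a minimal sketch: the parameter names in SESSION_PARAMS and FACET_PARAMS and the function classify_url are assumptions made up for this example, not anything Google publishes, and a real site would need its own lists.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical parameter names, chosen for illustration only.
# Replace them with the parameters your own site actually uses.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}
FACET_PARAMS = {"color", "size", "sort", "brand", "price"}


def classify_url(url: str) -> str:
    """Return a rough label for a URL based on its query parameter names."""
    query = urlsplit(url).query
    # Collect lower-cased parameter names, keeping parameters with empty values.
    names = {name.lower() for name, _ in parse_qsl(query, keep_blank_values=True)}
    if names & SESSION_PARAMS:
        return "session-id"
    if names & FACET_PARAMS:
        return "faceted"
    return "ok"


if __name__ == "__main__":
    sample_urls = [
        "https://example.com/shoes?color=red&size=42",
        "https://example.com/cart?PHPSESSID=abc123",
        "https://example.com/about",
    ]
    for url in sample_urls:
        print(classify_url(url), url)
```

Running a check like this over a server log or a sitemap could give a first estimate of how many crawled URLs fall into this category. It does not detect soft 404s, hacked pages, or infinite spaces.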
Crawl budget FAQ: https://seoheronews.com/google-crawl-faq