PRO Sitemaps Crawler Bot
The PRO Sitemaps service uses crawler bots (also known as "robots" or "spiders") to discover the structure of a website and create lists of its pages in different formats, known as "sitemaps".
Our crawler bot sends individual requests to load website pages, analyzes the responses, finds internal links on those pages, and continues the process until no new pages can be discovered, unless restricted in some way by the website owner.
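Conceptually, this discovery process resembles a breadth-first traversal of a site's internal links. The sketch below illustrates the general idea in Python; it is a simplified illustration only, not our actual crawler, and it omits robots.txt handling, rate limiting and sitemap generation (the function and parameter names are hypothetical):

    import urllib.request
    from urllib.parse import urljoin, urlparse
    from html.parser import HTMLParser

    class LinkExtractor(HTMLParser):
        """Collect href values from <a> tags on a page."""
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(start_url, max_pages=50):
        """Breadth-first discovery of same-host pages, starting from start_url."""
        host = urlparse(start_url).netloc
        seen = {start_url}
        queue = [start_url]
        pages = []
        while queue and len(pages) < max_pages:
            url = queue.pop(0)
            try:
                with urllib.request.urlopen(url, timeout=10) as resp:
                    html = resp.read().decode("utf-8", errors="replace")
            except OSError:
                continue  # skip pages that fail to load
            pages.append(url)
            parser = LinkExtractor()
            parser.feed(html)
            for href in parser.links:
                absolute = urljoin(url, href).split("#")[0]  # resolve and drop fragments
                if urlparse(absolute).netloc == host and absolute not in seen:
                    seen.add(absolute)
                    queue.append(absolute)
        return pages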
The crawling process is initiated at the request of the website entry owner, according to the settings in their PRO Sitemaps Configuration, which allow restricting the crawler's schedule and limiting the crawling rate (the number of requests per time interval).
Our crawler bot respects directives found in the website's robots.txt file, ignoring all disallowed URLs. Additionally, it respects "robots" meta tags and rel="nofollow" attributes found in a page's source code.
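For example, a site owner can keep crawlers out of a directory with a standard robots.txt rule such as the following (a generic illustration; the path is hypothetical, and this rule applies to all compliant crawlers, not only ours):

    User-agent: *
    Disallow: /private/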
The crawling rate is automatically reduced upon receiving a 429 Too Many Requests HTTP response code.
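As an illustration, a site can slow the bot down simply by returning HTTP 429 once a client exceeds some request threshold. The sketch below shows one way such a server-side check could be expressed in Python; the window size, limit and function name are hypothetical and not tied to any particular web server:

    import time
    from collections import deque

    WINDOW_SECONDS = 60   # hypothetical: size of the rate-limiting window
    MAX_REQUESTS = 120    # hypothetical: requests allowed per window per client
    _recent: dict[str, deque] = {}

    def status_for_request(client_ip: str) -> int:
        """Return 429 if this client exceeded the window limit, otherwise 200."""
        now = time.time()
        hits = _recent.setdefault(client_ip, deque())
        while hits and now - hits[0] > WINDOW_SECONDS:
            hits.popleft()  # drop requests that fell outside the window
        hits.append(now)
        return 429 if len(hits) > MAX_REQUESTS else 200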
To enable an efficient crawling process, our crawler bot uses multiple servers to send requests. Our main crawler servers are located in the UK; the current list of server IP addresses can be found in this file and is listed here:
85.92.66.149
81.19.188.235
81.19.188.236
85.92.66.150
Our bot uses the following "User-Agent" identification header:
Mozilla/5.0 (compatible; Pro Sitemaps Generator; pro-sitemaps.com) Gecko Pro-Sitemaps/1.0
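Site owners who want to verify that a request claiming to be our bot is genuine can check both the source IP against the list above and the User-Agent header. A minimal sketch in Python (the function name is hypothetical, and the IP set must be kept in sync with the published list):

    # IPs and User-Agent substring taken from the details published above.
    PRO_SITEMAPS_IPS = {
        "85.92.66.149",
        "81.19.188.235",
        "81.19.188.236",
        "85.92.66.150",
    }

    def is_pro_sitemaps_bot(remote_ip: str, user_agent: str) -> bool:
        """Return True if the request matches the published bot IPs and User-Agent."""
        return remote_ip in PRO_SITEMAPS_IPS and "Pro-Sitemaps" in user_agent

    # Example:
    ua = "Mozilla/5.0 (compatible; Pro Sitemaps Generator; pro-sitemaps.com) Gecko Pro-Sitemaps/1.0"
    print(is_pro_sitemaps_bot("85.92.66.149", ua))  # True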
Please contact us if you need any more details or have a request regarding the functioning of our crawler bot.