Monday, December 1, 2008

What are all the Google bots?

Well, besides the regular Googlebots, there are:


MediaBot - used to analyze AdSense pages

user agent "Mediapartners-google"


ImageBot - crawls for Image Search

user agent "googleBot-Image"


AdsBot - checking AdWords landing pages for quality

user agent "AdsBot-google"


Regarding AdWords

The bot "googleBot/2.1" has been crawling one page of my site daily. The page being crawled is the target of an AdWords campaign, so AdWords is definitely looking at target page quality. Note the upper case "Bot". "googleBot/2.1" is also the entire user agent string, and the bot has used multiple IP addresses. It also detected a change to one of my robots.txt files (I made it more liberal for Google), which appeared to trigger a complete deep crawl of the site by the conventional bot almost immediately. I believe all the bots cooperate in collecting the robots.txt file.
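For anyone wondering what a robots.txt that is "more liberal for Google" might look like, here is a minimal sketch using Python's standard urllib.robotparser. The rules and the paths (/private/, /drafts/) are made up for illustration and are not my actual file; SomeOtherBot is a hypothetical crawler name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot is kept out of /private/ only,
# while every other crawler is also kept out of /drafts/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /private/
Disallow: /drafts/
"""

def allowed(user_agent: str, path: str) -> bool:
    """Return True if the rules above let this user agent fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)

if __name__ == "__main__":
    for agent in ("googleBot/2.1", "SomeOtherBot/1.0"):
        print(agent, allowed(agent, "/drafts/post.html"))
```

Note that robotparser compares against the part of the agent string before the first "/", ignoring case, so short strings like "googleBot/2.1" match the Googlebot group. The long "Mozilla/5.0 (compatible; ...)" form would fall through to the "*" group in this parser, which is a quirk of the library and not a statement about how Google itself reads the file.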

There is another Google bot in use for mobile now:

Google Wireless Transcoder
The conventional bot has a lower case "bot": "Mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)"


Google has made it a little difficult to sort out all of their bots with one search string. "/2.1" works, but it also finds some unrelated odds and ends in the logs (and, of course, it misses googlebot-Image/1.0).
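To make the search-string problem concrete, here is a rough sketch that runs the "/2.1" search over the agent strings quoted in this post. ExampleFetcher/2.1 is a made-up agent standing in for the unrelated odds and ends, and the case-insensitive "google" pattern is just one assumed alternative, not a recommendation from Google.

```python
import re

# Agent strings quoted in this post, plus one made-up non-Google agent.
SAMPLE_AGENTS = [
    "Mediapartners-google/2.1",
    "googleBot/2.1",
    "Mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)",
    "googlebot-Image/1.0",
    "ExampleFetcher/2.1",  # hypothetical, not a real crawler
]

def plain_search(agents, needle="/2.1"):
    """Plain substring search, like grepping the log for "/2.1"."""
    return [agent for agent in agents if needle in agent]

# Assumed alternative: require "google" in any letter case.
GOOGLE_PATTERN = re.compile(r"google", re.IGNORECASE)

def google_only(agents):
    """Keep only agents that mention google somewhere in the string."""
    return [agent for agent in agents if GOOGLE_PATTERN.search(agent)]

if __name__ == "__main__":
    print(plain_search(SAMPLE_AGENTS))  # includes ExampleFetcher, misses the Image bot
    print(google_only(SAMPLE_AGENTS))   # the four Google strings only
```

Keep in mind that anyone can put "google" in an agent string, so a match here only says what the visitor claims to be.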


User agent strings extracted directly from my logs


Adsense: "Mediapartners-google/2.1"

Adwords: "googleBot/2.1"

Google-: "Mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)"

Image--: "googlebot-Image/1.0"
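If you want to label log entries automatically, the strings above can be turned into a small classifier. This is only a sketch built on my own observations: the function name classify is made up, the matching is deliberately case-sensitive so that the upper case "Bot" keeps the AdWords crawler apart from the conventional one, and "AdsBot-google" is included because it was listed earlier as the AdWords agent.

```python
def classify(user_agent: str) -> str:
    """Map a user agent string to the labels used in the list above."""
    if user_agent.startswith("Mediapartners-google"):
        return "Adsense"
    if user_agent.startswith("googlebot-Image"):
        return "Image"
    # Upper case "Bot" is what I have seen on AdWords landing pages.
    if user_agent.startswith("AdsBot-google") or "googleBot/" in user_agent:
        return "Adwords"
    # Lower case "bot" is the conventional crawler.
    if "googlebot/" in user_agent:
        return "Google"
    return "other"

if __name__ == "__main__":
    for agent in (
        "Mediapartners-google/2.1",
        "googleBot/2.1",
        "Mozilla/5.0 (compatible; googlebot/2.1; +http://www.google.com/bot.html)",
        "googlebot-Image/1.0",
    ):
        print(classify(agent), "<-", agent)
```

If your logs show different letter case than mine, the string comparisons would need adjusting.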


Of course, as mentioned, there are also Froogle, Mobile, and Feed (for RSS feeds).


Also, Google now appears to be using HTTP/1.1 exclusively; until recently there was a mix of HTTP/1.1 and HTTP/1.0. One important thing to note is that Google now always requests GZIP-compressed content if your server provides it. Your website might get an "attaboy" if you serve GZIP-compressed content to the bots. It could also cut your website's bandwidth usage and boost your site's performance.
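As a rough illustration of what serving GZIP on request involves, here is a small sketch using Python's standard gzip module. The function name encode_body is made up, and the Accept-Encoding handling is simplified (it ignores q-values such as "gzip;q=0"), so treat it as an outline and not as how any particular web server does it.

```python
import gzip

def encode_body(body: bytes, accept_encoding: str):
    """Return (payload, extra_headers), compressing only if gzip was requested.

    Simplification: any mention of the gzip token counts as a request for it.
    """
    offered = [
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    ]
    if "gzip" in offered:
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    return body, {}

if __name__ == "__main__":
    page = b"<p>hello bots</p>" * 200  # repetitive markup compresses well
    payload, headers = encode_body(page, "gzip,deflate")
    print(len(page), "bytes before,", len(payload), "bytes after", headers)
```

In practice most web servers can do this through configuration, so hand-written code like this is mainly useful for seeing how much a given page shrinks.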
