Wabler on 15 Feb 2007 05:17 pm
The longest robots.txt file ever? or a list of WhiteHouse.gov’s hidden gems
While searching for the rules and regs on robots.txt, I found an interesting site called robotstxt.org. I feel that site may be a little outdated…
Three or four results down in Google, I see that the Whitehouse.gov/robots.txt file has been indexed and shows up if you search for robots.txt. Interesting.
So I have a look.
It is 1879 lines long.
It includes such things as:
Disallow: /barney/newmedia/text
Disallow: /barney/photoessay/beazley/text
Disallow: /barney/photoessay/text
Disallow: /barney/photoessay2/text
Disallow: /barney/text
Disallow: /barneycam/text
WTF!? What is Barney doing ANYWHERE near my Government? Get that purple bastard away from George W.! If we aren’t careful, they will get into a round of “I love you”, and W will forget to take the troops out of Iraq. OH WAIT!!
OK OK, I realize that Barney is another one of their dogs, but is it absolutely necessary to have pictures of the dogs on what is supposed to be the government’s homepage?
Apparently there is a WhiteHouse.gov focus group:
Disallow: /infocus/httpwwwwhitehousegovinfocusg/text
And Miss Beazley has her own pages too!
Disallow: /missbeazley/text
So, I know that the government has to try to make the White House site fun and exciting, but if they don’t let internet spiders like Google’s into these parts of the site, how are any of us supposed to ever find any of this stuff?
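For the curious, here is a rough sketch of how a well-behaved spider might read a few of the rules quoted above, using Python's standard urllib.robotparser module. The "User-agent: *" line and the "Googlebot" name are my assumptions (the excerpt doesn't show which crawlers the rules apply to), and this only covers three of the 1879 lines, so treat it as an illustration rather than a verdict on the real file.

```python
from urllib.robotparser import RobotFileParser

# Three rules copied from the excerpt above. The "User-agent: *" line is an
# assumption: the excerpt does not show which crawlers the rules target.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /barney/text
Disallow: /barneycam/text
Disallow: /missbeazley/text
"""


def build_parser(robots_text):
    """Parse robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_EXCERPT)

# Disallow rules are matched as path prefixes, so only URLs whose path
# starts with a listed prefix are off limits.
for path in ("/barney/", "/barney/text", "/barney/text/index.html", "/missbeazley/"):
    allowed = parser.can_fetch("Googlebot", "http://whitehouse.gov" + path)
    print(path, "allowed" if allowed else "blocked")
```

Under those assumptions, only the paths that begin with one of the /text prefixes come back blocked. The plain /barney/ and /missbeazley/ paths stay crawlable as far as these three rules go, so the rules look like they are hiding the text-only copies of the pages rather than the pages themselves.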