Facebook vs robots.txt

Found this story through someone’s buzz updates. Pete Warden had built a service that crawled public Facebook profiles (public = not blocked by robots.txt) and then displayed neat summary information about people’s social graphs (aggregated by country, plus things like who belongs to your inner circle).

In a rather evil move, Facebook decided to sue him! Their legal position was that even though the crawled pages were not protected by robots.txt, the legal validity of robots.txt had never been tested in court.

Their contention was that robots.txt had no legal force and that they could sue anyone for accessing their site, even if that person scrupulously obeyed the instructions it contained. In their view, the only legal way to access any web site with a crawler was to obtain prior written permission.
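
For anyone wondering what "scrupulously obeying" robots.txt looks like in practice, here is a minimal sketch in Python using the standard library's urllib.robotparser. The robots.txt rules, the example.com URLs and the "ExampleBot" user agent are all made up for illustration; this is not Facebook's actual file, and I have no idea how Pete's crawler was actually written.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
# It is NOT Facebook's real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A page outside the disallowed path: the crawler may fetch it.
    print(allowed("ExampleBot", "https://example.com/people/some-profile"))
    # A page under /private/: a polite crawler skips it.
    print(allowed("ExampleBot", "https://example.com/private/settings"))
```

A well-behaved crawler runs a check like this before every request and simply skips anything the site has disallowed. That is the whole point of the convention: the site owner says up front what is off limits.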

I am willing to bet that Facebook itself has developed at least one web crawler in its short lifespan so far! Even if it hasn’t, should the business need arise, the last thing they will do is obtain written permission from the owners of the websites they intend to crawl!

So here’s the irony: Pete knows he is right! Facebook knows Pete is right! But fighting lawsuits is so damn expensive in this country that a big company like Facebook can easily bully him into submission! Sadness.