Escaped Thoughts

Sat, May 31, 2003

Google is Watching

I was going to set up a Google News scraper for my personal start page, so that I could get just the headlines, and just the topics that interest me (the sports section is a waste of screen space for me). I ran into a strange 403 Forbidden when accessing it from Python, though; it turns out that Google blocks access by User-Agent, to discourage exactly what I was trying to do.

It took about 30 seconds to find code to set the User-Agent in Python's urllib module... Ironically, I found it using Google itself. However, after reading about how Google bans IPs that are caught making any kind of automated queries, I'm afraid to use my workaround. Assuming they use some sort of semi-decent AI to look for automation, regenerating my start page every hour of every day is likely to be noticed, and I would be lynched if Google banned CWRU. So I guess that will have to wait until I have a server where I can build the pages dynamically on request, instead of at set intervals. It's irritating, because this is definitely 'fair use': it's entirely for my own use. Still, I guess I understand the need for something like this.
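For the curious, the User-Agent trick looks roughly like this. This is only a minimal sketch written against present-day Python 3 (urllib.request), not necessarily the snippet mentioned above; the function names, the example URL, and the User-Agent string are all made up for illustration.

```python
import urllib.request


def build_request(url, user_agent):
    """Return a Request that sends a custom User-Agent header.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent, timeout=10):
    """Fetch the URL with the given User-Agent and return the raw bytes."""
    req = build_request(url, user_agent)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


# Example use (not run here, since it needs network access):
#   html = fetch("http://example.com/news", "StartPage/0.1")
```

Without the explicit header, urllib identifies itself with a default Python-urllib User-Agent, which is presumably the kind of string a site can match on to return a 403.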

The scary thing is, it seems like there is tremendous capacity for abuse in this draconian enforcement. What's to stop me from picking a victim IP address or domain, and sending tons of automated queries with a spoofed IP address? Bam, instant Google ban, quite possibly for the whole domain. In addition to random acts of evil, this could have really nasty corporate sabotage implications:

  1. Step 1: Open a local ISP
  2. Step 2: Pick a competing ISP, and ensure that Google bans their entire subnet
  3. Step 3: Profit! Watch the customers leave the ISP for one where they can actually perform Google searches

I'm beginning to understand why some people fear Google's power.

Category: Geek

Writebacks (0)

Two Paths Diverged

I've recently been getting the feeling of drifting away from all my old friends. It's hard to know how much is the distance, and how much is the changes in our lives. It's depressing to realize how little I know about the last few years of many people's lives. That makes it harder to talk, since there isn't that same common base of experiences and knowledge that I always took for granted when I still lived in the same town as all of my friends.

The real problem, I think, is that even with all of the advances in ease of communication that the internet has brought to us, I still haven't found a good virtual analog to hanging out.

As one of my now-departed college friends so insightfully pointed out: even if we could play Apples to Apples online, we'd still need virtual M&Ms.

Category: Life

Writebacks (0)