Like a good portion of the blogosphere, it seems, I have been playing with the new Google blog search – which, technically speaking, should be called FeedSearch, since feed content is the basis of the indexing – but no matter. Danny Sullivan has apparently been playing with it as well.
The index contains about two to three months' worth of posts, and Google claims that it will be backfilled over time. Since many feeds only list the latest n posts (10, 20, 50?), I am not sure how they will be able to do so, other than by scraping blog pages or extracting posts from the cache of their main index (?).
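To make the backfill problem concrete, here is a toy simulation of a feed that only exposes its latest n entries. Everything in it is an assumption made for illustration (the `simulate_feed` helper, the window of 10, the 30 fake posts); it says nothing about how Google actually crawls feeds.

```python
from collections import deque

def simulate_feed(posts, window, first_poll_at):
    """Return the posts an indexer collects when it polls the feed after
    every new publication, starting at index `first_poll_at`.
    The feed itself only ever exposes the latest `window` posts."""
    feed = deque(maxlen=window)  # older entries fall off automatically
    seen = []
    for i, post in enumerate(posts):
        feed.append(post)
        if i >= first_poll_at:
            for entry in feed:
                if entry not in seen:
                    seen.append(entry)
    return seen

# Hypothetical blog with 30 posts and a feed limited to 10 entries.
posts = [f"post-{i}" for i in range(1, 31)]

# The indexer discovers the feed when the 25th post is published.
indexed = simulate_feed(posts, window=10, first_poll_at=24)
missing = [p for p in posts if p not in indexed]

print(len(indexed), "posts indexed,", len(missing), "never seen via the feed")
```

In this sketch, anything that dropped out of the feed window before the first poll can never be recovered from the feed alone, which is why scraping or the main index cache look like the only ways to backfill.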
A few things caught my attention:
- Tags and categories seem to be ignored in the indexing or the ranking algorithm.
- The content of the title field for both the blog and posts seems to have a high (if not disproportionate) degree of importance in the relevancy algorithm.
- Updates are picked up pretty quickly, I’d say in less than 10 minutes after an initial ping.
- Searching for your name gets your blog to appear as a featured one – nice.
My biggest issue is that if you enter a keyword like “Venture Capital” in BlogSearch, a few blogs will be featured – and it is because they have “Venture Capital” in the blog title tag. Having checked these blogs (and I really don’t mean to pick on the bloggers writing them), I found that they have very few incoming links, if any at all. And none of the “established” VC bloggers show up in this list.
What does that mean? That anyone can create a blog with “Venture Capital” in the title and, after some time (?), see that blog featured as an authority? That is reminiscent of old (1995?) SEO tricks, isn’t it?
Even though counting incoming links is not the best or sole measure of authority, it is better than relying on a few keywords placed in a title tag. And what role does PageRank play here, if any?
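The contrast between the two signals can be sketched in a few lines. This is a deliberately naive toy, not Google's algorithm: the `Blog` class, both ranking functions, the blog names, and the link counts are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Blog:
    title: str
    inbound_links: int

def title_score(blog, query):
    """1 if the query appears verbatim in the blog title, else 0."""
    return 1 if query.lower() in blog.title.lower() else 0

def rank_by_title(blogs, query):
    """Naive ranking: title keyword match is the only signal."""
    return sorted(blogs, key=lambda b: title_score(b, query), reverse=True)

def rank_by_links(blogs):
    """Naive ranking: inbound link count is the only signal."""
    return sorted(blogs, key=lambda b: b.inbound_links, reverse=True)

# Hypothetical blogs with invented link counts.
blogs = [
    Blog("Venture Capital Daily", inbound_links=2),
    Blog("Notes from a Partner", inbound_links=850),
    Blog("Venture Capital Tips", inbound_links=0),
]

by_title = [b.title for b in rank_by_title(blogs, "venture capital")]
by_links = [b.title for b in rank_by_links(blogs)]
print(by_title)
print(by_links)
```

With title matching alone, the well-linked blog lands last simply because its title lacks the keyword, and a brand-new blog with zero links can outrank it by picking the right title.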
For the sole purpose of testing, I have opened up a feed for one of my experimental Blogger blogs (Software Only – The Cousin), and have added a few keywords to the title to see if that blog becomes featured for these keywords after a while. I will not leave it up for more than a fortnight, since this is really close to spamming in my book.
Speaking of spam: less than 5 minutes after pinging Weblogs.com for the first time ever, my Blogger blog started to receive comment spam from automated bots – at least three times quicker than the Google crawler. Insane.
There are apparently some staleness issues: Richard reports that his feed has not been indexed for almost 4 months…
Let’s not forget that this is just "out of the oven", and can’t be perfect from the outset…