As Russel points out (original article gone), the quality of Google searches is declining as a result of BlogNoise.

It seems to me that Google could easily cut a lot of blogcrap out of their search results if they performed their searches on a post-by-post basis (all the query words would have to be found in the same post) instead of a page-by-page basis (a weblog page contains an average of 15 very loosely related posts).
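To make that a bit more concrete, here is a toy sketch in Python of what post-level matching could look like. The input shape and function names are made up for illustration; the point is simply that a query only hits a page when all of its terms occur inside one single post, not scattered across the whole page.

```python
# Toy sketch of post-level indexing (hypothetical input shape and names).
# A query matches only if all its terms occur within the same post.
from collections import defaultdict

def build_post_index(pages):
    """pages: {page_url: [post_text, ...]} -- each post is indexed separately."""
    index = defaultdict(set)  # term -> {(page_url, post_number), ...}
    for url, posts in pages.items():
        for n, post in enumerate(posts):
            for term in post.lower().split():
                index[term].add((url, n))
    return index

def search(index, query):
    """Return the (page_url, post_number) pairs containing *all* query terms."""
    terms = query.lower().split()
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits
```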

How would their indexer find out where each post on a page begins and ends? Well… just let it take advantage of the RSS feed linked from any decent weblog!
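Something like the following is what I have in mind, assuming an RSS 2.0 feed with one item per post. This is only a sketch of my own, not anything Google actually does; each item could then be fed into the post index above, keyed by its permalink instead of the page URL.

```python
# Sketch: recover per-post boundaries from a weblog's RSS 2.0 feed,
# so each <item> can be indexed as its own document.
import urllib.request
import xml.etree.ElementTree as ET

def posts_from_rss(feed_url):
    """Yield (permalink, title, description) for each item in an RSS 2.0 feed."""
    with urllib.request.urlopen(feed_url) as resp:
        root = ET.fromstring(resp.read())
    for item in root.iter("item"):
        yield (
            item.findtext("link", default=""),
            item.findtext("title", default=""),
            item.findtext("description", default=""),
        )
```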

Oh wait… Blogger blogs don’t have RSS! They have a huge market share (i.e. a huge blogcrap share), and if they still haven’t implemented such a straightforward feature yet, they’re not very likely to do so soon… That’s a problem…

Semantic Web, where art thou?


Comments from long ago:

Comment from: fplanque: /dev/blog - Google & BlogNoise: the blogger’s responsiblity

[…] gn Google & BlogNoise: the blogger’s responsiblity We have talked about the annoying BlogNoise problem before. And most bloggers have agreed that Go […]

2003-05-17 20-58

Comment from: fplanque: /dev/blog

Google & BlogNoise: the blogger's responsiblity — We have talked about the annoying BlogNoise problem before. And most bloggers have agreed that Google would probably be smart enough to fix the problem shortly in order to provide a better service to their users.

A great part of the BlogNoise is gener…

2003-05-17 20-59