Archive for February, 2006

Blog measurement

Thursday, February 9th, 2006

Statistics about blogs are all the rage at the moment.

Steve Rubel says it is time that we had standards so trends can be gauged and advertisers can be appropriately gouged. He is right, but his evidence for the existence of a problem is flawed.

He complains that a study in November 2004 put readership at 58%, while a
different, more recent study put it at 20%.

Actually, the BBC 2004 report says readership is up by 58%, taking it to 27%
of online Americans. The article says 120 million Americans were online in
2004, which is about 40% of the population. That means that study was
saying 11% of the population had read a blog.
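As a rough check of that arithmetic, here is a minimal sketch that uses only the figures quoted above. The variable names are mine, and the 40% figure is the approximation from this post, not an exact census number.

```python
# Figures as quoted in the post; variable names are illustrative only.
online_americans = 120_000_000        # Americans online in 2004, per the article
online_share_of_population = 0.40     # "about 40% of the population" (approximate)
blog_readers_share_of_online = 0.27   # 27% of online Americans had read a blog

# Number of people who had read a blog, according to the 2004 study.
blog_readers = online_americans * blog_readers_share_of_online

# Share of the whole population: 27% of the roughly 40% who were online.
share_of_population = blog_readers_share_of_online * online_share_of_population

print(f"{blog_readers / 1e6:.1f} million blog readers")
print(f"{share_of_population:.1%} of the population")
```

The second figure comes out just under 11%, which is the number to compare against the later 20% result.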

Even without measurement standards, the increase from 11% “have read a blog”
to 20% read blogs “frequently” or at least “occasionally” seems fairly
consistent and believable over the space of a year or so.

One of the challenges for any sort of automated measuring is going to be spam detection. Here is an example of a real blog that has been left unattended for a while. Detecting pure link-farms seems to be a hard enough problem in search. Removing spam noise from a site that contains a mixture of real and spam comments has to be harder.

Can somebody explain read.io to me?

Wednesday, February 8th, 2006

Am I missing the point, or is it kind of … ummm … well, stupid?

It is not live yet, so I have not seen it and might be missing an important detail. As I understand it, you sign up and it takes entries from the feed of your blog and converts them into synthetic speech for others to download.

Here is a sample

I can see why that might be fun once, but except for your blind blog readers, I cannot see why it is a good thing. Even if you have a significant number of blind readers, they presumably already have screen reading software doing the conversion at the client end, which is where it should be done. Who would rather download a 5 MB MP3 file than a 5 KB HTML file?

I could kind of see some value in it if you could use it as an aggregator, grabbing new entries from a bunch of blogs that you read regularly and filling your MP3 player with them overnight so you could listen to them on the way to work. I still think the conversion should be done at the client end, but at least I could see a point to it. I don’t see why offering your readers the opportunity to download a stilted, machine-converted podcast of your writings would have more than novelty value.

Waterfall 2006 CFP Open

Tuesday, February 7th, 2006

If you are not doing anything on April 1st, and think adopting a Pig Latin naming convention could help your code progress up the Job Security Index for Software Measurement, take a look at:

Waterfall 2006

The IT Crowd Episode 2

Monday, February 6th, 2006

The show is heading in the right direction. The second episode is funnier than the first.

Strangely, although the official website still quaintly tells you to “CTRL+ALT+DELETE your TV and watch the IT Crowd online!” the downloads seem to have disappeared.

Atariboy has a list of Torrents.

Update 8/2: Episode 3 is out and pretty good, but it is not available from Channel 4 unless you are in the UK.

Online vs. Traditional Media Speed

Monday, February 6th, 2006

I saw something interesting today. For the first time I can recall, I read a nerd news story on the website of a local newspaper before I read it on Slashdot.

In this case, the paper beat them by about 14 hours.

I don’t know if this is a coincidence, or if it is a sign that online news sources are forcing old media to become more responsive. Factors in Slashdot’s defence include timezone and the fact that it happened on Super Bowl weekend. (Do nerds watch the Super Bowl now that the ad breaks are not full of dot-coms?)

Are site based search engines completely useless?

Monday, February 6th, 2006

I never use the search box on websites. This site has one, because WordPress provides it by default and it would seem silly to remove a possibly useful feature. Most big-name websites have one. Jakob Nielsen says “Search is the user’s lifeline for mastering complex websites.”

I’m coping OK without a lifeline and I have been for a few years. In fact, I am doing better than OK. I am wasting less time looking through dud results and instead finding what I want through an external search engine, which today of course means Google.

This is not really a criticism of the people who make plug-in site search engines, or even of people who choose to make their own bespoke ones. Making an internal search engine that works when people search for a phrase that is in the document heading or for uncommon keywords in the document body is pretty easy. Making one that works well when people search for common words, use different vocabulary or misspell words is much harder. You could employ an army of semi-literate squirrels to pore over a thesaurus and add metadata full of misspelled synonyms to every page, or you could just accept the fact that an army of semi-literate webmasters has already linked to your important pages using their own descriptions and allow Google to harvest that metadata.

Adding a search box is easy. Giving good results with variable input is hard.
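To make the easy/hard split concrete, here is a minimal sketch using Python's standard difflib module. The page titles, function names and the 0.8 cutoff are all assumptions made up for illustration; this is not how WordPress or any real site search works. A substring match is only a few lines, while tolerating a misspelling already needs a similarity measure and a threshold that somebody has to tune.

```python
import difflib

# Hypothetical page titles standing in for a small site's content.
PAGES = [
    "Blog measurement",
    "Waterfall 2006 CFP Open",
    "Online vs. Traditional Media Speed",
]


def exact_search(query, pages):
    """Return pages whose title contains the query as a case-insensitive substring."""
    q = query.lower()
    return [p for p in pages if q in p.lower()]


def fuzzy_search(query, pages, cutoff=0.8):
    """Return pages with at least one title word similar to the query.

    Similarity is difflib's ratio; the cutoff is an arbitrary illustrative value.
    """
    q = query.lower()
    hits = []
    for p in pages:
        words = p.lower().split()
        if difflib.get_close_matches(q, words, n=1, cutoff=cutoff):
            hits.append(p)
    return hits


# A correctly spelled query works with the easy approach.
print(exact_search("measurement", PAGES))
# A misspelled query finds nothing with substring matching...
print(exact_search("measurment", PAGES))
# ...but the fuzzy version still finds the page.
print(fuzzy_search("measurment", PAGES))
```

Even this version does nothing about different vocabulary or very common words, which is where the inbound link text that Google harvests does the real work.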

Every new internet technology can be used for vanity searches

Thursday, February 2nd, 2006

From Technorati: Blog posts that contain Luke Welling per day for the last 30 days.
Technorati Chart