Following the big dogs on web application security

December 21st, 2007

(This post originally appeared as part of the 2007 PHP Advent calendar)

At this time of year people are apt to get all warm and sentimental … Right up until their first trip to a mall on a Saturday when they go back to hating their fellow man and instituting an “If Amazon don’t sell it, you’re not getting it” policy on gift giving. December is very important to retail, and very important to retail sites.

I remember some good advice I read a long time ago. Vincent Flanders & Michael Willis in Web Pages That Suck suggested you “follow the big dogs”, in other words copy Amazon. Their reasoning was sound. You will likely get it wrong on your first try, you can’t afford to run usability studies of your own, and don’t want to spend months and numerous iterations getting it right. Learning from other people’s mistakes is always less embarrassing than learning from your own.

I have had to paraphrase here, because I opted to recycle nearly all my old books rather than ship them half way around the world. Had I wanted to check the accuracy of my quote, it would have cost me one cent to buy a second hand copy of that book.

While the long term relevance of most of the advice in old computer books is fairly accurately reflected by that valuation, it was good advice in 1998. If you were embarking on an ecommerce venture at a time when there was a shortage of people who knew what they were doing, best practice conventions were not settled and innovation was rapid, there were worse philosophies you could have than “What Would Amazon Do?”

The same idea is popular today, and for the same reason. There is always a shortage of people who really know what they are doing, so there are plenty of people making decisions by asking “What Would Google/Amazon/Microsoft/eBay/PayPal/Flickr/Yahoo/YouTube/Digg/Facebook Do?” If you are in a space where nobody really knows the best way yet, copying the segment leader is a low risk, low talent shortcut to making mainly good decisions, even if it does mean you are always three months behind.

The idea does not apply well to web application security. There are two main reasons for this: first, the big dogs make plenty of mistakes, and second, good security is invisible.

You might notice mistakes, you might read about exploited vulnerabilities and you might notice PR based attempts at the illusion of security, but you probably don’t notice things quietly being done well.

Common big dog mistakes include:

  • Inviting people to click links in email messages.
    You would think that, as one of the most popular phishing targets out there, PayPal would not want to encourage people to click links in emails. Yet, if you sign up for a PayPal account, the confirmation screen requests that you do exactly that.

    Paypal Confirmation Screen

  • Stupid validation rules.
    We all want ways to reject bad data, but it is usually not easy to define hard and fast rules to recognize it, even for data with specific formatting. Everybody wants a simple regex to check that email addresses are well formed. Unfortunately, to permit any email that would be valid according to RFC 2822, a simple one is not going to cut it. Which means that many, many people add validation that is broken and rejects some real addresses. Most are not as stupid as the one AOL used to have for signing up for AIM, which insisted that all email addresses ended in .com, .net, .org, .edu or .mil, but many will reject + and other valid non-alphanumeric characters in the local part of an address (the bit before the @).
  • Stupid censorship systems
    Simple keyword based censorship always annoys people. Eventually, somebody named Woodcock is going to turn up.
    Xbox Live is infamous for rejecting gamertags and mottos after validating them against an extensive list of “inappropriate” words. Going far beyond comedian George Carlin’s notorious Seven Dirty Words, there is a list of about 2700 words that are supposedly banned. By the time you add your regular seven, all possible misspellings thereof, most known euphemisms for body parts, racial epithets, drug related terms, Microsoft brand names, Microsoft competitors’ brand names, terms that sound official and start heading off into foreign languages, you end up catching a lot of innocent phrases.
  • Broken HTML filtering.
    Stripping all HTML from user submitted content and safely displaying the result is often done badly, but is not that difficult. On the other hand, allowing some HTML formatting as user input, but disallowing “dangerous” parts is not an easy problem, especially if you are trying to foster an ecosystem of third party developers.

    The MySpace Samy worm worked not because MySpace failed to filter input, but because of a series of minor cracks that, combined, allowed arbitrary JavaScript. Once you choose to allow CSS so that users can add what passes for style on MySpace, it becomes very hard to limit people to only visual effects.

    eBay has had less well known problems with a similar cause, but without a dramatic replicating worm implementation. Earlier this year scammers were placing large transparent divs over their listings so that any click on the page triggered a mailto or loaded a page of their own. I could not see examples today, so I assume they have fixed the specific vector, but giving users a great deal of freedom to format content that they upload makes ensuring that content is safe for others to view very difficult.

  • Stupidly long URLs.
    The big dogs love long, complicated URLs.

          https://chat.bankofamerica.com/hc/LPBofA2/?visitor=&mse
          ssionkey=&cmd=file&file=chatFrame&site=LPBofA2&channel=
          web&d=1185830684250&referrer=%28engage%29%20https%3A//s
          itekey.bankofamerica.com/sas/signon.do%3F%26detect%3D3&
          sessionkey=H6678674785673531985-3590509392420069059K351
          97612

    Having let people get used to that sort of garbage from sites that they should be able to trust, you can’t really be surprised that normal people can’t tell the difference between an XSS attack hidden in URL encoded JavaScript and a real, valid, safe URI. Even abnormal people who can decode a few common URL encodings in their heads are not really scrolling across the hidden nine tenths of the address bar to look at that lot.

  • Looking for simple solutions
    Security is not one simple problem, or even a set of simple problems, so looking for simple solutions such as the proposed .bank TLD is rarely helpful. This is not helped by the vendor-customer nature of much of the computer industry. The idea that you can write a check to somebody and a problem goes away is very compelling - buy a more expensive domain name, or a more expensive Extended Validation Certificate, or run an automated software scan to meet PCI compliance and you might sleep more soundly at night, but many users already don’t understand the URL and other clues that their browser provides them. Giving more subtle clues to them is unlikely to help. Displaying a GIF in the corner of your web page bragging about your safety might create the illusion of security and might well help sales, but it won’t actually help safety on its own.
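The email validation point in the list above can be sketched in a few lines of PHP. The pattern below is a made-up example of the too-strict kind, not one taken from any real site, and the addresses are invented. PHP's built-in `filter_var()` with `FILTER_VALIDATE_EMAIL` is shown for comparison; it is not a complete RFC 2822 parser either, but it does accept a `+` in the local part.

```php
<?php
// Hypothetical "simple" pattern: it only allows letters, digits, dots,
// underscores and hyphens before the @, so it rejects valid characters like +.
const NAIVE_EMAIL_PATTERN = '/^[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$/';

// Returns true when the address matches the naive pattern.
function naiveIsValidEmail(string $email): bool
{
    return preg_match(NAIVE_EMAIL_PATTERN, $email) === 1;
}

// Returns true when PHP's built-in email filter accepts the address.
function builtinIsValidEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Invented sample addresses on reserved example domains.
$samples = ['user+tag@example.com', 'plain@example.net'];

foreach ($samples as $email) {
    printf(
        "%-22s naive=%s builtin=%s\n",
        $email,
        var_export(naiveIsValidEmail($email), true),
        var_export(builtinIsValidEmail($email), true)
    );
}
```

Neither check proves that an address exists or belongs to the person typing it. Sending a confirmation message is still the only real test, which is one reason a loose check tends to do less harm than a strict one.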

You can’t follow the public example of the big dogs. They still make some dumb decisions, they still make the small mistakes that allow the CSRF and XSS exploits that are endemic and they are often not very responsive to disclosures. If a major site makes 99 good security decisions and one bad one, you won’t notice the 99. Unfortunately with security you are still far better off seeing how others have been exploited and critically evaluating what they say they should be doing, rather than trying to watch what they actually are doing.

Oh, and remember to stay away from malls on weekends in December.

On Open Source: PHP Video Podcast

October 20th, 2007

At OSCON this year Laura and I recorded a video podcast for Informit. This is part of Informit’s podcast series On Open Source.

Maybe I should have waited until I have had time to watch all of it and see if I want to encourage people to watch it, but here it is in two parts anyway.
Part 1 and Part 2

We talk about books, Laura rails against frameworks, I talk about security, we talk about how we got into PHP, and I probably compare Java to something unpleasant.

Short, Clean URIs Are More Secure

July 31st, 2007

There are lots of reasons to use clean, short, readable URIs. Search engines like them. People have some hope of dictating or typing them correctly. Email clients are less likely to mung or truncate them. They give people navigational cues and an extra way to navigate a website. You can even fit them on one billboard (unlike say this one).

One generally ignored advantage is security.

Many phishing, XSS, CSRF and other URI exploits rely at least in part on putting stuff the user does not understand in the URI.

Here are a few real URIs from popular websites all found inside a minute within 3 clicks of the home page:

Having let people get used to that sort of garbage from sites that they should be able to trust, you can’t really be surprised that normal people can’t tell the difference between an XSS attack hidden in URL encoded JavaScript and a real, valid, safe URI. Even abnormal people who can decode a few common URL encodings in their heads are not really scrolling across the hidden nine tenths of the address bar to look at that lot.
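As a rough illustration of how little the encoded form gives away, here is a minimal PHP sketch that decodes the query string of a link. The link is invented for the example: the host, the parameter names and the payload are all assumptions, not taken from any real site.

```php
<?php
// Hypothetical attack link. Host, path and parameter names are made up.
$link = 'https://www.example.com/search?site=LP2&channel=web'
      . '&q=%3Cscript%3Ealert%28document.cookie%29%3C%2Fscript%3E';

// Returns the URL-decoded query parameters of $url as name => value pairs,
// or an empty array when the URL has no query string.
function decodedQueryParams(string $url): array
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return [];
    }

    // parse_str() URL-decodes both names and values.
    parse_str($query, $params);

    return $params;
}

// Print each parameter as plain text (run this from the command line).
foreach (decodedQueryParams($link) as $name => $value) {
    echo $name, ' = ', $value, "\n";
}
```

Run from the command line, this prints what the `q` parameter really contains. If you adapt it to show results in a web page, escape the values with `htmlspecialchars()` first, or the demonstration becomes the exploit.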

It won’t help everybody. There are always going to be people who are happy to believe that their bank sends them email from a free address like bank.of.amerika@hotmail.com, and sufficiently sophisticated social engineering is always going to work on some people, some of the time, but the sites that are particularly popular with phishing attacks are making it unnecessarily easy.

If commonly used sites had short, sensible URIs it would not take genius on the part of slightly cynical users to notice that every real bank URI they had seen in the past looked something like https://www.bankofamerica.com/myaccount/login so the 300 character monstrosity full of percent symbols and ampersands that they were being presented with is a little on the fishy side.

Now, go and tidy your room.

OSCON 2007 Talk: Striving for Less Ugly Charts and Graphs From PHP

July 27th, 2007

Here are the slides for my talk today.

Striving for Less Ugly Charts and Graphs From PHP

My Proudest Achievement: A Downloadable Certificate from eBay

July 26th, 2007

eBay amuses me. They sent me a message the other day telling me that in recognition of my sterling efforts in buying other people’s junk and sometimes selling my own junk I could download a certificate. The message said “We’re cheering you on every day” and “We hope you’ll download your Turquoise Star Certificate and display it proudly.”

Presumably, there must be people out there who feel special when they get a form letter or the geniuses that populate big company marketing departments would not send them out all the time, right?

Here’s their message:

Congratulations! You’ve achieved a feedback rating of 100! With a Turquoise Star beside your user name, you are an active and well-established member of the eBay community.

We want to thank you for helping make eBay, The World’s Online Marketplace™, a safe and vibrant place to trade. Your success is our success. We’re cheering you on every day.

We hope you’ll download your Turquoise Star Certificate and display it proudly. You’ve certainly earned it! (You will need Adobe Acrobat Reader. If you don’t have it, get it here.)

Again, congratulations on your success, and keep shooting for the stars!

Meg Whitman
President and CEO, eBay Inc.

Here’s my reply:

Dear eBay,

I recently got a message in my eBay messages signed “Meg Whitman President and CEO, eBay Inc.” congratulating me on getting a feedback rating of 100 and being given a turquoise star.

It said “We hope you’ll download your Turquoise Star Certificate and display it proudly.” Naturally, I was very pleased to see this. After all, it is not every day that the CEO of a major internet company personally sends me a message, and not every day I get a certificate to proudly display behind my desk.

Naturally, the first thing I did was bid on a certificate frame in an eBay auction so I would have somewhere to display it proudly as instructed. (Item number 200119977791)

However, after admiring it on my wall for a while I started having nagging doubts. I realised that the message from Meg (I hope she does not mind me calling her Meg, after all, she is sending me messages) does not include my name. It probably was not personally sent by her at all.

Worse yet, my certificate does not have my name on it either. If one of my coworkers steals it, they could easily pretend that they were awarded a Turquoise Star Achievement Award rather than me. Surely eBay has access to the kind of advanced technology required to insert a custom name into a PDF file?

My state of mind only went downhill from there. I realized that anybody can go to http://pages.ebay.com/awards/StarAwardTurquoise.pdf (the URL Meg kindly sent me) and print out a Turquoise Star Achievement Award of their own. The high esteem that my coworkers were holding me in because of my Turquoise Star Achievement Award could be diluted at any moment by somebody else printing an award they did not earn.

The final slap in the face was when I realized that just by guessing file names, I could download better awards.
http://pages.ebay.com/awards/StarAwardPurple.pdf
http://pages.ebay.com/awards/StarAwardGreen.pdf

How am I supposed to take pride in my award when I know that anybody else could simply print out a better one? My coworkers’ respect and admiration for me could evaporate instantly when somebody else figures out these URLs and prints a better Achievement Award than mine.

Do you think Meg would be happy if her MBA from Harvard Business School was suddenly rendered valueless by a link allowing anybody to print out a DBA from Harvard’s web site?

The seller of the certificate frame does not specify a return policy, so I don’t know if they will accept disillusionment with the award contained in the frame as a valid reason for a refund.

Luke Welling
Turquoise Star Achievement Award holder

Of course, eBay being eBay it is hard to tell if my message went to a person or to a very small script. I did get a reply. They promised to investigate whether the email really came from eBay or whether it was a phishing message.

And, of course USPS being USPS, the frame I ordered on eBay was smashed before it reached me.

Glory is such a fleeting thing.

Self Esteem and O’Reilly Animals

July 26th, 2007

Listening to James Reinders talk about Intel Open Sourcing their Threading Building Blocks got me thinking about O’Reilly animals.

James seemed kind of underwhelmed at being assigned a canary.

Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism

To be honest, I can see why. As mascots go, canaries are not an A-list animal. If half the other mascots would eat yours, and the other half could accidentally step on it and kill it, then you have not been well served.

Sure, there are only so many A-list animals to go around. It is not so surprising that the lions, tigers and elephants are already taken, but B-list can be fine too. Perl has adopted the camel with an enthusiasm far beyond what camels are used to. Hugh and Dave got a good one for their PHP and MySQL book. The platypus is a great animal for PHP. Sure, it looks like it was put together out of parts of other animals, but it is reasonably attractive, and has the kind of street cred you get from being poisonous.

But really, a canary? A scallop? A sand dollar? A moth? A beetle? It is hard to find glamour or prestige in mollusks and other invertebrates that spend their short lives munching on decomposing waste.

I wonder if many of the people who get an invertebrate or a puny vertebrate ever write a second book for the same publisher, or if they quietly slink away and hide their book inside a Harry Potter dust jacket.

OSCON 2007 Tutorial: PHP and MySQL Best Practices

July 25th, 2007

Here are the slides for our talk today.

best_practices.pdf

If this site is slow, you can try http://www.laurathomson.com

Is Computer Science Dead?

March 13th, 2007

Mainstream media are still keen to swallow the line that “real soon now” computer specialists will be redundant because fourth generation languages are so clever that clever people are not needed any more.

This fatuous pap by Neil McBride from De Montfort University (Rated by the Guardian’s University Guide as the 83rd best University in all of England) gives them the sound bites they need.

“Now vastly complex applications for businesses, for science and for leisure can be developed using sophisticated high-level tools and components.” he prattles. “Computer science curricula are old, stale and increasing irrelevant.”

Towards the end of his article it all becomes clear. “Here at De Montfort I run an ICT degree, which does not assume that programming is an essential skill. The degree focuses on delivering IT services in organisations, on taking a holistic view of computing in organisations, and on holistic thinking.”

I have never grasped the point of that kind of course. So you cater to people who want an IT career, but don’t have the core skills of the discipline? Why on earth do these people want to work in IT? Is there not some occupation they could find where they might be capable of grasping the essential skills?

He loves the car/software analogy. “Like cars, a limited number of people are interested in their construction, more live by supporting and maintaining them; most of us accept them as a black box, whose workings are of no interest but which confer status, freedom and convenience.”
Sure, the car industry needs many, many black box buyers, a moderate number of mechanics, a few engineers and designers, and very few theoretical purists. All industries, including computing, do.

How many fresh graduates do you think the automotive industry needs who take “a holistic view of” cars, but think understanding how an engine works is not “an essential skill”? Not very many I’ll wager.

The death of computer science is not just a fairy tale, it is also an enduring fairy tale. I am in the process of moving house, and cracked open an old book on its way to the bin. Understanding Computer Science Advanced Concepts by Ray Bradley, Hutchinson Education, 1987 was a high school text book. He refers to the then current computers (late 1980s) as the fourth generation of computers. I don’t think that terminology has endured.

Under a heading “The Future” he writes “The development of the fifth generation machines promises to be the most significant yet. This is because of a fundamental re-think in the basic design of the machine. For example it should be possible to communicate with the machine in a natural language such as English. […] It should be possible for users to define their problems to the machine and for the machine to then develop the programs to solve them.”

That is not exactly how I recall computing in the 1990s panning out.

The death of computer science was a fairy tale in 1987, and 20 years later it is still a fairy tale. More powerful computers are not replacing programmers any more than calculators are replacing accountants or power tools are replacing carpenters.

What is considered a hard problem in computing changes over time but each era still has its hard problems that need smart people with a deep understanding of the fundamentals to solve.

Neil McBride

I ♥ register_globals

March 13th, 2007

I am aware that there are some things so shocking that you are not supposed to say them in polite company. “Hitler had some good ideas”, “Tori Spelling is really pretty” or “I think I look really good in a beret” are all ideas so confronting that they are best kept to yourself regardless of how strongly you believe them.

I have a similarly shocking sentiment that I feel I have to share.

I really like register_globals in PHP.

There, I’ve said it. I can go away and order my I ♥ register_globals shirt now.

I (heart) register_globals

Sure, choosing to mingle untrusted user data and internal variables is a bad idea. Sure, if you are too lazy to initialise important variables with a starting value it gives you one extra way to shoot yourself in the foot. Sure, polluting global scope with form variables is going to be a mess in a larger app.

There remains something to be said for simple, elegant, readable ways to shoot yourself in the foot. PHP, like any reasonably complete programming language provides a whole host of other ways, so removing one is not particularly useful.

I used to teach PHP to beginners as a first programming language. I have introduced a few thousand complete novices to programming via PHP.

With register_globals on, this example is a short step from the “Hello World!” example:

<?php
if($name)
{
 echo "Hello $name";
}
else
{
 echo
  '<form>
   Enter your name: <input type="text" name="name">
   <input type="submit">
  </form>';
}
?>

It flows nicely from a “Hello World!” example. It can introduce variables and control structures if you did not provide an even softer introduction to them. It can be turned into an example with a practical use without making the code more complex.

This version may not look very different to you:

<?php
if($_REQUEST['name'])
{
 echo "Hello {$_REQUEST['name']}";
}
else
{
 echo
  '<form>
   Enter your name: <input type="text" name="name">
   <input type="submit">
  </form>';
}
?>

To an experienced eye, the two versions are almost identical. The second requires a little more typing, but nothing to get excited over.

To a complete beginner though, the second is a couple of large leaps away from the first. To understand the second version, somebody has to understand arrays, and PHP string interpolation. Both of these are important topics that they will have to come to in their first few hours of programming, but without register_globals, they stand in the way of even the most trivial dynamic examples.

I miss being able to assume register_globals as default behaviour. It made the initial learning curve far less steep. It made little examples cleaner and more readable. Like most safety measures, it does not really protect people who are determined to get themselves into trouble anyway. People who don’t understand the reasons behind it just run extract() or some code of their own to pull incoming variables out anyway. The user submitted comments in the manual used to be full of sample code for doing exactly that.
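The `extract()` workaround is easy to sketch, and it shows why switching the setting off does not save anybody who works around it. The function names, the `isAdmin` key and the password below are all hypothetical, chosen only to illustrate the uninitialised variable problem. The request is passed in as an array so that the example runs without a web server.

```php
<?php
// Hypothetical check that rebuilds register_globals by hand with extract().
// $isAdmin is only assigned when the password matches, so for everybody
// else it is simply never set, unless the request supplies it.
function isAdminRequest(array $request): bool
{
    // Copy every request key into a local variable of the same name.
    extract($request);

    if (isset($password) && $password === 'correct horse') {
        $isAdmin = true;
    }

    return !empty($isAdmin);
}

// Same check, reading only the key it needs and starting from a known value.
function isAdminRequestSafer(array $request): bool
{
    $isAdmin = false;
    $password = $request['password'] ?? null;

    if ($password === 'correct horse') {
        $isAdmin = true;
    }

    return $isAdmin;
}

// A visitor who guesses the variable name gets in without the password.
$forged = ['password' => 'wrong', 'isAdmin' => '1'];
var_dump(isAdminRequest($forged));
var_dump(isAdminRequestSafer($forged));
```

In a real page the array would be `$_REQUEST` or `$_POST`. The second function is not clever. It just gives the variable a starting value and ignores request keys it did not ask for, which is the habit that turning the setting off was meant to encourage.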

Oh, but just a side note to all beret wearing white supremacist Tori Spelling fans, just because I am willing to speak up for one unpopular cause does not mean I am interested in yours. Sorry.

Melbourne PHP Users’ Group - March 8th

March 7th, 2007

On Thursday, I will be speaking at PHP Melbourne. My talk is titled PHP Considered Harmful. In case you are wondering though, it does not mean I have had a falling out with PHP. I have spent 10 years talking about what’s great in PHP and I need to vent occasionally. Come along if you are nearby. If not, and I am not strung up by an angry mob, I might redo the talk in another hemisphere later in the year.

The other speaker is Chris Burgess on Building Secure Web Applications.

His blurb:

This presentation expands on a presentation given at the Open Source Developers’ Conference in December 2006 titled “Web Application Security - Tools, Techniques, Tips and Tricks”. I will explore some of the original material for those who were unable to attend, taking a look at the plethora of Open Source tools that can greatly assist developers and testers of web applications. In addition to this, I will discuss techniques that can be used to harden web applications.