Friday, 5 July 2013

Designed to Confuse

Font of All Knowledge

It's a play on words: I don't mean "fount", and it's not about wisdom, rather the lack of it. What I want to talk about is confusion within computer fonts... yeah, I've found another problem that nobody else even considers to be one. Generally, computers have brought beautiful typography to the masses. Just try drawing letters on a poster without one, or using a typewriter to bash out a page of text, centering your titles and not making any mistakes. We have oodles of different font styles that we can use and (often) abuse, but strangely we tend to just stick to one or two.

The Arial font displayed in Windows Character Map.
Let's look for a moment at the characters in Arial; they look clean and familiar, don't they? It's a hugely useful font, designed in 1982 as a generic and bland version of a sans serif font. (Yeah, somebody thought standard sans serif was too flashy!) Microsoft started shipping TrueType versions of it in Windows 3.1 back in 1992, and it first appeared on the Mac when Apple released OS X in 2001.

Sans serif means it doesn't have the small projecting features (called serifs) at the end of the strokes. Times New Roman is probably the best-known example of a font with serifs.
Serifs shown here in red.
Undoubtedly the serifs look grand, and they help differentiate some of the similar characters, but they have the disadvantage of being slightly less easy to read. They tend not to be used in large bodies of text, so fonts like Arial have become very popular, almost the de facto standard, because mostly people want clear, readable text. But here's the rub: without the serifs, some of the letters look the same.

So What's the Problem?

It all came to a head recently when I was sent a new server password at work that had been created to fall in line with a rigid password policy. The policy states (something like): don't use real words, and include a mix of capitals, numbers and special characters. That's nothing unreasonable until you see which characters had been selected to fit in with this policy.

The password was based on a meaningful term, in this instance the system name and a company abbreviation. (We outsource our IT hardware support to HP.) The system was called TAMS and the company abbreviation is PLS, so the HP system administrator had thoughtlessly set the password to "T@msPl5".

OK, if you didn't know (and I didn't) what the password was based on, then you might not be sure which letter came after the 'P': is it a lower case L or a capital I? "Big deal, just try again," you might say, but what if your policy is set to lock the account after three failed attempts? Non-word passwords are difficult to type at the best of times.
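If you wanted to catch a password like this before sending it out, a rough check might look like the sketch below. The look-alike groups are my own assumption, pulled from the characters discussed in this post rather than from any standard list, and `find_confusables` is just a made-up helper name.

```python
# Groups of characters that are easily mistaken for one another.
# These groupings are an assumption for illustration, not a standard.
CONFUSABLE_GROUPS = ["Il1", "O0Q", "Z2", "UV"]


def find_confusables(text):
    """Return (position, character, look-alikes) for each risky character."""
    hits = []
    for pos, ch in enumerate(text):
        for group in CONFUSABLE_GROUPS:
            if ch in group:
                # Report the other members of the group as the look-alikes.
                hits.append((pos, ch, group.replace(ch, "")))
                break
    return hits


if __name__ == "__main__":
    for pos, ch, lookalikes in find_confusables("T@msPl5"):
        print(f"position {pos}: {ch!r} could be read as any of {lookalikes!r}")
```

Run against "T@msPl5" it flags only the lower case L after the 'P', which is exactly the character that caused the trouble.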

I've had similar issues in the past when fellow developers had been slightly foolish in the selection of variables or labels in their code. One guy habitually used 'i' (for iteration) within his function loops, which was OK until he switched to using capital letters and his 'I' became hard to tell from an 'l' or a '1'.

Still not 100% sure what I'm on about? Well, certain letters and numbers in fonts are either difficult or impossible to distinguish from one another. Look at the table showing a comparison of some well-used fonts.

Problem characters displayed using common fonts.

All of a sudden our Arial doesn't look so good. The problem is mainly restricted to the fonts without serifs, and normally it doesn't matter because context helps us out. Where it does matter is when you need to identify a character exactly and the context is missing. Examples of this might be:
  • Passwords - where a mixed character and uppercase policy is involved.
  • Computer code or script - where single letter variable names are used.
  • Serial Numbers - especially where the font size is tiny.
It seems that the problem is rooted in the past. Back in the days of moveable type you could reduce complexity by reusing the lower case L character for a 1. Indeed this practice was carried through to early typewriters, which would typically exclude the number 1 from their keyboards. And fonts are created by designers, who care more about how something looks than whether it might cause the odd confusion between similar letters.

Avoiding The Problem

In years gone by, when car registration plates in the UK used a single letter as a year suffix, certain letters were avoided. The letters I, O, Q, U and Z were never issued as year identifiers.
  • I because of its similarity to the numeral 1
  • O because of its identical appearance to a zero
  • Q because of its similarity to the letter O and to a zero
  • U because of similarity to the letter V
  • Z because of similarity to the numeral 2

So back in the years 1963 to 1983 they were aware of the problem and took proactive steps to reduce confusion and to assist with correct identification on registration plates. So it's not rocket science; the boys at the ministry have been doing it for years.

Useful Methods

Not all solutions to problems have to be technical in nature. The common way we cope with such problems is to use comparison:
  • Are there similar letters you can compare against, or similar examples that help identify the character you're unsure of?
  • Are there any coding standards that prohibit certain characters?

Also you could consider the following:
  • If you generate any codes, simply avoid any characters that might confuse.
  • Start using a font like Trebuchet MS, which retains character differentiation without resorting to serifs.
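For the first of those suggestions, here is a minimal sketch of generating codes from a "safe" alphabet. The set of characters to leave out is my own assumption, borrowed from the number plate list above plus the digits they get confused with, and `generate_code` is a hypothetical helper, not part of any library.

```python
import secrets
import string

# Characters left out because they are easily misread.
# This exclusion list is an assumption for illustration, not a standard.
AMBIGUOUS = set("Il1O0QZ2UV")

# Every ASCII letter and digit that is not in the ambiguous set.
SAFE_ALPHABET = [
    c for c in string.ascii_letters + string.digits if c not in AMBIGUOUS
]


def generate_code(length=10):
    """Build a random code using only characters from the safe alphabet."""
    return "".join(secrets.choice(SAFE_ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(generate_code(12))
```

Leaving out ten characters still leaves 52 to choose from, so you lose very little by avoiding the troublemakers.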

A Dim Outlook

I started looking into altering the font in my company email, but it only affects the outgoing emails that you create. Incoming emails are sent with the font pre-selected in what Microsoft calls the stationery, so it's not possible to change them. I found a tantalising stationery override tick-box in the setup page, but that disappointed me by doing nothing.

So I'm still stuck with copying and pasting before I can alter the font. Windows used to be all about control and configuration, so what went wrong? I can't switch to another email client, it has to be Outlook, yet it seems rigid in its ability...

Unless you know the answer!


  1. I don't know the answer, but now I'm going to hack into your work server, thanks!

    I sometimes write a zero with a slash as our early computers used to do, makes a lot of sense to me and I'd like to see that across the board. But as your image shows, O's are usually round and zeros thinner so I don't think there's much of a problem there. There doesn't seem to be an easy answer with the l and I problem if you're using a sans serif font.

  2. A-hah... you fell into my trap, but even if you did know which of the hundreds of servers it was, and could get past the firewalls, I didn't use the actual password. It didn't seem the right thing to do!

    I also slash my zeros, my zeds (or zees if you're American and pronounce it incorrectly) and my sevens for similar reasons. And I add the top serif when I write a one, unless I have a lot of them to do (... and then the slight OCD in me forces me to go back and add them).

    See, I told you it wasn't just on computers.