Remembering dozens of passwords

You’ll never forget your password ever again

In recent weeks, there have been claims that Dropbox usernames and passwords were leaked online. While Dropbox has denied that any passwords were leaked, its advice was for “users not to reuse passwords across services”. For people who don’t use second-factor authentication or password manager services, this is good advice.

In fact, I’ve moved away from the approach I described previously for choosing a strong password. There is no such thing as a strong password once it’s leaked. Sadly, even well-regarded sites like Evernote and LinkedIn have had their passwords stolen, and no service can be considered immune to hacks.

Previously, I simply remembered passwords relating to different tiers of service: a password for my most secure service, another for secure but less important services, another for services I use regularly but don’t need to be secure, and another for services that I don’t really use. This way I just needed to remember a handful of passwords across many sites. Unfortunately, this method is not proof against hacks.

However, remembering a different password for every site is infeasible for most people (including me!). Still, there is a way to have a large number of different passwords across different sites while needing to remember only two things: a password stub and a password algorithm. When logging in, a user just needs to apply the algorithm to the stub and the name of the service, and out should pop a (relatively) unique password. Different stubs might be used for different accounts, e.g. if the same service is used for both work and personal purposes.

Here’s an example of how this might be used. Take the password stub “pa55word” and the algorithm “insert the second and third letters of the site name starting at the third position”. If this user was logging in to “www.dropbox.com”, the second and third letters would be “ro” and the unique password would be “paro55word”. (Let me just say that this is neither a stub nor an algorithm that I use, and now that it’s documented here, not one that you should use either.)
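The example above can be sketched in a few lines of Python. This is only an illustration of the idea, not a recommended scheme: the function names `site_name` and `derive_password` are made up for this sketch, and the way the site name is pulled out of the host name (drop a leading “www”, take the next label) is an assumption.

```python
def site_name(host: str) -> str:
    """Return the main label of a host name, e.g. 'dropbox' for 'www.dropbox.com'."""
    labels = host.lower().split(".")
    if labels[0] == "www":
        labels = labels[1:]
    return labels[0]


def derive_password(stub: str, host: str) -> str:
    """Insert the 2nd and 3rd letters of the site name at the 3rd position of the stub."""
    name = site_name(host)
    inserted = name[1:3]  # second and third letters
    return stub[:2] + inserted + stub[2:]


print(derive_password("pa55word", "www.dropbox.com"))  # paro55word
```

Note that any two sites sharing the same second and third letters would end up with the same password, which is why the result is only “relatively” unique.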

Since there are potentially 676 (26 x 26) combinations of second and third letters, this algorithm can generate hundreds of passwords without needing to remember more than two things. It’s easier than my previous approach where I needed to remember at least four things.

In choosing a stub, it’s helpful to include the sorts of things that password strength tests look for, e.g. some punctuation, a number and both upper and lower case letters. In choosing an algorithm, you want it to be pretty simple so that it will work for many different site names, so don’t go overboard.

So this will let you follow Dropbox’s advice and avoid reusing passwords. But when (!) a service has its passwords hacked and you need to change your password there, the algorithm will simply produce the same password again. So you probably need to remember a third thing: how many times a given service has been hacked (hopefully there aren’t too many). Then you would have a modification of the algorithm that incorporates this information as well, e.g. the letters inserted for the second iteration of a password on www.dropbox.com would be “rro” instead of “ro”, for the third iteration “rrro”, etc. This does expose the main weakness of the method, in my opinion, so I’m hopeful of coming across a better approach at some point.
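One way to picture the iteration idea, again purely as a sketch: repeat the second letter of the site name once per iteration before appending the third letter. The function name and the `iteration` parameter are invented for this illustration, and the stub is the same throwaway “pa55word”.

```python
def derive_password(stub: str, name: str, iteration: int = 1) -> str:
    """Insert letters from the site name at the 3rd position of the stub.

    The 2nd letter of the name is repeated `iteration` times, followed by
    the 3rd letter: 'ro', then 'rro', then 'rrro' for 'dropbox'.
    """
    inserted = name[1] * iteration + name[2]
    return stub[:2] + inserted + stub[2:]


for n in (1, 2, 3):
    print(n, derive_password("pa55word", "dropbox", n))
# 1 paro55word
# 2 parro55word
# 3 parrro55word
```

The catch is visible in the signature: the iteration count is per-site state that has to be remembered somewhere.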

As I mentioned at the top, second-factor authentication and password manager services are also approaches that can be considered, but have their own downsides. I’m more hopeful that these services will improve in usability and utility over time so that I can make more use of them, before I need to remember the details of too many website hacks.

Password Strength Misguided

When I sign up to a new website, there’s typically a “password strength” indicator on the page where I submit a login name and password. Usually to get a strong password score, I need to have the password be at least six characters long, include both upper and lower case, and often a number or punctuation somewhere in there, too.
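A strength indicator of this kind probably boils down to a handful of character-class checks. The sketch below is a guess at the typical logic, not the code of any particular site; the function name `looks_strong` and the exact thresholds are assumptions taken from the rules just described.

```python
import string


def looks_strong(password: str) -> bool:
    """Naive strength check: length plus character classes, nothing else."""
    long_enough = len(password) >= 6
    has_lower = any(c.islower() for c in password)
    has_upper = any(c.isupper() for c in password)
    has_digit_or_punct = any(
        c.isdigit() or c in string.punctuation for c in password
    )
    return long_enough and has_lower and has_upper and has_digit_or_punct


print(looks_strong("Pa55w0rd"))  # True
print(looks_strong("password"))  # False
```

A check like this only looks at which kinds of characters are present, so a trivial variation on a common word passes easily.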

For passwords that I have used at work, this sort of scoring is used, and in addition, a strong password is considered to be one that hasn’t been used for too long (say, isn’t older than 3 months) and isn’t one that’s been in use before (say, within the last 3 years). This is all “hard-wired” into the password change system so that it is difficult to avoid.

However, it looks like mainstream IT media is now acknowledging that these concepts of password strength are misguided, and lead to passwords that either need to be written down somewhere (because they are too hard to remember) or are trivial manipulations of common words to make them comply with the policies (which make them easy for hackers to discover using computer software). Wired Magazine published an article on 13th January describing this problem and suggesting that finally research is being done to come up with passwords and policies that really are secure.

While normally “easy to use” and “secure” are attributes that necessarily lie at opposite ends of the design spectrum, when it comes to passwords, they aren’t too far apart. An easy password is a memorable password, and a memorable password is more secure because it doesn’t need to be written down (or even kept inside a password manager, such as LastPass or KeePass).

There’s a great comic from xkcd that covers that point. It suggests that simply using four common words strung together is both more memorable for people and harder for computer software to crack than typical complex passwords. The analysis considers how many possible combinations exist that computer software would have to try before striking upon the correct password. Entropy (measured in bits) is the base-2 logarithm of the number of equally likely possibilities, so it is higher when more possible combinations exist.

Using this approach, 26 different possibilities (one for each letter) has 4.7 bits of entropy, and 70 different possibilities (lowercase letters, uppercase letters, numerical digits plus four common punctuation symbols) has 6.1 bits of entropy. A password made up of six characters with each of 70 possibilities has six times 6.1 bits of entropy, for a total of ~37 bits.
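These figures are just base-2 logarithms, and can be checked with a couple of lines of Python. The helper name `bits_per_symbol` is made up for this sketch, and it assumes every symbol is chosen independently and uniformly at random.

```python
import math


def bits_per_symbol(choices: int) -> float:
    """Entropy in bits of one symbol picked uniformly from `choices` options."""
    return math.log2(choices)


print(round(bits_per_symbol(26), 1))     # 4.7
print(round(bits_per_symbol(70), 1))     # 6.1
print(round(6 * bits_per_symbol(70)))    # 37
```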

However, 5,000 different possibilities (one for each of the 5,000 most common words in English) has 12.3 bits of entropy. A password made up of four such words (even if all in lower-case, without any punctuation) has ~49 bits of entropy, which takes over 5,000 times as long for computer software to crack. In fact, just using three such words gets you ~37 bits, for equivalent security.
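The comparison can be reproduced directly. This sketch assumes words and characters are picked uniformly at random, and that cracking time is proportional to the number of combinations to try; the variable names are only for illustration.

```python
import math

word_bits = math.log2(5000)        # one word from a 5,000-word list
four_words = 4 * word_bits
three_words = 3 * word_bits
six_chars = 6 * math.log2(70)      # six characters from a 70-symbol set

print(round(four_words, 1))        # 49.2
print(round(three_words, 1))       # 36.9
print(round(six_chars, 1))         # 36.8

# How many times more combinations four words have than six characters.
ratio = 5000**4 / 70**6
print(round(ratio))                # 5312
```

The ratio of a little over 5,000 is the same as one extra word’s worth of entropy, which is why three words land at about the same strength as the six-character password.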

One problem with this approach is that many password systems have a maximum length, say of 12 characters. It’s not clear that imposing such a short limit increases security, but regardless, many systems do this. Four words strung together are likely to exceed 12 characters, making these passwords impractical on such a system. I wondered if there was some way to retain the spirit of this approach but fit within 12 characters.

I downloaded a list that claimed to be the 5,000 most common words from www.freevocabulary.com (it turned out to have 5,010 unique words) and did some tests on it. If you use the first three letters from words on this list, there are 1,103 different possibilities, which has an entropy of 10.1 bits. Putting four of these three-letter prefixes together would give you an entropy of ~40 bits, which isn’t too bad.
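The test itself is simple to reproduce on any word list. The sketch below uses a tiny made-up sample rather than the downloaded list, so only the method carries over, not the numbers; `prefix_entropy` is a hypothetical helper name.

```python
import math


def prefix_entropy(words, prefix_len=3):
    """Count distinct prefixes and return (count, entropy in bits per prefix)."""
    prefixes = {w[:prefix_len].lower() for w in words}
    return len(prefixes), math.log2(len(prefixes))


# Made-up sample; a real run would load the full 5,000-word list instead.
sample = ["abandon", "abate", "abbey", "abdicate", "candid", "candor", "zeal"]
count, bits = prefix_entropy(sample)
print(count)  # 5

# The figures quoted above, for 1,103 distinct prefixes:
print(round(math.log2(1103), 1))   # 10.1
print(round(4 * math.log2(1103)))  # 40
```

Words that share a prefix (“abandon” and “abate” here) collapse into one possibility, which is why 5,010 words yield only 1,103 prefixes.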

So, while I’m no password security expert, it does appear that you could use a “random four words” approach for most sites, and fall back to just the three-letter prefixes of those words when a site has a maximum password length that’s too short for the normal password. In any case, this suggests that there is fertile ground for research into passwords that are both memorable and secure.
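A rough sketch of that fallback, under some assumptions: the words are drawn uniformly at random (the entropy figures only hold if they are), the word list shown is a made-up stand-in for a real 5,000-word list, and `make_password` and its parameters are hypothetical names.

```python
import secrets


def make_password(words, max_length=None, count=4, prefix_len=3):
    """Join `count` random words; fall back to their prefixes if too long."""
    chosen = [secrets.choice(words) for _ in range(count)]
    full = "".join(chosen)
    if max_length is None or len(full) <= max_length:
        return full
    # Fallback: keep only the first few letters of each chosen word.
    return "".join(w[:prefix_len] for w in chosen)


# Made-up stand-in for a real word list.
words = ["orange", "window", "carpet", "falcon", "meadow", "silver"]

print(make_password(words))                 # four full words
print(make_password(words, max_length=12))  # four three-letter prefixes
```

The sketch does not handle limits shorter than 12 characters, where even four three-letter prefixes would not fit.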

However, I know that even while such passwords are more secure than the typical complex password, unfortunately they still won’t be accepted when I try to register them at new websites. They’ll fail on the password strength indicators! Sadly, this is a case where both ease of use and security are being let down.

The End of Protection

I use and believe in the value of anti-virus software to protect my PC against malware. However, it appears that the full level of protection will soon come to an end, if it hasn’t already.

My home computer of choice is a laptop. It’s not by any means a highly performant, always-on, always-connected server. When I need to use it, I power it up, do what I need to do, then power it down. Mostly I use the web browser – it doesn’t need a whole heap of grunt.

More people are turning to these as their preferred computers. Laptops now outsell desktops. Netbooks are expected to sell like hotcakes.

Unfortunately, the following facts don’t seem to paint a pretty picture for me:

  • During the week, I use the computer for at most a couple of hours per day.
  • The virus scanner takes a couple of hours to run.
  • By default, the scanner does a complete computer scan every day (a practice recommended elsewhere).
  • Over time, I will have more disk to scan (e.g. each year, the same money buys a hard disk about twice the size).
  • Over time, I will have more files to scan (e.g. browser caches will contain more since more objects appear on each web page every year and HTML5 techniques involve storing data locally).

It has gotten to the point where I turn off the computer prior to the virus scan finishing. The virus scan effectively never completes, so at no point can it assure me that the computer is free of malware.
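To put rough numbers on the trend, here is a back-of-the-envelope sketch. It assumes scan time grows in proportion to the amount of data, that the data doubles every year (the upper bound suggested by disk prices), and that the scan starts at two hours; all three are simplifying assumptions, and `scan_hours` is a made-up helper.

```python
def scan_hours(initial_hours: float, years: int, yearly_growth: float = 2.0) -> float:
    """Estimated full-scan time after `years`, if data grows by `yearly_growth` each year."""
    return initial_hours * yearly_growth ** years


daily_use_hours = 2.0  # the most the computer is on per weekday

for year in range(4):
    needed = scan_hours(2.0, year)
    fraction = min(1.0, daily_use_hours / needed)
    print(f"year {year}: scan needs {needed:.0f} h, {fraction:.0%} done per session")
```

Under these assumptions, a scan that just fits into a session today covers only half the disk a year later and a quarter the year after.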

I can see some solutions to this. None of them are ideal.

Firstly, I will have to give up on daily scans. If it never gets to finish anyway, then why should I pay the price for the massive slow-down that I get from constant scanning?

I could also set the browser to delete all files in the cache when I exit (or at least delete them on a regular basis). Most browsers offer this as an option today, though it means pages load more slowly afterwards because nothing is cached.

Finally, I could use a Mac or a Linux PC instead. Since there is less malware for those platforms, scanning should be much faster.
