technology – Mark Shead

For the past 13 years my primary office has been in my home. During this time, I’ve experimented with a lot of different things from simple small desks in Mexico to large glass topped workspaces. Here is what my current desk looks like.

The first thing people usually notice is the weird lights. I work on video conference pretty much 8 hours a day right now. The lights help make sure that the video conferencing is as effective as possible. The studio lights were about $150 and give nice even lighting for the video as well as for the work surface. Each stand has two 80 watt CFL bulbs which are supposed to be equivalent to about 300W in a traditional light.

The monitor on the right is hooked to a small Windows 8 computer that is running the video conference software. You’ll have to click on the image to see it better, but there is a small Cisco video camera on the monitor.

The computer on the left hand side is running Linux and the laptop hooked to the center screen is my MacBook Air. If you notice, there is only a single keyboard. I use a program called Synergy to share between the screens. My MacBook Air is the server and lets me move the mouse and keyboard to the screens on either side. This may sound like a small thing, but it is a huge improvement to be able to treat multiple computers as one from your input device.

The desk is a Turnstone Bivi. My favorite thing about the desk is that it has lots of places to hide wires. The top half of the work surface is a tray that I use to put my power supplies and cables and it gets many of the wires out of the way. Also the desk has 16 outlets built into it–and I’m using all of them. I have the Bivi double so there are actually two work surfaces facing each other. Last summer when I had an assistant helping me she used the opposite desk, but now I use it for my printer or if I need some additional room to spread things out.

This picture is looking from my desk backwards. The desk to the side is my old glass top desk and there is another one on the other side as well. I’ll sometimes use these for organizing paper work but often they start collecting junk so I may replace them with some shelves or storage units.

Some time ago I made a comment on Lifehacker.com. In order to comment I had to create a login. I used an email account that I mainly use for junk email and a simple password. In the past week, some hackers got into Gawker’s servers and downloaded among other things the user database.

When you store a password in a database, you are supposed to hash it. In theory this turns your password into a random string so no one who has access to the database, can actually see the password. To verify your login, the server takes the password you give it and runs it through the same hash software. If it matches what is stored in the database then you are given access.

The hash function is one way. You can easily create the hashed value from the password, but it is difficult or impossible to create the password from the hashed value. However, if you have a list of the hashes from a bunch of passwords, it is easy to search for a hash and then find its corresponding password.

To prove how great they were, the hackers did this to nearly 200,000 accounts and posted it online. These accounts were listed with their username, password and email address. Mine was one of them. Fortunately, the password is something I haven’t used for several years and I haven’t used it for anything important for about a decade. Still it is a bit unnerving to see my password out there for everyone to read.

I traced down a few places where the password was still being used–on a bookmark site and another comment account and changed those. Fortunately, I have been using 1Password for a few years so it was easy to do a search and find the accounts that were still using this password.

More recently I have been using 1Password to create a random password for each site I visit. This is ideal because a long random password is going to be much more difficult to search for unless they have an enumerated list of every possible combination of keyboard characters of any length. Further, even if someone was able to find the random password that corresponds to the hash, it would only give them access to the account on one website–which they probably had to have access to in order to get to the database in the first place.

Salting Passwords

One thing Gawker/Lifehacker should have done is “salted” the passwords. This would simply be adding some characters to the beginning or end of each password before running it through the hash function. When you go to login again, the server simply takes the password you give and adds the same characters to it before running the hash on it again. This makes the hashes much more difficult to find because it helps make sure that they aren’t common words. If the hackers don’t know what the salt is or how it is applied, this should make it pretty much impossible to figure out the passwords.

However, in this case, the hackers got access to the Gawker source code as well, so they would have known exactly how any salt was applied before making the hash. The hackers could go through and create hashes for a bunch of common words with the salt characters added to it, but that means they can’t rely on existing databases of hash to password mappings–they would have to calculate everything from scratch. Still this is possible and not unreasonable–particularly because they would only need to run through a bunch of words once in order to see if any user had used that as a password.

So if you have 1,000,000 users, running such a process against 1,000,000 common passwords will probably give you access to a number of the accounts. That would require 1,000,000 hash operations using the dictionary file. I did a test and found that 1,000,000 MD5 hashes takes about 25 minutes on my i7 MacBook Pro. This is just using the command line and a shell script. I’m guessing that it could be done much faster using a different method, running it on a bigger machine or dividing it up among different computers.

Things can be made much more secure by setting things up so the password to hash mappings can’t be used to search the whole dataset. You want the hackers to need to recalculate the hashes for the entire dictionary file for each user. This can be done if each password in hashed using a different salt value for each user’s password.

Different Salt Values

This could be as simple as merging the username and passwords together and then creating the password hash from that. This means that in order to compromise the same number of accounts as could be done with the same salt value would now require 1,000,000 X 1,000,000 hash operations. This would mean significantly more time and/or more machines to run it against all of the passwords. It would take my computer somewhere in the 50 year range to do this. Still if you got a cluster of 100 computers that are all 10x faster than my laptop, it could be done in around 400 hours. So it is definitely doable, but it is starting to get much more expensive. Amazon’s high end EC2 instances cost $0.50 to $2.10 per hour so we are looking at costs in the $20,000 or higher range.

Still this isn’t prohibitive–particularly if the hackers are willing to use a smaller dictionary of passwords. With at 100,000 dictionary of potential passwords the cost on Amazon falls into the $2,000 or higher range.

Hashing the Username or Email Address

Another idea would be to hash the username or email address. Usernames are going to be a bit more difficult to attack using a dictionary because they tend to be more unique. In fact, if you were to combine the username and the password into the same hash it would be significantly more difficult because you couldn’t try to break the username or password by itself. So something like:

hash(username + password)

Of course this would only work if you didn’t need the username anywhere else. For example, you couldn’t use it to show the name of the person who left a comment. This might work out ok if your application uses a username to login and a “screen name” to show to other users.

The problem with this is that you couldn’t look up a user without having their correct username and password. This could be problematic for doing any type of password recovery. You might be able to work around this by using a third field that can uniquely identify the user–like an email address.

If your system doesn’t need to send emails, hashing the email address might be a better solution because these tend to be much longer and are automatically unique. You could still send emails for password recovery because when given an email address you could find the row with a matching hash value.

If the average email address is 20 characters and can contain A-Z, @ and a period, then you’d be looking at 20^28 different combinations for a dictionary attack. Actually the numbers would be significantly reduced if you took into consideration the fact that most email addresses end in .com, there is only one @ and other common patterns. This is partially offset by the fact that some email addresses are longer than 20 characters and the fact that many people have username portions of their email that are more complicated than their passwords.

Something like the hash shown below would be impossible to break on any large scale, but wouldn’t allow traditional password recovery using email.

hash(email address + password)

If someone forgot their password, it would be impossible to lookup their account via email using this approach. Still, in some situations that might be acceptable–particularly if an alternative form of password recovery could be done either in person, SMS or some other method.

Category: technology

Mark Shead’s Home Office

Tapestry 5.2.4 Out

Keep From Giving Away Passwords Like Gawker

Salting Passwords

Different Salt Values

Hashing the Username or Email Address