15 November, 2012

Building applications that require users to log in

Just posted this as a comment on a LinkedIn thread:

Let me try to clear things up a bit by giving an example:

I can make an application that requires users to log in. I will use hashing for checking passwords, that is, I will only store the last character of whatever password the user gave me during registration and then compare the last character of the password entered at login to the stored "hash". So, if Alice chooses the password "love" and Bob chooses "hate", I will store the same "hash value" - "e" - for both users. This is called a hash collision and may be the weak spot of this design - all Eve needs to do to log in as either Alice or Bob is to try at most 256 one-character passwords (assuming a single-byte charset here for simplicity). That is, Eve can find collisions with a complexity of 2^8.
Mind you, finding a hash collision is not the same as decrypting an encrypted password - Eve still doesn't know the actual passwords, so she cannot log in as Alice or Bob to other applications that use other hash functions, even if Alice and Bob used the same passwords there.
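Here is a minimal Python sketch of the toy scheme described above. The names toy_hash and brute_force are made up for illustration; the point is only to show the collision between "love" and "hate" and how few guesses Eve needs.

```python
def toy_hash(password: bytes) -> bytes:
    """Deliberately weak 'hash': keep only the last byte of the password."""
    return password[-1:]


# What the application stores at registration time.
stored = {"alice": toy_hash(b"love"), "bob": toy_hash(b"hate")}

# Both users end up with the same stored value: a hash collision.
assert stored["alice"] == stored["bob"] == b"e"


def brute_force(target: bytes) -> bytes:
    """Try every one-byte password; at most 256 (2^8) attempts are needed."""
    for candidate in range(256):
        guess = bytes([candidate])
        if toy_hash(guess) == target:
            return guess
    raise ValueError("no one-byte password matches")


# Eve finds a password that is accepted for Alice (and for Bob),
# without ever learning that the real passwords were "love" and "hate".
found = brute_force(stored["alice"])
print(found)  # b'e'
```

Note that the password Eve ends up with is "e", not "love": it gets her into this application only.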
Now, my application obviously has a security problem - it is too easy to log in as a different user. But the only thing I need to do to secure it is to use a better hash function - SHA-1 for example - and tell Alice and Bob to set their passwords again, so I can store the new hash values. Now Eve's task of finding collisions is much more complex - 2^61. She probably has a better chance of guessing the passwords using a dictionary instead of using a brute force approach - that is, unless Alice and Bob choose sufficiently strong passwords (random strings of sufficient length).
As you see, there is no encryption involved, and there is no key or other secret to store securely. The security of the application hinges on the complexity of the passwords and the quality of the hash function. And a hash function that has withstood the attacks of the world's best cryptographers (to a degree - SHA-1's 2^61 complexity of collision finding is considered too low for critical applications - see the Wikipedia article) is most likely better than any function I can come up with.
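
The improved design can be sketched in a few lines of Python using the standard hashlib module. This mirrors the simplified scheme from the comment and nothing more; the function names and the in-memory users dictionary are assumptions for illustration, standing in for a real database table.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for the users table.
users = {}


def hash_password(password: str) -> bytes:
    """Return the 20-byte SHA-1 digest of the password."""
    return hashlib.sha1(password.encode("utf-8")).digest()


def register(name: str, password: str) -> None:
    """Store only the hash value, never the password itself."""
    users[name] = hash_password(password)


def check_login(name: str, password: str) -> bool:
    """Hash the supplied password and compare it to the stored hash."""
    stored = users.get(name)
    if stored is None:
        return False
    # compare_digest compares the two byte strings in constant time.
    return hmac.compare_digest(stored, hash_password(password))


register("alice", "love")
print(check_login("alice", "love"))  # True
print(check_login("alice", "e"))     # False
```

Nothing here is ever decrypted and no key is stored anywhere: logging in just means hashing again and comparing.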

Just a final note: for my application, I might find it inconvenient to store the result of SHA-1 directly in the database, because it is a large number and I'd have to use a column of type RAW. So I might decide to convert it to a string and use VARCHAR2. But now I have to worry about character set conversions between client and server, so I cannot just cast the raw result to VARCHAR2. I have to encode it somehow. One way would be to convert the raw (base256) number to a string of hexadecimal digits (base16), but that is quite wasteful: I'd need two bytes of storage for every byte of the original. So I might decide to use the base64 encoding instead, since it is widely used on the Internet and is supported by just about any server or client programming environment, and it has 64 "digits" instead of hex's 16, so it needs only four characters for every three bytes of the original.
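
To make the size difference concrete, here is a small Python sketch comparing the two encodings of the same SHA-1 result. The password "love" is just the example from above, and the variable names are made up.

```python
import base64
import hashlib

# Raw SHA-1 result: 20 bytes, suitable for a RAW column.
digest = hashlib.sha1(b"love").digest()

# Hexadecimal: two characters per byte.
hex_text = digest.hex()

# Base64: four characters per three bytes, padded with '='.
b64_text = base64.b64encode(digest).decode("ascii")

print(len(digest), len(hex_text), len(b64_text))  # 20 40 28

# Both encodings are reversible; they are encodings, not encryption.
assert bytes.fromhex(hex_text) == digest
assert base64.b64decode(b64_text) == digest
```

Both strings contain only plain ASCII characters, so they survive character set conversions between client and server.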

I hope this clears things up a bit and there will be no more offers to decrypt a hash value.