Key Derivation
Password-Based
「 Overview 」
Recommended 「 In Order 」
Argon2id
64+ MiB of RAM
3+ Iterations
1+ Parallelism
Winner of the Password Hashing Competition in 2015, widely used and
recommended now, and very easy to use in libraries like libsodium.
Use as high a memory size as possible and then as many iterations
as possible to reach a suitable delay for your use case, such as:
- 500 ms for server authentication
- 3 to 5 seconds for disk encryption
- 1 second for file encryption
Scrypt
N = 32768
r = 8
p = 1+
The parameters are more confusing and less scalable
than Argon2’s, and it’s susceptible to cache-timing attacks.
However, it’s still a strong algorithm when configured correctly.
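As a minimal sketch of the parameters above, Python’s standard library exposes Scrypt as `hashlib.scrypt`. The helper name `derive_key_scrypt`, the 32-byte output, and the 64 MiB `maxmem` headroom are assumptions for illustration. Scrypt needs roughly 128 × N × r bytes, which is 128 × 32768 × 8 = 32 MiB here, and the default memory limit can be too low for that.

```python
import hashlib
import secrets


def derive_key_scrypt(password: bytes, salt: bytes, length: int = 32) -> bytes:
    """Derive `length` bytes with Scrypt using N = 32768, r = 8, p = 1."""
    return hashlib.scrypt(
        password,
        salt=salt,
        n=32768,
        r=8,
        p=1,
        # About 32 MiB is required, so allow 64 MiB to leave some headroom.
        maxmem=64 * 1024 * 1024,
        dklen=length,
    )


salt = secrets.token_bytes(16)  # random 128-bit salt
key = derive_key_scrypt(b"correct horse battery staple", salt)
```

The same password and salt always give the same key, so the salt has to be stored alongside whatever the key protects.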
Bcrypt
12+ Work Factor
Note that this is not a KDF because the output length cannot be adjusted.
Only use this for password hashing when none of the better algorithms are available.
It’s better than PBKDF2 in terms of resisting GPU/ASIC attacks,
except for long passwords, but it’s trickier to implement correctly
and worse than Argon2 and Scrypt in that it requires much
less memory, with the amount of memory being fixed
rather than adjustable.
Unfortunately, it has a stupid password length limit of 72 bytes (the original
design describes keys of up to 56 bytes, i.e. 55 characters plus a null terminator),
meaning people often prehash the password using something like SHA2 to support
longer passwords.
However, this can lead to password shucking when using a weak hash function
(e.g. MD5, which should never be used for anything anyway), and null bytes in the
raw hash can cut the input short, allowing an attacker to find collisions and
speeding up attacks.
Therefore, you should Base64 encode the prehash before passing it to Bcrypt.
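The prehash step could look something like the sketch below. The choice of SHA-256 is an assumption (its Base64 encoding is 44 bytes, which fits within the 72 byte limit, whereas a Base64 encoded SHA-512 hash would not), and `bcrypt_prehash` is a hypothetical helper name. The result would then be passed to whichever Bcrypt library you use, which is not shown here.

```python
import base64
import hashlib


def bcrypt_prehash(password: str) -> bytes:
    """Return a fixed-length, null-free value to feed into Bcrypt."""
    # The raw digest is 32 bytes and may contain 0x00 bytes.
    digest = hashlib.sha256(password.encode("utf-8")).digest()
    # Base64 output is 44 ASCII bytes and never contains 0x00.
    return base64.b64encode(digest)


prehashed = bcrypt_prehash("a" * 100)
```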
PBKDF2-SHA512
120,000+ iterations
Only use this when none of the better algorithms are available
or due to compatibility constraints, because it can be efficiently
bruteforced using GPUs and ASICs when not using a high iteration count.
Note that it’s generally recommended not to ask for more than the output
length of the underlying hash function: every extra block repeats all the
iterations for the defender, whereas an attacker can often test a guess
using only the first block.
Instead, if that’s required, use PBKDF2 first to get the output length of the
underlying hash function (64 bytes with PBKDF2-SHA512) before calling a
non-password-based KDF, like HKDF-Expand, with the PBKDF2 output as
the input keying material (IKM) to derive more output.
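A sketch of that two-step approach, using only the Python standard library. `hkdf_expand` follows the HKDF-Expand construction from RFC 5869 with HMAC-SHA512, while the function names, the `info` string, and the fixed example salt are assumptions for illustration.

```python
import hashlib
import hmac

HASH_LEN = 64  # SHA-512 output length in bytes


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) instantiated with HMAC-SHA512."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha512).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_long_key(password: bytes, salt: bytes, iterations: int, length: int) -> bytes:
    # Ask PBKDF2 for exactly one hash output (64 bytes), never more.
    prk = hashlib.pbkdf2_hmac("sha512", password, salt, iterations, dklen=HASH_LEN)
    # Stretch that output to the required length with a cheap KDF.
    return hkdf_expand(prk, b"example context", length)


# The salt is fixed here only to keep the example reproducible.
okm = derive_long_key(b"password", b"0123456789abcdef", 120_000, 96)
```

The expensive PBKDF2 iterations run once regardless of how much output is requested.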
Avoid 「 Unordered | All Unsuitable 」
Storing Passwords In Plaintext
This is a recipe for disaster.
If your password database is ever compromised, all your users are screwed,
and your reputation in terms of security will go down the drain as well.
Using Passwords As Keys
Key = Encoding.UTF8.GetBytes( Password )
Firstly, passwords are low in entropy, whereas cryptographic keys need to be high in entropy.
Secondly, not using a password-based KDF with a random salt means attackers can quickly
bruteforce passwords and users using the same password will end up using the same key.
Using Regular / Fast hashes
These are not suitable for password hashing
because they’re not slow, which allows for fast bruteforce attacks.
Password hashing also requires using a salt to protect
against attacks using precomputed hashes and to prevent
the same password always having the same hash.
However, adding a salt to certain regular hash functions, such as SHA-256 and SHA-512,
can lead to length extension attacks, as discussed in point 3 of the Hashing notes.
Encrypting Passwords
Encryption is reversible, whereas hashing is not.
If an attacker compromises a password database and obtains a password
hash, then they still have to crack the hash to learn the password.
By contrast, if an attacker compromises a password database and the relevant
encryption key(s), then they can easily obtain the plaintext passwords.
Encryption would also reveal the password length unless you padded the input.
PBKDF1
Never use this, as it was superseded by PBKDF2 and can only derive keys
of up to 160 bits, which is basically not suitable for anything.
Some implementations, such as PasswordDeriveBytes() in C#, are also completely broken.
SHAcrypt
- It’s weaker than the recommended algorithms
- It’s rarely used outside of Unix crypt(3) password files
- And I’ve never even seen it in a cryptographic library.
PBKDF2 - ( MD5 | SHA1 | SHA256 | SHA384 )
Use SHA512 if you must use PBKDF2
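MD5 and SHA1 are old hash functions that should not be used anymore.
PBKDF2-SHA256 requires significantly more iterations than PBKDF2-SHA512
to be secure and has a smaller block size (64 bytes rather than 128 bytes),
meaning long passwords may get prehashed.
PBKDF2-SHA384 is SHA512 with different initial values and a truncated
output, so it offers no advantage over PBKDF2-SHA512.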
Argon2i
Iterations < 3
Unlike Argon2id and Argon2d, Argon2i has been attacked,
with 3+ iterations being required for the attack to not be efficient
and 11+ iterations being required for the attack to completely fail.
Argon2i is also weaker than both Argon2id and Argon2d
when it comes to resistance against GPU / ASIC cracking.
Therefore, as per the RFC, Argon2id should be used if you do not know the
difference between the types or you consider side-channel attacks to be a
viable threat because Argon2id offers the benefits of both Argon2d
(GPU / ASIC resistance) and Argon2i (side-channel resistance, albeit to a lesser extent).
Chained Hashing
Scrypt( PBKDF2( Password ) )
This just reduces the strength of the stronger algorithm
since it means having worse parameters to get the same total delay.
Notes
「 1 」
Never hard-code passwords into source code
These can be easily retrieved.
「 2 」
Always use a random 128-bit or 256-bit salt
Salts ensure that each password hash is different, which prevents an attacker
from identifying two identical passwords without cracking the hashes.
Moreover, salting defends against attacks that rely on precomputed hashes.
The typical salt size is 128 bits, but 256 bits is also fine for further reassurance that
the salt won’t repeat. Anything above that is excessive, and short salts can lead to
salt reuse and allow for precomputed attacks, which defeats the point of salting.
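To illustrate, the sketch below generates 128-bit salts with Python’s `secrets` module and shows that the same password produces different hashes under different salts. Scrypt from the standard library stands in for whichever password hashing algorithm you use, and `hash_password` is a hypothetical helper.

```python
import hashlib
import secrets


def hash_password(password: bytes, salt: bytes) -> bytes:
    """Hash a password with Scrypt (N = 32768, r = 8, p = 1)."""
    return hashlib.scrypt(password, salt=salt, n=32768, r=8, p=1,
                          maxmem=64 * 1024 * 1024, dklen=32)


# One fresh random 128-bit salt per password, from a CSPRNG.
salt_a = secrets.token_bytes(16)
salt_b = secrets.token_bytes(16)

# Two users with the same password end up with unrelated hashes.
hash_a = hash_password(b"hunter2", salt_a)
hash_b = hash_password(b"hunter2", salt_b)
```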
「 3 」
Always use the highest parameters / delay you can afford
Ideally, use a delay of at least 250 milliseconds,
although in many cases even that is too short.
For instance, PBKDF2 requires a high number of iterations because it’s not resistant
to GPU / ASIC attacks, and if you’re performing a non-interactive operation
(e.g. disk encryption), then you can afford longer delays like 3 to 5 seconds.
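One rough way to pick parameters is to benchmark on the target hardware. The sketch below doubles the PBKDF2-SHA512 iteration count until a single derivation takes at least the target delay. The function name, the doubling strategy, and the benchmark inputs are assumptions, and the same idea applies to the iteration count of the other algorithms.

```python
import hashlib
import time


def calibrate_pbkdf2(target_seconds: float, minimum: int = 120_000) -> int:
    """Return an iteration count whose measured delay reaches the target."""
    iterations = minimum  # never go below the recommended floor
    while True:
        start = time.perf_counter()
        hashlib.pbkdf2_hmac("sha512", b"benchmark password",
                            b"benchmark salt!!", iterations)
        elapsed = time.perf_counter() - start
        if elapsed >= target_seconds:
            return iterations
        iterations *= 2  # too fast, so try again with twice the work


iterations = calibrate_pbkdf2(0.25)  # aim for at least 250 ms
```

The result depends on the machine, so measure on the slowest device you need to support.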
「 4 」
Avoid string password variables
Strings are immutable in many programming languages such as C#,
Java, JavaScript and Go, and thus can’t be zeroed out from memory.
Instead, use a char array if possible and convert that into a byte
array for password hashing / password-based key derivation.
Then erase both arrays from memory after you’ve finished using them.
Note that this is also difficult in many programming languages, as
explained in point 7 of the Symmetric Encryption notes, but attempting
to erase sensitive data from memory is better than doing nothing.
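In Python, `str` and `bytes` are immutable too, so a best-effort sketch uses a `bytearray` and overwrites it afterwards. `wipe` is a hypothetical helper, and this offers no guarantee that the runtime hasn’t made other copies; the derived key below is an immutable `bytes` object that can’t be erased this way.

```python
import hashlib
import secrets


def wipe(buffer: bytearray) -> None:
    """Overwrite a mutable buffer with zeros in place (best effort only)."""
    for i in range(len(buffer)):
        buffer[i] = 0


# A bytearray is mutable, unlike str or bytes, so it can be overwritten.
password = bytearray(b"correct horse battery staple")
salt = secrets.token_bytes(16)
try:
    key = hashlib.scrypt(password, salt=salt, n=32768, r=8, p=1,
                         maxmem=64 * 1024 * 1024, dklen=32)
finally:
    # Erase the password even if key derivation raised an exception.
    wipe(password)
```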
「 5 」
Compare passwords in constant time
If you ever need to compare passwords (e.g. for password re-entry in a console application),
then you should use a constant time comparison function to prevent timing attacks.
Sometimes these functions require both arrays to be equal in length, in which case you can
hash both passwords using a regular hash function (e.g. BLAKE2b-512) for the comparison.
Just erase these values from memory afterwards and don’t use them for anything else.
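A small sketch of this using Python’s standard library, where `hmac.compare_digest` is the constant time comparison and BLAKE2b-512 makes both inputs the same length. The function name `passwords_match` is an assumption.

```python
import hashlib
import hmac


def passwords_match(first: bytes, second: bytes) -> bool:
    """Compare two passwords without leaking where they differ."""
    # Hashing gives both inputs the same 64-byte length.
    a = hashlib.blake2b(first, digest_size=64).digest()
    b = hashlib.blake2b(second, digest_size=64).digest()
    # compare_digest takes the same time wherever the first mismatch occurs.
    return hmac.compare_digest(a, b)


same = passwords_match(b"hunter2", b"hunter2")
different = passwords_match(b"hunter2", b"hunter22")
```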
「 6 」
Use an output length of 256 bits or more
For password storage, a 128-bit hash is normally fine, but a 256-bit
output provides a better security level for high entropy passwords.
For key derivation, you should derive at least a 256-bit output and
perhaps more, depending on whether you need to derive multiple
keys (e.g. a 256-bit encryption key and a 512-bit MAC key).
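For example, deriving both of those keys in one call could look like the sketch below, which requests 96 bytes and splits them. Scrypt from the Python standard library is used as a stand-in, and `derive_keys` is a hypothetical helper.

```python
import hashlib
import secrets


def derive_keys(password: bytes, salt: bytes) -> tuple[bytes, bytes]:
    """Derive a 256-bit encryption key and a 512-bit MAC key."""
    okm = hashlib.scrypt(password, salt=salt, n=32768, r=8, p=1,
                         maxmem=64 * 1024 * 1024, dklen=96)
    # First 32 bytes for encryption, remaining 64 bytes for the MAC.
    return okm[:32], okm[32:]


salt = secrets.token_bytes(16)
encryption_key, mac_key = derive_keys(b"correct horse battery staple", salt)
```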
「 7 」
Always store the parameters with the password hash
Such as Memory Size, Iterations and Parallelism for Argon2.
These values don’t need to be secret and are required to derive the correct hash.
When storing passwords in a database, you should store these values for each user in order
to verify the hashes and transition to stronger parameters over time as hardware improves.
In some cryptographic libraries, this is done for you.
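When the library doesn’t do this for you, one option is to serialise the parameters, salt and hash into a single string. The `$`-separated format below is a made-up layout for illustration, not a standard one, and Scrypt from the Python standard library stands in for the algorithm.

```python
import base64
import hashlib
import hmac
import secrets


def _scrypt(password: bytes, salt: bytes, n: int, r: int, p: int, length: int) -> bytes:
    # Scrypt needs roughly 128 * n * r bytes, so allow about twice that.
    maxmem = 2 * 128 * r * (n + p + 2)
    return hashlib.scrypt(password, salt=salt, n=n, r=r, p=p,
                          maxmem=maxmem, dklen=length)


def hash_for_storage(password: bytes, n: int = 32768, r: int = 8, p: int = 1) -> str:
    """Return 'scrypt$n$r$p$salt$hash' with Base64 encoded salt and hash."""
    salt = secrets.token_bytes(16)
    digest = _scrypt(password, salt, n, r, p, 32)
    fields = ["scrypt", str(n), str(r), str(p),
              base64.b64encode(salt).decode(), base64.b64encode(digest).decode()]
    return "$".join(fields)


def verify(password: bytes, stored: str) -> bool:
    """Recompute the hash using the parameters stored with it."""
    _name, n, r, p, salt_b64, digest_b64 = stored.split("$")
    expected = base64.b64decode(digest_b64)
    actual = _scrypt(password, base64.b64decode(salt_b64),
                     int(n), int(r), int(p), len(expected))
    return hmac.compare_digest(actual, expected)


stored = hash_for_storage(b"hunter2")
```

Because the parameters travel with each hash, old hashes stay verifiable after the defaults are raised.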
By contrast, in a key derivation scenario, you can get away with using
fixed parameters based on a version number stored as a header.
File Format v3 = 256 MiB of RAM + 12 Iterations
Then if you want to change the parameters, you just increment the version number.
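A sketch of such a lookup is shown below. The one-byte version header, the names, and the parallelism of 1 are assumptions; only the v3 memory size and iteration count come from the example above.

```python
from typing import NamedTuple


class KdfParams(NamedTuple):
    memory_mib: int
    iterations: int
    parallelism: int


# Fixed parameters per file format version; never edit an existing entry.
PARAMS_BY_VERSION = {
    3: KdfParams(memory_mib=256, iterations=12, parallelism=1),
}


def params_for_header(header: bytes) -> KdfParams:
    """Look up the KDF parameters from the first header byte."""
    if not header:
        raise ValueError("missing header")
    version = header[0]
    if version not in PARAMS_BY_VERSION:
        raise ValueError(f"unsupported file format version: {version}")
    return PARAMS_BY_VERSION[version]


params = params_for_header(bytes([3]) + b"rest of the file")
```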
「 8 」
Perform Client-side Password Prehashing
For server relief or to hide the plaintext password from the server:
When creating an account, the server can send a random salt to the
client that’s used to perform password hashing on the client’s device.
The server then performs server-side password hashing
on the transmitted password hash using the same salt.
Then the salt and final password hash are stored in the password database.
When logging in, the server sends the stored salt to the client, the client performs
client-side password hashing, the client transmits the password hash to the server,
the server performs server-side password hashing using the stored salt, and then
the server compares the result with the password hash stored in the database.
In the event of a non-existent user, the salt that’s sent should
always be the same for a given username so that an attacker can’t
tell whether the account exists, which involves using a MAC
(e.g. keyed BLAKE2b-512), with the username as the message.
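The flow above can be sketched as follows, with the client and server collapsed into one script. The in-memory `users` dictionary, the function names, the server secret, and truncating the MAC to 16 bytes are all assumptions for illustration, and Scrypt stands in for the password hashing algorithm on both sides.

```python
import hashlib
import hmac
import secrets

SERVER_SALT_KEY = secrets.token_bytes(32)  # hypothetical secret kept on the server
users: dict[str, tuple[bytes, bytes]] = {}  # username -> (salt, final hash)


def slow_hash(data: bytes, salt: bytes) -> bytes:
    return hashlib.scrypt(data, salt=salt, n=32768, r=8, p=1,
                          maxmem=64 * 1024 * 1024, dklen=32)


def salt_for_login(username: str) -> bytes:
    """Salt sent to the client: stored for real users, derived otherwise."""
    if username in users:
        return users[username][0]
    # Keyed BLAKE2b-512 over the username gives a stable fake salt.
    mac = hashlib.blake2b(username.encode("utf-8"), key=SERVER_SALT_KEY,
                          digest_size=64).digest()
    return mac[:16]


def register(username: str, client_hash: bytes, salt: bytes) -> None:
    # The server hashes the transmitted hash again with the same salt.
    users[username] = (salt, slow_hash(client_hash, salt))


def login(username: str, client_hash: bytes) -> bool:
    if username not in users:
        return False
    salt, stored = users[username]
    return hmac.compare_digest(slow_hash(client_hash, salt), stored)


# Account creation: the server sends a salt, the client hashes and replies.
salt = secrets.token_bytes(16)
register("alice", slow_hash(b"hunter2", salt), salt)

# Login: the client fetches the salt, hashes locally, and sends the result.
ok = login("alice", slow_hash(b"hunter2", salt_for_login("alice")))
```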
「 9 」
Don’t use padding to hide the length of a password when sending it to a server
Instead, perform client-side password hashing if possible (please see point 8 above).
If that’s not possible, then you should hash the password using a regular
hash function, with the largest possible output length (BLAKE2b-512), on
the client’s device, transmit the hash to the server, and perform server-side
password hashing, using the transmitted hash as the password.
Both techniques ensure that the amount of data transmitted is constant and
prevent the server effortlessly obtaining a copy of the password, but client-side
password hashing (point 8) should be preferred as it allows for more secure
password hashing parameters and provides additional security if the server
leaks or stores what the client sent, since a slow client-side hash is much
harder to crack than a regular / fast hash of the password.
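The fallback with a regular hash function can be as small as the sketch below, which always produces 64 bytes regardless of the password length. The function name is an assumption.

```python
import hashlib


def client_fast_prehash(password: str) -> bytes:
    """Hash the password on the client so a fixed 64 bytes are transmitted."""
    return hashlib.blake2b(password.encode("utf-8"), digest_size=64).digest()


short = client_fast_prehash("abc")
long = client_fast_prehash("a much longer passphrase " * 20)
```

The server would then treat the received value as the password for server-side password hashing.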
「 10 」
Use rate limiting to prevent denial of service (DoS) and bruteforce attacks
This involves temporarily blocking certain IP addresses and usernames
from trying to log in to prevent the server being overwhelmed
and to prevent attackers from bruteforcing passwords.
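A minimal in-memory sketch of this idea is shown below. The class name, the thresholds, and keying on a plain string (an IP address or a username) are assumptions, and a real deployment would likely need shared storage across servers.

```python
import time
from collections import defaultdict
from typing import Optional


class LoginRateLimiter:
    """Temporarily block a key after too many recent failed logins."""

    def __init__(self, max_failures: int = 5, lockout_seconds: float = 300.0):
        self.max_failures = max_failures
        self.lockout_seconds = lockout_seconds
        self._failures = defaultdict(list)  # key -> timestamps of failures

    def record_failure(self, key: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._failures[key].append(now)

    def is_blocked(self, key: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Only failures inside the lockout window count.
        recent = [t for t in self._failures[key] if now - t < self.lockout_seconds]
        self._failures[key] = recent
        return len(recent) >= self.max_failures


limiter = LoginRateLimiter(max_failures=3, lockout_seconds=60.0)
for _ in range(3):
    limiter.record_failure("203.0.113.7", now=0.0)
```

Checking `is_blocked` for both the IP address and the username before running the password hash also avoids spending that work on blocked clients.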
「 11 」
If a user can supply very long passwords, then this can lead to denial of service attacks
This happened to Django in 2013.
To fix this, either enforce a password length limit (e.g. a maximum of 128 characters)
or prehash passwords using a regular / fast hashing algorithm, with the highest
possible output length (e.g. BLAKE2b-512), before performing password hashing.
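Both options are sketched below as two small helpers. The names and the error message are assumptions, and you would pick one approach rather than use both.

```python
import hashlib

MAX_PASSWORD_CHARS = 128


def enforce_length_limit(password: str) -> str:
    """Option 1: reject overly long passwords before any hashing happens."""
    if len(password) > MAX_PASSWORD_CHARS:
        raise ValueError("password is longer than 128 characters")
    return password


def bound_input_size(password: str) -> bytes:
    """Option 2: prehash so the slow hash always sees 64 bytes of input."""
    return hashlib.blake2b(password.encode("utf-8"), digest_size=64).digest()


accepted = enforce_length_limit("a" * 128)
bounded = bound_input_size("a" * 1_000_000)
```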
「 12 」
Hash-then-Encrypt for additional security when storing passwords
You can use a password hashing algorithm on the password before
encrypting the salt and password hash using an AEAD or Encrypt-then-MAC,
with a secret key stored separately from the password database.
This forces an attacker to decrypt the password hashes before trying to crack them.
Furthermore, it means that if your secret key is ever compromised but the
password hashes are not, then you can decrypt all the stored password
hashes and re-encrypt them using a new key, which is easier than resetting
every user’s password in the event of a pepper being compromised.
「 13 」
Use a pepper for additional security when deriving keys
A pepper is essentially a secret key that’s mixed with the password using a MAC,
e.g. HMAC-SHA512( Message : Password , Key : Pepper ), before password hashing.
In the case of password storage, using Hash-then-Encrypt (point 12)
makes more sense for the reason explained there.
By contrast, for key derivation, using a pepper is a great idea if possible because
it means an additional secret is required, making a bruteforce more difficult.
For instance, a keyfile in file / disk encryption software acts as a pepper,
which improves the security of the key derivation assuming that the keyfile
is stored correctly (e.g. on an encrypted memory stick away from the encrypted file / disk).
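As a sketch of the key derivation case, the pepper is applied with HMAC-SHA512 and the result is passed to the password-based KDF. Scrypt from the Python standard library is used as a stand-in, the function name is an assumption, and the random pepper below plays the role of the keyfile contents.

```python
import hashlib
import hmac
import secrets


def derive_key_with_pepper(password: bytes, pepper: bytes, salt: bytes) -> bytes:
    """Mix in the pepper with HMAC-SHA512, then run the password-based KDF."""
    # HMAC-SHA512( Message : Password , Key : Pepper )
    peppered = hmac.new(pepper, password, hashlib.sha512).digest()
    return hashlib.scrypt(peppered, salt=salt, n=32768, r=8, p=1,
                          maxmem=64 * 1024 * 1024, dklen=32)


pepper = secrets.token_bytes(32)  # e.g. read from a keyfile
salt = secrets.token_bytes(16)
key = derive_key_with_pepper(b"correct horse battery staple", pepper, salt)
```

Without the pepper, the correct password and salt still produce a different key, which is what makes a bruteforce more difficult.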