
Symmetric Encryption

Overview





XChaCha20-BLAKE2b

Encrypt ➜ MAC

If you know what you are doing, then implementing Encrypt ➜ MAC offers better security than
an AEAD because it provides additional properties, such as key commitment, and allows for a
longer authentication tag, making it more suitable for long-term storage.

This combo is now being employed by PASETO, an alternative
to JWT, as well as my file encryption software called Kryptor.

ChaCha20 has a higher security margin than AES whilst also being fast in software
and constant time, meaning it’s not vulnerable to timing attacks like AES can be.

Moreover, Salsa20, the cipher ChaCha20 was based on, underwent rigorous
analysis as part of the eSTREAM competition, and both ChaCha20 and Salsa20
have also received further analysis since then.


XChaCha20-Poly1305

This is the gold standard for when you don’t know how to implement
Encrypt ➜ MAC or need maximum performance on all devices.

As mentioned above, ChaCha20 has a higher security margin than AES, always runs in constant
time, and (X)ChaCha20-Poly1305 is faster than AES-GCM without AES-NI hardware support.

Note that XChaCha20-Poly1305 should be favored over regular
ChaCha20-Poly1305 in many cases because it allows for random
nonces, which helps prevent nonce reuse (please see note 1).

If you just need a counter nonce or intend to use a unique key for encryption each time,
then ChaCha20-Poly1305 is fine. Unfortunately, there are two ChaCha20-Poly1305
constructions - the original ChaCha20-Poly1305 and ChaCha20-Poly1305-IETF.

The original construction is better because it has a smaller nonce, meaning
it doesn’t encourage unsafe random nonces, and a larger internal counter,
meaning it can encrypt more data using the same key and nonce pair
(please see note 5), but the IETF variant is more popular.


( AES-CTR | CBC ) ➜ HMAC

Encrypt ➜ MAC

Again, if you know what you are doing, this is superior to using an
AEAD in terms of security for the reasons outlined in point 1 above.

AES-CTR should be preferred because AES-CBC is less efficient,
requires padding, and doesn’t support a counter nonce.

However, both AES-CTR-then-HMAC and AES-CBC-then-HMAC
can be faster than AES-GCM without AES-NI hardware support.

With that said, generating an IV for CBC and CTR can be a source of trouble,
with CBC requiring unpredictable (aka random) IVs and CTR implementations
differing in terms of nonce size and whether a random / counter nonce is safe.


AES-GCM

The industry standard despite it not being the best and receiving some criticism.

It’s easier to use correctly than Encrypt-then-MAC and, with AES-NI hardware support,
faster than (X)ChaCha20-BLAKE2b, (X)ChaCha20-Poly1305, and AES-CTR-then-HMAC/AES-CBC-then-HMAC.

However, it has a weird nonce size (96 bits) that means you should use a counter nonce,
and some implementations incorrectly allow 128-bit nonces. Only use a 96-bit nonce since
longer nonces get hashed, which could result in multiple nonces producing some of the
same AES-CTR output.

In addition, reusing a nonce is more catastrophic than in AES-CBC, for example, and there
are relatively small max encryption limits (e.g. ~350 GiB for a single key when using
16 KiB long messages).

Furthermore, there can be side-channels in software implementations
and mitigating them reduces the speed of the algorithm.

Therefore, AES-GCM should only be used when there’s hardware support,
although I strongly recommend the above algorithms instead regardless.




Avoid 「 Unordered | All Unsuitable 」


Your own custom symmetric encryption algorithm

Even experienced cryptographers design insecure algorithms, which
is why cryptographic algorithms are thoroughly analysed by a large
number of cryptanalysts, usually as part of a competition.


AES-ECB

Identical plaintext blocks get encrypted into identical ciphertext blocks,
which means the algorithm lacks diffusion and fails to hide data patterns.

In other words, it’s horribly insecure.


RC4

There are lots of attacks against it, rendering it horribly insecure.


AES-CBC 〕〔 AES-CTR 〕〔 ChaCha20

And other unauthenticated ciphers without a MAC

This allows an attacker to tamper with the ciphertext without detection and can
sometimes allow for other attacks, like padding oracle attacks in the case of AES-CBC.


DES 〕〔 GOST 〕〔 Blowfish 〕〔 3DES
RC2 〕〔 IDEA 〕〔 CAST-128

And Any Other 64-bit Block Cipher

A 64-bit block size means collision attacks can occur after
encrypting a certain amount of data using the same key.

Don’t use any algorithm with a block size less than 128 bits.

Algorithms like DES and 3DES are also very old
and have small key sizes that aren’t secure.


One-time pad

Completely impractical since the key needs to be the same size as the message,
and a true random number generator (e.g. one based on atmospheric noise) is required
to generate the keystream for it to be impossible to decrypt.

Furthermore, some people think an XOR cipher with a repeating key is equivalent
to a one-time pad, which is completely false. Repeating the key is horribly insecure.

Never do this.


AES - ( CCM | EAX | CFB | OCB ) 〔 Twofish 〕〔 Threefish
RC6 〕 〔 ARIA 〕 〔 SEED 〕 〔 Serpent 〕 〔 Camellia

And Other Ciphers Nobody Uses

Very few people use these because they’re worse in one way or another, except for AES-OCB,
which has very good performance but is almost never used because it was patented until recently.

For example, AES-CCM uses MAC-then-Encrypt and CBC-MAC, AES-EAX is slower than AES-GCM
and uses OMAC, some of them are unbalanced in terms of security to performance (Serpent
is slow whilst having a high security margin), some have received limited cryptanalysis,
and implementations of uncommon non-AES algorithms are very rare in mainstream
cryptographic libraries, with random implementations found on GitHub being less
likely to be secure because these types of algorithms can be hard to implement correctly.


AES - ( XTS | XEX | LRW | CMC | EME )

and other wide block/disk encryption only modes

These are not suitable for encrypting data in transit.

They should only be used for disk encryption, with AES-XTS being preferred since it’s popular,
more secure than some other disk encryption modes, and less malleable than AES-CBC and AES-CTR
(tampering causes random, unpredictable changes to the plaintext).

Ordinary authentication using an AEAD or Encrypt ➜ MAC cannot be used for disk encryption
because it would require extra storage and slow down read / write speeds, among other things.


AES - ( GCM-SIV | SIV )

These don’t provide unlimited protection against nonce reuse, they’re slower
than regular AES-GCM, they’re rarely available in cryptographic libraries, they
rely on MAC ➜ Encrypt, and AES-SIV uses CMAC.

If you need nonce-misuse resistance, then you should ideally use XChaCha20 ➜ MAC
or XChaCha20-Poly1305 with a randomly generated nonce or a nonce derived alongside
a subkey for encryption using a KDF or MAC, as described here and here.

If this isn’t possible for some reason, then use AES-GCM-SIV.


(X)Salsa ( 20 | 20-Poly1305 )

There’s no reason to use these when (X)ChaCha20
has better diffusion and performance.

However, (X)Salsa20 is still very secure.

Also, as mentioned in point 4, you shouldn’t use (X)Salsa20
on its own (without a MAC) because authentication is extremely important.




Notes


1

Never reuse a nonce / IV with the same key (never hardcode a nonce / IV)

Doing so is catastrophic to security.

You must either use a counter nonce, a KDF generated nonce / IV, or a
randomly generated nonce / IV, depending on the algorithm you’re using.

For instance, you should use a counter nonce (starting with 12 bytes of zeroes)
with ChaCha20-Poly1305 and AES-GCM because the small nonce size
(64 or 96-bits) means random nonces are not safe unless you’re
encrypting a small amount of data per key, but you can use a random
or counter nonce safely with XChaCha20-Poly1305 (192-bits).

Then AES-CBC requires an unpredictable (aka random) 128-bit IV, and some
implementations of AES-CTR need a random nonce too, although most involve
using a 64 or 96-bit counter nonce or you have the same problem as with AES-GCM.

Note that if you always rotate the key before encrypting (never encrypting anything with
the same key more than once), then you can get away with using a nonce full of zeroes
(12 bytes of zeroes for AES-GCM), but I generally wouldn’t recommend doing this,
especially if you have to use a 128-bit key, which I again don’t recommend
(please see the Symmetric Key Size section), since this can lead to multi-target attacks.
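
The two nonce strategies described in this note can be sketched as follows, using only the
Python standard library. The function names are illustrative, and the little-endian counter
encoding is an assumption; any fixed encoding works as long as it is used consistently.

```python
import secrets

def counter_nonce(counter: int, size: int = 12) -> bytes:
    """Encode a message counter as a fixed-size nonce (0 gives all zeroes)."""
    if counter < 0 or counter >= 1 << (8 * size):
        # Wrapping around would repeat a nonce, so refuse and rotate the key.
        raise ValueError("counter out of range for this nonce size")
    return counter.to_bytes(size, "little")

def random_nonce(size: int = 24) -> bytes:
    """Random nonce from the OS CSPRNG; 24 bytes (192 bits) suits XChaCha20."""
    return secrets.token_bytes(size)
```

The counter has to be persisted or otherwise tracked per key; losing track of it and
restarting from zero under the same key is exactly the reuse this note warns about.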


2

Prepend the nonce / IV to the ciphertext

This is the recommended approach because it’s read
before the ciphertext and doesn’t need to be kept secret.

However, if you’re performing key wrapping (encrypting a key using another key),
as described in point 6 below, then you could encrypt the nonce / IV too as an
additional layer of protection.
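
As a rough illustration of prepending the nonce, the sketch below packs and unpacks a
message. NONCE_SIZE and both function names are assumptions made for the example; the
ciphertext is treated as opaque bytes produced by whichever algorithm you chose.

```python
NONCE_SIZE = 24  # assumption: a 192-bit XChaCha20 nonce

def pack(nonce: bytes, ciphertext: bytes) -> bytes:
    """Store the nonce in front of the ciphertext (the nonce is not secret)."""
    if len(nonce) != NONCE_SIZE:
        raise ValueError("unexpected nonce size")
    return nonce + ciphertext

def unpack(blob: bytes) -> tuple:
    """Split a stored message back into (nonce, ciphertext)."""
    if len(blob) < NONCE_SIZE:
        raise ValueError("input too short to contain a nonce")
    return blob[:NONCE_SIZE], blob[NONCE_SIZE:]
```

Because the nonce has a fixed size, no length field is needed to find where the ciphertext starts.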


3

Never use string variables for keys, nonces, IVs, and passwords

These parameters should always be byte arrays.

String keys are just passwords, meaning they’re not suitable for use as keys directly.
Please see the Password-Based Key Derivation section.

Furthermore, strings are immutable (unchangeable) in many programming
languages (C#, Java, JavaScript, Go, .. ), meaning they can’t be zeroed out
from memory (please see point 7 below).


4

Avoid encryption functions / APIs that include a password parameter

These often use dated or insecure password-based KDFs that shouldn’t be used.

Instead, use one of the recommended password-based KDFs yourself to derive an
encryption key for an AEAD or an encryption key and MAC key for Encrypt ➜ MAC.


5

AEADs often have limits on the amount of data they can safely encrypt using a single key

For AES-GCM, you can encrypt ~64 GiB using a key and nonce
pair and ~350 GiB (assuming 16 KiB messages) with a single key.

For ChaCha20-Poly1305-IETF, you can encrypt 256 GiB using a key and nonce pair.

XChaCha20-Poly1305 and the original ChaCha20-Poly1305
constructions have no practical limit (2^64+ bytes).

Make sure you follow the recommendations below to ensure that these limits are never reached.
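
The per key and nonce pair figures above follow from the size of each algorithm's internal
block counter. A quick arithmetic check, assuming the 32-bit counters and reserved blocks
defined in the AES-GCM and RFC 8439 specifications:

```python
GIB = 2**30

# AES-GCM: 32-bit block counter, 16-byte blocks, two counter values reserved.
aes_gcm_max_bytes = (2**32 - 2) * 16

# ChaCha20-Poly1305-IETF: 32-bit block counter, 64-byte blocks,
# with block 0 reserved for deriving the Poly1305 key.
chacha20_ietf_max_bytes = (2**32 - 1) * 64

print(aes_gcm_max_bytes / GIB)        # just under 64
print(chacha20_ietf_max_bytes / GIB)  # just under 256
```

The ~350 GiB per-key figure for AES-GCM comes from a security bound rather than from the
counter size, so it is not reproduced here.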


6

Ideally, use a new key for each message (except when chunking the same message)

This helps prevent cryptographic wear-out (using a single key to encrypt too much data),
nonce reuse, and reusing keys with multiple algorithms whilst being beneficial for security
in that a compromise of one key doesn’t compromise data encrypted under different keys.

One common way of doing this is to randomly generate a unique data encryption key
(DEK) for each message, encrypt the DEK using a key encryption key (KEK) derived using
a key derivation function (KDF), and then prepend the encrypted DEK to the ciphertext.

You can then decrypt the DEK and use it to decrypt the ciphertext.

Alternatively, you can derive unique keys using a random salt with a KDF, although this
is inefficient when using a password-based KDF since it means a delay for every message.
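
A minimal sketch of the salt-based alternative, assuming a high-entropy master key rather
than a password. Keyed BLAKE2b with a salt and personalisation string stands in for the KDF
here; that choice, the function name, and the context label are all assumptions made for
illustration.

```python
import hashlib
import secrets

def derive_message_key(master_key: bytes, salt: bytes) -> bytes:
    """Derive a 32-byte per-message key from a high-entropy master key."""
    return hashlib.blake2b(
        key=master_key,             # up to 64 bytes for BLAKE2b
        salt=salt,                  # up to 16 bytes for BLAKE2b
        person=b"example-msg-key",  # hypothetical context label, max 16 bytes
        digest_size=32,
    ).digest()

master_key = secrets.token_bytes(32)
salt = secrets.token_bytes(16)      # random per message, stored with the ciphertext
message_key = derive_message_key(master_key, salt)
```

The salt is not secret, so it can be prepended to the ciphertext in the same way as a nonce.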


7

Erase secret keys from memory as soon as possible

Once you’ve finished using a secret key, it should be zeroed out from memory to prevent
an attacker with physical or remote access to a machine being able to retrieve it.

Note that in garbage collected programming languages, such as C#, Go, and JavaScript,
this is difficult to achieve because the garbage collector can copy secrets around in memory.

However, attempting to erase sensitive data from memory is better than doing nothing.
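
A best-effort sketch of zeroing a key in Python, where the wipe helper is a hypothetical
name. It only clears the one mutable buffer it is given; the temporary bytes object
returned by token_bytes, and any copies the runtime made, are outside its reach.

```python
import secrets

def wipe(buffer: bytearray) -> None:
    """Overwrite a mutable buffer in place with zeroes (best effort)."""
    for i in range(len(buffer)):
        buffer[i] = 0

key = bytearray(secrets.token_bytes(32))  # mutable, unlike bytes or str
try:
    pass  # ... use the key here ...
finally:
    wipe(key)  # runs even if the code using the key raises
```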


8

Encrypt large amounts of data in (16 - 64 KiB) chunks

This lowers memory usage, reduces attack boundaries for AEADs,
allows for more encryptions under the same key with AEADs, means
that a corruption in a ciphertext might only affect one chunk rather
than rendering the entire message unrecoverable, and enables the
detection of tampered chunks before an entire message is sent in
an online scenario.

However, this is tricky to get right because you need to add and remove
padding in the last chunk (using an encrypted header to store the length
of padding or a padding scheme, as explained in point 13 below) and
prevent chunks from being truncated (using the total ciphertext length
as additional data), reordered, duplicated, or removed (using a counter
nonce that’s incremented for each chunk), so you should ideally use or
replicate an existing API, like secretstream() in Libsodium.
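
To make the bookkeeping concrete, here is a rough sketch of how chunks, counter nonces, and
additional data might be lined up before being handed to an AEAD. No encryption happens
here, padding of the last chunk is left out, and the plaintext length stands in for the
total ciphertext length, which a real format would compute from the padded chunks and tags.
The function name and encodings are assumptions.

```python
CHUNK_SIZE = 16 * 1024  # 16 KiB, the lower end of the range above

def plan_chunks(message: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (nonce, additional_data, chunk) for each chunk of a message."""
    # The same total length goes into every chunk's additional data,
    # so dropping chunks from the end changes what gets authenticated.
    total = len(message).to_bytes(8, "little")
    for index, offset in enumerate(range(0, len(message), chunk_size)):
        # Counter nonce: a reordered or duplicated chunk fails verification
        # because it would be checked against the wrong nonce.
        nonce = index.to_bytes(12, "little")
        yield nonce, total, message[offset:offset + chunk_size]
```

Since the counter restarts at zero for every message, this layout is only safe with a
unique key per message. An empty message yields no chunks, which a real format would
need to handle explicitly.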


9

Don’t just use a standardised AEAD (AES-GCM, ChaCha20-Poly1305, .. )
if you’re performing password-based encryption in an online scenario

AEADs are not key committing, meaning they are susceptible to
partitioning oracle attacks, which speed up password recovery.

To solve this problem, you can either use Encrypt ➜ MAC following the instructions
later on in this Notes section, or you can apply a fix whilst still using an AEAD.

Note that both methods will be slower than not having key commitment, but it’s
important to prevent this attack and the other issues explained in point 10 below.

The fix I’d recommend involves deriving an encryption key and a MAC key using a KDF,
encrypting the message using an AEAD with the encryption key, retrieving the
authentication tag from the end of the ciphertext, and prepending a MAC of the
encryption key, nonce, and AEAD authentication tag to the ciphertext.

HMAC( Message : Encryption Key | Nonce | Tag , Key : Mac Key )


For decryption, you derive the encryption key and MAC key again and verify the MAC
in constant time (see point 17 below) before decrypting the message using the AEAD.

An example of this fix can be found here.
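
The MAC computation in this fix can be sketched with the standard library alone. The AEAD
itself is out of scope, so the nonce and AEAD tag are taken as inputs. HMAC-SHA-512 and
both function names are assumptions; the note above only specifies HMAC.

```python
import hashlib
import hmac

def commitment_tag(mac_key: bytes, enc_key: bytes, nonce: bytes, aead_tag: bytes) -> bytes:
    """HMAC over Encryption Key | Nonce | Tag, keyed with the MAC key."""
    # All three inputs have a fixed length for a given AEAD, so plain
    # concatenation is unambiguous here.
    return hmac.new(mac_key, enc_key + nonce + aead_tag, hashlib.sha512).digest()

def verify_commitment(mac_key: bytes, enc_key: bytes, nonce: bytes,
                      aead_tag: bytes, received: bytes) -> bool:
    """Recompute the MAC and compare in constant time before decrypting."""
    expected = commitment_tag(mac_key, enc_key, nonce, aead_tag)
    return hmac.compare_digest(expected, received)
```

If verification fails, the AEAD decryption should not be attempted at all.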


10

Standardised AEADs (AES-GCM, ChaCha20-Poly1305, AES-GCM-SIV, .. )
aren’t key or message committing

The lack of key commitment means that a ciphertext can be
decrypted using multiple keys to different but valid plaintexts.

This won’t reveal the original message, but it could result in the corruption of data or
different plaintexts that appear to be valid file formats, which is not what you want.

The AWS Encryption SDK now recommends and defaults to using key commitment.

As explained in point 9 above, this can especially cause
problems when performing password-based encryption.

To fix this problem, you should either use Encrypt ➜ MAC
instead or apply the fix for AEADs outlined above.

The lack of message commitment means that an attacker who knows
the key can find other messages that have the same tag, allowing them
to trick two parties using the same key into believing that they received
the same message when they actually received different messages.


11

Make use of the additional data parameter in AEADs

This parameter is useful for binding context information to a ciphertext
and preventing issues like replay attacks and confused deputy attacks.

It’s often used to authenticate things like headers,
version numbers, timestamps, and message counters.

Note that additional data is not part of the ciphertext; it’s just
information included in the computation of the authentication tag.

You either need to store additional data securely in some sort of database
(in the case of a user’s email address being used as additional data) or be
able to reproduce the additional data when it’s time for decryption
(using a file name as additional data).


12

If an attacker knows the encryption key, then they can still decrypt
an AEAD encrypted message without knowing the additional data

For example, they can use AES-CTR with the key to decrypt an AES-GCM
encrypted message, ignoring the authentication tag and additional data.


13

Pad messages before encryption if you want to hide their length

Stream ciphers, such as ChaCha20 and AES-CTR (used in AES-GCM), don’t perform
any padding, meaning the ciphertext is the same length as the plaintext.

This generally isn’t a concern for most applications, but when it is,
you should use ISO/IEC 7816-4 or PADME padding on the message
before encryption and remove the padding after decryption.

This padding algorithm is more resistant to some types of attacks than
other padding algorithms and always reversible, unlike zero padding.

Such padding can be randomised or deterministic, with both techniques having pros and cons.

Encrypting data in chunks, as described in point 8 above, is an example of deterministic
padding since the last chunk will always be padded to the size of a chunk.
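
A minimal sketch of ISO/IEC 7816-4 padding: a 0x80 byte followed by as many zero bytes as
needed to reach a multiple of the block size. The function names are illustrative, and this
version makes no attempt to run in constant time, so it assumes the padding is only removed
after the ciphertext has been authenticated.

```python
def pad(message: bytes, block_size: int) -> bytes:
    """Append 0x80, then zero bytes, up to the next multiple of block_size."""
    if block_size < 1:
        raise ValueError("block_size must be positive")
    padding_length = block_size - (len(message) % block_size)
    return message + b"\x80" + bytes(padding_length - 1)

def unpad(padded: bytes) -> bytes:
    """Remove the trailing zero bytes and the 0x80 marker."""
    stripped = padded.rstrip(b"\x00")
    if not stripped.endswith(b"\x80"):
        raise ValueError("invalid padding")
    return stripped[:-1]
```

A message that is already a multiple of the block size gains a full extra block, which is
what keeps the padding reversible even when the message itself ends in zero bytes.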


14

Stick to Encrypt ➜ MAC

Don’t MAC ➜ Encrypt or Encrypt + MAC because both can be susceptible to
attacks, whereas Encrypt ➜ MAC is always secure when implemented correctly.

Encrypt ➜ MAC is the standard approach and is what’s used in non-SIV (aka most) AEADs.

The only exception to this rule is when implementing an SIV AEAD to have
nonce-misuse resistance, but you should ideally let a library do that for you.


15

Always use separate keys for authentication and encryption

This is considered good practice, even though reusing the same key can be theoretically fine.

In the case of a password-based KDF, this can be done by using a larger output
length (96 bytes) and splitting the output into two keys (256-bit and 512-bit).

In the case of a non-password-based KDF, you can use the KDF
twice with the same input keying material but different context
information, salts, and output lengths for domain separation.

Please check the Symmetric Key Size section for MAC key size.
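
The password-based case can be sketched as below. scrypt from the standard library is used
only as a stand-in for whichever password-based KDF you have chosen, and its parameters
here are illustrative rather than a recommendation.

```python
import hashlib
import secrets

def derive_keys(password: bytes, salt: bytes) -> tuple:
    """Derive 96 bytes and split them into an encryption key and a MAC key."""
    output = hashlib.scrypt(password, salt=salt, n=2**14, r=8, p=1, dklen=96)
    encryption_key = output[:32]  # 256-bit key for the cipher
    mac_key = output[32:]         # 512-bit key for the MAC
    return encryption_key, mac_key

salt = secrets.token_bytes(16)    # random, stored alongside the ciphertext
encryption_key, mac_key = derive_keys(b"correct horse battery staple", salt)
```

Requesting one longer output costs a single KDF run, whereas calling a password-based KDF
twice would double the delay.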


16

Always MAC the nonce / IV (and everything in the message – file headers too)

If you fail to authenticate the nonce / IV, then an attacker can tamper with it undetected.

AEADs always authenticate the nonce for this reason.


17

Always compare secrets and MACs in constant time

If you don’t compare the authentication tags in constant time, then this can lead
to timing attacks that allow an attacker to calculate a valid tag for a forged message.

Libraries like Libsodium have constant time comparison functions that you can use to prevent this.
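
In Python, for example, the standard library provides hmac.compare_digest for this purpose.
The wrapper name below is illustrative.

```python
import hmac

def tags_match(expected_tag: bytes, received_tag: bytes) -> bool:
    """Compare two MACs without leaking where they first differ."""
    # A plain == comparison may return as soon as it finds a mismatch,
    # which is the timing difference an attacker can measure.
    return hmac.compare_digest(expected_tag, received_tag)
```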


18

Concatenating multiple variable length parameters when using a MAC can lead to attacks

HMAC (
    Message : Additional Data | Ciphertext
    Key : Mac Key
)

Please see point 5 of the Message Authentication Codes notes.
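
To show why the construction above is a problem, the sketch below compares it with one
common mitigation: appending the length of each input as a fixed-size integer. This
mitigation is an assumption for illustration and is not necessarily the one the referenced
notes recommend; HMAC-SHA-512 and the function names are likewise illustrative.

```python
import hashlib
import hmac

def naive_tag(mac_key: bytes, additional_data: bytes, ciphertext: bytes) -> bytes:
    # Ambiguous: the boundary between the two inputs is not authenticated.
    return hmac.new(mac_key, additional_data + ciphertext, hashlib.sha512).digest()

def unambiguous_tag(mac_key: bytes, additional_data: bytes, ciphertext: bytes) -> bytes:
    # The two 64-bit lengths at the end pin down where each input starts and stops.
    lengths = (len(additional_data).to_bytes(8, "little")
               + len(ciphertext).to_bytes(8, "little"))
    return hmac.new(mac_key, additional_data + ciphertext + lengths,
                    hashlib.sha512).digest()
```

With the naive version, additional data "ab" with ciphertext "c" produces the same tag as
additional data "a" with ciphertext "bc", so an attacker can shift bytes between the two
fields without invalidating the tag. The length-suffixed version gives different tags for
those two inputs.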


19

Cipher Agility Is Harmful

Less is more in the case of supporting multiple ciphers / algorithms because more
choices mean more can go wrong, which is one reason why WireGuard is regarded
as superior to OpenVPN and TLS 1.3 supports fewer algorithms than TLS 1.2.

Cipher agility has caused serious problems, like in the case of JWTs.

Also, in the case of programs like GPG and VeraCrypt,
customization can allow the user to worsen their security.

Therefore, choose one secure Encrypt ➜ MAC
combo or AEAD recommended above, and that’s it.

If the algorithm you chose gets broken, which is extremely unlikely
if you’re following these guidelines, then you can just increment the
protocol / format version number and switch to a different algorithm.


20

Cascade Encryption Is Unnecessary

Although I’ve written a cascade encryption library based on TripleSec called DoubleSec,
cascade encryption is significantly slower and solves a problem that pretty much
doesn’t exist because algorithms like ChaCha20 and AES are nowhere near broken
and other issues are more likely to cause problems.

Furthermore, it’s a hassle to implement yourself compared to
using a single algorithm, with more things that can go wrong.

Therefore, unless you’re extremely paranoid (in an Edward Snowden type situation)
and don’t care about speed at all, please don’t bother.




Discussion


Not everyone will agree with my recommendation to use
Encrypt ➜ MAC over AEADs when possible for the following reasons:


1

It’s easier to implement an AEAD

You don’t need to worry about deriving separate keys, appending and
removing the tag, and comparing authentication tags in constant time.

AEADs also make it easy to use additional data in the calculation of the tag.

This should mean fewer mistakes.


2

AEADs are typically faster

AES-GCM with AES-NI instruction set support is very fast,
AES-OCB is even faster, and ChaCha20-Poly1305 is also
fast without the reliance on hardware support.


3

It’s easier to chunk data with an AEAD

Encrypt ➜ MAC normally involves encrypting all the data in one go
and appending one authentication tag at the end, which requires
loading the entire message into memory and means a corruption
renders the entire message unrecoverable.

Whilst you can also do this with AEADs, it’s recommended to chunk messages, as explained
in point 8 of the Notes, meaning the ciphertext contains multiple authentication tags.

This is trickier with Encrypt ➜ MAC unless you’re using a library that offers it as a function.


My Response To These Arguments


1

Yes, AEADs are simpler, which is exactly why we need committing AEADs and Encrypt ➜ MAC
implementations to be standardized and included in cryptographic libraries.

Unfortunately, this isn’t happening because everyone is busy promoting non-committing AEADs.


2

Whilst this is often true, except for AES-GCM without AES-NI support, Encrypt ➜ MAC,
especially using MACs like BLAKE2b and BLAKE3, is not slow enough for this to be
considered a serious problem, particularly in non-interactive / offline scenarios or
when dealing with long-term storage.

In fact, using BLAKE3 with a large enough amount of data can be faster than Poly1305 and GMAC.

Moreover, I would argue that the additional security makes up for the loss in speed.

(X)ChaCha20-Poly1305 and AES-GCM are not designed
for long-term storage, whereas Encrypt ➜ MAC is.


3

This is another reason why Encrypt ➜ MAC implementations like
(X)ChaCha20-BLAKE2b should be included in cryptographic libraries.

If they were, then you could call it like any other AEAD.

For instance, I made a ChaCha20-BLAKE2b library to allow me to do this.


So When Should You Use An AEAD?

Exceptions to my Encrypt ➜ MAC recommendation include when:


1

Maximum performance is necessary when using public-key cryptography

For example, in online scenarios that don’t involve passwords
and storing data long-term, such as TLS 1.3 and WireGuard.

This is what AEADs are designed for.


2

You’re not comfortable implementing Encrypt ➜ MAC

If there’s no decent library you can use (Tink isn’t available in your language)
or copy code from (make sure you respect the code license!),
then you’re more likely to implement an AEAD correctly.

However, implementing the fix I recommend for partitioning oracle attacks
(please see point 9 of the Notes), which affect online password-based encryption
scenarios, requires knowing how to use a MAC, so at that point, you may as well
use Encrypt ➜ MAC, especially if you’re storing data long-term.

The lack of key commitment could also theoretically lead to data loss even when
partitioning oracle attacks aren’t a threat (in offline scenarios), meaning you should
also implement the fix if that concerns you, like Amazon has for their Encryption SDK.

With enough research and attention to detail, Encrypt ➜ MAC can be implemented correctly by anyone.



