Cryptography

Cryptography is magic!

Cryptography is THE pillar of computer security. It relies on algorithms that mathematically transform your data to ensure authenticity (certainty as to the identity of the data sender), integrity (certainty that the original data is intact) and confidentiality (certainty that no-one other than the recipient will be able to read it).

Algorithms.

Numerical system.

An algorithm is a sequence of instructions or operations designed to solve a problem. The problem we're talking about in cryptography involves factoring very large integers (asymmetrical cryptography), calculations on elliptic curves and substitution-permutation operations (symmetrical cryptography). These calculations are carried out by computers, which operate with base-2 numeration systems, in which data is broken down into the smallest existing unit called a bit. This bit can only take 2 values, either 0 or 1. This technology is capable of calculating cryptographic keys, and therefore of encryption. However, it is not capable of doing the opposite (breaking an encryption) in an acceptable time.

Quantum system.

Quantum system

Now imagine that the smallest unit decomposing the data can take not just 1 of 2 values, but 2 of an infinite number at the same time!

This is not possible with digital technology, but it is with quantum technology, in which the smallest unit is called a quantum bit or qubit. When digital technology is forced to perform sequential calculations, quantum technology enables them to be parallelized. Intuitively, this increases computing power tenfold. The consequence is that the calculations needed to "break" current algorithms based on factoring very large numbers or dicrete logarithms (asymmetric cryptography) become a reality.

It's true that computers based on this technology won't be on the market tomorrow, and it's even estimated that the 1st quantum computer won't be on the market for at least 10 years... However, Apple (in iMessage), Signal, Tuta Mail have already introduced post-quantum algorithms into their encryption.

But why?

  • Firstly, just because the first quantum computers are announced in 10 years' time doesn't mean they won't be in operation before then: history teaches us that the announcement of the “breaking” of an encryption is rarely public (see Enigma during WW2).
  • Secondly, because an attack technique called “harvest now, decrypt later” is already in place: its principle is to collect encrypted data today, based on long-lasting cryptographic keys, in order to decrypt it later, when the necessary technology is available.

Data confidentiality: ENCRYPTION.

Objective: To ensure that only the recipient of a message can read it. It is transformed in such a way that no one will be able to understand it without being in possession of the key that opens access to it. Encryption can be symmetrical (a secret key shared between users enables encryption and decryption) or asymmetrical (a public/private key pair is used to encrypt/decrypt).

Symmetric encryption (also known as secret-key).

Principle.

The sender uses a secret key (a sequence of 128 or 256 bits) to transform the data he wishes to protect into encrypted data, then sends the encrypted message to the recipient via the channel of his choice. The recipient receives the encrypted message, and then uses the same secret key as the sender to decrypt it and thus learn about the original message.

Symetric encryption
Par MarcT0K (icônes par JGraph) CC BY-SA 4.0

Example of the result of encrypting the word “Hello” using the AES (Advanced Encryption Standard) algorithm and a 256-bit random key:

---------- ciphertext ----------
9jua5EL34mZTva/NuNirTA==
------- end of ciphertext ------

The disadvantages of symmetric encryption:

  • You need to be able to exchange the secret key over a secure channel.
  • There are as many key pairs to manage as there are users.

How it works.

Algorithms encrypt either by blocks (AES, Serpent, Twofish) or by streams, also known as bits (ChaCha20), the latter being faster than the former when used without specific hardware support.
In the case of block ciphers, the data is cut into 128-bit blocks and then encrypted with the key (generally between 128 and 256 bits in size).
The function used by the algorithm is iterated a certain number of times (14 in the case of AES-256): each iteration (called a round) is the encryption of the previous result by a mathematical function applied with a new mathematical key.
This is what makes encryption robust (a related key attack on AES-128 with only 8 rounds was published in 2011).

Asymmetric encryption (also known as public-key).

Principle.

A user generates a key pair and broadly distributes the public key (key server). Anyone wishing to send him a message can do so by encrypting it with the user's public key. On receipt of the encrypted message, the user will use his or her private key to decrypt it, so that only he or she can read the message. The advantage of this solution is that there's no need to exchange keys securely, as the public key is only used for encryption and is therefore available to everyone.

Asymetric encryption
Par MarcT0K (icônes par JGraph) CC BY-SA 4.0

Example of the result of encrypting the word “Bonjour” using the RSA (Rivest Shamir Adleman) algorithm and a 4096-bit random key (note that the result is much larger than with symmetrical encryption):

-----BEGIN PGP MESSAGE----- wcFMA0+nHCDs9qP5ARAAsrt0RSGo/8tLVp9xEgQVmrW7OgFgdXdMSs7B0NKITGOjPvBfYbkExFgL S7KU6QkBQO1hdTKaP5n7GwJcIrhCvDL7pLXJfRFvqLcfYEkBjSWzz2Cyt/EZq0B/XtmxiWg8JJ+V uKjSGBMRp1ksWAuFJvIyhDUOuvU2FEC1hC7jn5oSZr1Z4O2Kw8KtIR+ig8Ehscc3RzBH/GTRszLo 5xwFBdoz6cDGP98V/ezRTylZpoPfhaW91//DBtKBKWPnE+8rCdxXHyGIDxEkAG1DIvZoJtRh4HBZ v5+5l+P5WjFo6BYU8uwx2WXbS0vUrMy8Z8bD9oME+c6Yy/8RLwABy/nn7T/ejpq6MgnP57P9LkoW dD+8jrQgXQfXEhgWxON+QuXHBq5aoL/Avr9KNwNN7M+XNwXmgS4gkuHOevLNExSpnAnPAOIPKSf/ qEjm8yjFujLgt7wXOzosvHgVemwaK8OS0UnqAyoZL+6Y6uYK3VzzCG25+3TCVjcizjs1muQhvMgn PK14UO6UYSgnisVx/Q7KYyciJYrfGxcfWDXGce6xJS6ktEUTLltuqEiJUSjktl7jbHEOpYF7Pxdf sAzdsFNMZ65FHFYyIZ/Vg6glV4PaaH3QD4AWXMV+T6MM/cwzw+BhQvSbSjY12cHDf6mfQs5O2+sw U/8/KaGEOMa1tcuNAdPSRAHfJeEDCyASipk1qW6ecY3UvKxBqxxauVdzu2LCiKmAzpKc1KkpNKyx xpcHF+5IBPPNFHjFw6T6txuNH7UKc7+a1Rl8 =7arP
-----END PGP MESSAGE-----

The disadvantages of asymmetric encryption:

  • You need to be able to verify the authenticity of the other party's public key.
  • It is slower and more resource-intensive than symmetric encryption.

How it works.

The algorithms use the mathematical properties of prime numbers (RSA), elliptic curves over finite fields (ECC), or dicrete logarithm problems (Diffie-Hellman key exchange). Let's not go into too much detail, but a little reminder for our general knowledge might not be useless... By the way, what's a prime number again? (it's a long way from 5th grade...): it's an integer that has only 2 positive divisors, 1 and itself. 1 isn't an integer because it's only divisible by 1 (i.e. by 1 integer), 4 isn't an integer because it's divisible by 2, 1 and itself (i.e. by 3 integers), but 2 is an integer because it's only divisible by 2 and 1.
These prime numbers have many characteristics, the simplest of which is: any integer that is not prime is the product of at least 2 primes whose decomposition is unique. The other characteristics will not be studied here, but call on the notions of Euclidean division, Euler's indicator, congruence, Bézout's theorem, Greatest Common Divisor, Fermat's theorem (in short, a lot of math).
Let's just remember that:

  • The public key used for encryption is generated from the private key using a mathematical function.
  • The reversibility of this function is not theoretically impossible.
  • However, solving it is beyond the reach of today's technical resources when it comes to factoring very large integers into products of prime numbers.

Solving this reversibility problem means breaking the encryption, since if we can calculate the inverse of the function that produced the public key, we obtain the private key and are thus able to decrypt the message.

Security of current encryption methods.

As a preamble: testing all the combinations of a 256-bit digital key is equivalent to testing 2256, or 1,158x1077 possibilities. This would take billions of years for a supercomputer capable of testing billions of keys per second.

  • Today, brute-force attacks have no effect on cryptographic algorithms.
  • Other attacks on symmetric encryption (apart from the related key attack mentioned above) target the hardware implementation rather than the algorithm itself (auxiliary channel attack).
  • However, the situation is not quite the same for public key cryptography, due to the calculation method used in its encryption algorithm:
    • Using the distributed calculation method (when several computers work together), a long number of 795 bits has been factored.
    • The keys therefore need to be much larger than for symmetrical encryption algorithms for a similar level of security (4096 vs. 256 bits).
    • It is vulnerable to Shor's quantum algorithm.

The greatest danger facing asymmetric encryption today, however, lies in its implementation, and is known as the Man In The Middle Attack (MITM):

  • The attacker intercepts the public keys exchanged and replaces them with his own.
  • The sender will then encrypt the message with the attacker's public key instead of the original recipient's, without knowing it.
  • The attacker can then decrypt the message with his private key and modify it. He will then re-encrypt it with the recipient's public key before sending it back to him.
  • The recipient will decrypt the message with his private key, unaware that it has been altered by the attacker.

Hybrid encryption.

It solves the problems of symmetric or asymmetric ciphers used alone:

  • Asymmetric encryption is used to encrypt a secret key with the recipient's public key and then transmitted it to the recipient.
  • Symmetric encryption is used to encrypt/decrypt with the secret key thus secured.

This solves the problem of key distribution and speeds up encryption. These encryptions are used, for example, to secure Internet communications (TLS), network connections (SSH), the encryption of certain emails (OpenPGP), etc...

Post-quantum encryption.

Although symmetrical algorithms such as AES 256 are considered resistant to quantum attacks, the same cannot be said of asymmetrical algorithms. For this reason, some encryption vendors have already (or are in the process of) implementing post-quantum algorithms in their applications.
Their solution is generally to combine traditional cryptographic algorithms (such as AES and RSA) with post-quantum algorithms such as CRYSTALS-Kyber. In 2022, NIST selected a first batch of four proposed algorithms that are likely to become the standards for post-quantum encryption: Kyber encryption for secure access to websites, and the Sphinx+, Dilithium and Falcon systems for digital signatures.

Data integrity: HASHING.

Objective: Ensure that the data you receive is not altered. Secondarily, to transform the data into a much lighter (less memory-intensive) data of fixed size. To illustrate the principle, the Wikipedia example is very telling:

Data hashing

A hash function transforms data into a unique fingerprint (also known as a condensate, hash or hash). The function used is such that it cannot be used to retrieve the original data from the hash. Its sole purpose is to compare fingerprints to check whether they are identical. As we'll see in the section on passwords, the irreversibility of the hash function is a key element in the storage of passwords on a server.

Risk: Use an algorithm that generates “collisions”, i.e. that can generate an identical fingerprint for different data. This is the case with the MD5 (Message Digest) algorithm, which is still widely (too) used to protect passwords stored on servers.

Data authenticity: DIGITAL SIGNATURE.

Objective: Ensure the identity and integrity of a document's author.

Digital signature

Asymmetrical encryption reverses the use of keys with respect to encryption. The sender hashes the message and encrypts (signs) it with his private key. This guarantees its integrity (hash) and identity (possession of the private key). The recipient decrypts it using the sender's public key.

The digital signature has legal value (officially recognized in Europe since January 2000), and provides the same guarantees as the handwritten signature, including its non-repudiation function (the impossibility of denying having sent the message). It should also be noted that the signature cannot be reused; it is part of the signed document.

Risk: Using an insufficiently strong algorithm, problems related to the distribution of the public key and compromise of the private key.

Some examples of algorithms :

Algorithm type Recommended Obsolescent
Symmetric encryption [CHACHA20 / AES-256 /Twofish / Serpent] [DES]
Asymmetric encryption [RSA / Ed25519] [RSASSA]
Post-quantum encryption [Kyber / Sphinx+ / Dilithium / Falcon]
Hash [SHA-256 / SHA-3] [SHA-1]
Password storage [PBKDF2 / ARGON2 / Scrypt] [MD5]

And you, in your apps, what algorithms do you use?

Encryption technology must be used when you surf, communicate, store identifiers :

  • If encryption is optional, it must be activated.
  • If your application does not offer encryption, you should leave it out.