Password

The Password, “class in Dallas”!

It's the security topic par excellence, because you have as many logins as you have accounts on websites: bank accounts, social security, taxes, commercial sites, etc. You can easily exceed 100 accounts without even trying! To authenticate yourself to these accounts, you need to enter a unique identifier (a number or email address associated with you) together with a knowledge factor (what you know), a possession factor (what you have), or an inherent factor (what you are). The most common factor in use is still knowledge, which most often takes the form of a password. This password must therefore be robust: first when you create it, by respecting precise characteristics that we'll detail a little further on (this part is your responsibility), but also when it's stored on the server of the site you log in to (and here it's the host site's responsibility). Since you need to understand how the system works before applying it, here goes:

What happens when I want to authenticate myself on a site?

1 - Server side:

Obligations.

As the login/password pair is the key to accessing your account, the site on which you are trying to authenticate yourself compares the data you have entered with the data stored in a dedicated database: your entry matches the server's information => access granted.

It's immediately obvious that the security of this database is an important factor. So important, in fact, that Article 121 of the French Data Protection Act gives it the force of law: "The controller, taking into account the nature of the data and the risks posed by the processing, will take all necessary precautions to maintain the security of the data and, in particular, to prevent it from being distorted, damaged or being accessed by unauthorized third parties".

The means to be implemented are specified: “Passwords must never be stored in clear text. When authentication takes place on a remote server, and in other cases if technically feasible, the password must be transformed using a secure, non-reversible cryptographic function, including the use of a salt or a key. Specialized functions exist today to meet this need, such as Scrypt or Argon2, cited by ANSSI”.

Be aware that if a site administrator fails to take the necessary precautions, he or she is liable to heavy penalties: “The CNIL may inspect any data controller, either on the basis of a complaint received or on its own initiative, whether remotely, online, on the basis of documents or on the premises of the organization concerned. In the event of serious breaches of security principles, it can then mobilize its entire repressive chain and impose sanctions of up to 4% of worldwide sales or €20 million”.

Before going any further, it is instructive to take a closer look at the best practices put in place by our IT friends to comply with these directives. Let's hope that they are well up to date with the state of the art (the large number of successful “hacks” every year may raise questions...). The CNIL also mentions “specialized functions such as Scrypt or Argon2”, which is quite mysterious. What are we talking about?

Technical resources.

We have just seen that it is forbidden in France to store users' passwords on a server in clear text, so there are 2 possible solutions for complying with this principle: encryption or hashing.

Encryption.

Encryption is the action of transforming a message or data using cryptography (from the ancient Greek “kruptos” or hidden and “graphein” or to write). It's nothing new, and certainly not new to computing. There has always been a need to protect messages from curiosity, and the Caesar cipher (named after Julius Caesar) is often mentioned as one of the first implementations. To make his messages incomprehensible, he shifted the letters by 3 positions (e.g. D instead of A). This shift represents the encryption key for this method. As a reminder, there are 2 types of encryption key: symmetric (the same key is used to encrypt and decrypt) and asymmetric (a pair of keys is used, the public one to encrypt and the private one to decrypt).
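The 3-position shift described above can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and the sketch handles the 26 unaccented letters only, leaving every other character untouched.

```python
import string

ALPHABET = string.ascii_uppercase  # the 26 letters A-Z


def caesar_shift(text: str, key: int) -> str:
    """Shift each letter A-Z by `key` positions; other characters are kept as-is."""
    shifted = []
    for char in text.upper():
        if char in ALPHABET:
            index = (ALPHABET.index(char) + key) % len(ALPHABET)
            shifted.append(ALPHABET[index])
        else:
            shifted.append(char)
    return "".join(shifted)


def caesar_encrypt(plaintext: str, key: int = 3) -> str:
    return caesar_shift(plaintext, key)


def caesar_decrypt(ciphertext: str, key: int = 3) -> str:
    # Decryption is the same shift applied in the opposite direction.
    return caesar_shift(ciphertext, -key)


ciphertext = caesar_encrypt("ATTACK AT DAWN")
print(ciphertext)                  # DWWDFN DW GDZQ
print(caesar_decrypt(ciphertext))  # ATTACK AT DAWN
```

Anyone who knows the key (3) recovers the original message, which is exactly the reversibility that makes encryption a poor fit for password storage.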

Without going into detail again, the very principle of encryption is to be reversible: you absolutely must be able to recover the original data in clear text by applying the decryption key. So, however secure this technique may be, it is not suitable for storing passwords.

Hashing.

Hashing is the application of a function so that passwords are stored in the database not in clear text, but only as an imprint (the hash). The direct consequence is that, even with the hashed password, the attacker will not be able to authenticate in your place. As the functions used for this hash are non-reversible, they meet the need => validated.

The hash function transforms a piece of data of any size into a piece of data of fixed size. This size will of course vary according to the function chosen. ANSSI recommends the use of the Scrypt or Argon2 hash functions, as they are slower to compute and more resource-intensive, and therefore more complicated to “break”. For the record, Argon2 is the winner of the Password Hashing Competition. I won't go into the details, but you can take a look on the net if you're interested. The important thing to remember is that good practice (and common sense) dictates not to use obsolete algorithms, such as the MD5 and SHA-1 hash functions. And if an administrator uses them anyway? Well, it's as if the passwords weren't hashed! The following 2 strings are the MD5 hash results of 2 of the world's most frequently used passwords. Type them into a search engine and you'll see...

21232f297a57a5a743894a0e4a801fc3
e10adc3949ba59abbe56e057f20f883e

Figure: an MD5 hash looked up directly on the Internet
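To see why an unsalted MD5 hash offers so little protection, here is a minimal sketch using Python's standard hashlib module. The short word list is an assumption chosen for the illustration; real attackers use lists of millions of entries. The two MD5 strings quoted above turn out to be the digests of "admin" and "123456".

```python
import hashlib


def md5_hex(password: str) -> str:
    """Return the MD5 digest of a password as a hexadecimal string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hypothetical mini word list; a real one would contain millions of entries.
COMMON_PASSWORDS = ["123456", "password", "admin", "qwerty"]

# Precompute digest -> password once, then every lookup is instantaneous.
LOOKUP = {md5_hex(word): word for word in COMMON_PASSWORDS}


def reverse_lookup(digest: str):
    """Return the password matching a digest, or None if it is not in the list."""
    return LOOKUP.get(digest.lower())


print(reverse_lookup("21232f297a57a5a743894a0e4a801fc3"))  # admin
print(reverse_lookup("e10adc3949ba59abbe56e057f20f883e"))  # 123456
```

The function itself is never inverted: the attacker simply hashes likely passwords in advance and compares.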

To guard against hashes being converted back into passwords by a simple web search, as in the above cases, the administrator can reinforce the security of the hash by adding a salt, which is unique for each user. The salt is a random string of characters added to the password before it is hashed. This makes the hash unique even if 2 users use the same password, and prevents the hash from being found on the Internet (see above) or in a Rainbow Table (see password attack techniques below). Another way of adding spice to the password before hashing is to use a pepper. Unlike the salt, this string of characters added to the password is the same for all users, but it is not stored in the database; it lives, for example, in the application's sources. Finally, the hash can be iterated several times to make each cracking attempt more costly.
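As a rough sketch of salting and peppering, the example below uses the scrypt function shipped in Python's standard hashlib module (available when Python is built against a recent OpenSSL). The pepper value, the function names and the cost parameters (n, r, p) are assumptions made for the illustration, not recommendations taken from the CNIL or ANSSI texts.

```python
import hashlib
import hmac
import os

# Assumption: the pepper lives in the application's configuration, never in the database.
PEPPER = b"hypothetical-application-wide-pepper"


def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest). A fresh random salt is drawn for each new user."""
    if salt is None:
        salt = os.urandom(16)
    peppered = password.encode("utf-8") + PEPPER
    # scrypt is deliberately slow and memory-hungry; these parameters are illustrative.
    digest = hashlib.scrypt(peppered, salt=salt, n=2**14, r=8, p=1, dklen=32)
    return salt, digest


def verify_password(password: str, salt: bytes, expected_digest: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt_a, digest_a = hash_password("Monchien@7ans")
salt_b, digest_b = hash_password("Monchien@7ans")
print(digest_a != digest_b)                               # True: same password, different salts
print(verify_password("Monchien@7ans", salt_a, digest_a))  # True
print(verify_password("wrong guess", salt_a, digest_a))    # False
```

The database only needs to store the salt and the digest for each user; the clear-text password is never kept.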

To sum up: administrators wishing to protect their users' passwords have the means (and the obligation) to do their job properly: password hashing with one of the recommended algorithms, refined to varying degrees with salting, peppering and iteration.

Now let's look at best practice on the user side, since this is where your responsibility lies.

2 - The user's side:

Now that we've covered the responsibilities of the sites, let's take a look at ours, as users, and what “security” means when it comes to passwords. The CNIL's full recommendations on this subject (again) make useful reading: https://www.cnil.fr/fr/securite-authentifier-les-utilisateurs. To sum up, the strength of a password depends on its length and its entropy, which is measured in bits and serves as the yardstick for your security. 128 bits is recommended, which corresponds to the key size of the standard AES encryption algorithm.

Necessary password “strength” depending on the conditions of use (source: ANSSI):

  • 13 bits in the case of user-held hardware with blocking after 3 failures: 4-digit PIN code (each digit between 0 and 9) if accompanied by a restriction.
  • 50 bits in cases where additional measures have been implemented, such as access timeout on failure: “classic” 8-character password chosen from 62 characters (0-9, A-Z, a-z).
  • 80 bits in other cases: password of at least 12 characters chosen from 90 characters (alphanumeric, upper and lower case, extended special characters).
  • ANSSI now recommends > 100 bits, corresponding to 20 characters.

Figure: password strength

Despite what we've read here and there, the length of the password has more influence on its “strength” than the diversity of its characters, because the number of possible passwords is N^L (N being the number of possible characters and L the length), which gives an entropy of L × log2(N) bits. In the case of a password used to connect to a sensitive account, a complex password of at least 20 characters is required, while 12 may suffice for a less sensitive account. The table below illustrates how long it takes to crack a password using first-rate hardware, depending on the size of the password.

Table: time needed to crack a password
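The formula above (N possible characters, length L) can be converted into bits with a base-2 logarithm. The short sketch below recomputes the ANSSI thresholds listed earlier; the function name is hypothetical, and the result only holds for passwords drawn truly at random.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a random password: log2(N ** L) = L * log2(N)."""
    return length * math.log2(alphabet_size)


print(f"{entropy_bits(10, 4):.1f}")   # 13.3  -> 4-digit PIN
print(f"{entropy_bits(62, 8):.1f}")   # 47.6  -> 8 characters out of 62
print(f"{entropy_bits(90, 12):.1f}")  # 77.9  -> 12 characters out of 90
print(f"{entropy_bits(90, 20):.1f}")  # 129.8 -> 20 characters out of 90

# Length beats diversity: 12 characters from 62 outweigh 8 characters from 90.
print(entropy_bits(62, 12) > entropy_bits(90, 8))  # True
```

Each extra character adds log2(N) bits, whereas enlarging the character set from 62 to 90 adds only about half a bit per character.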

Once we've talked about size (and it's clear that this is important, whatever some may say), we need to talk about predictability, i.e. the method used to create the password. A password's security is primarily ensured by its randomness, while its predictability greatly weakens it: it's easy to understand that the password “Monchien@7ans” will be easier to find than “G#jz5La7FE@3y”, even though they're the same size and both use alphanumeric, uppercase, lowercase and special characters... The time it takes to crack a “weak” password shows that in this case, size no longer has the same impact - it no longer protects.

Table: time needed to crack a weak password

To understand why, let's take a look at the methods used to “crack” passwords. There are others, of course, but here are the 4 most common:

  • Brute-force attack.
  • Use of dictionaries.
  • Use of data leakage.
  • Social engineering.

Social engineering.

This is a method based on manipulation. The cyber-criminal plays on human psychology to induce his target to share confidential information (telephone calls, phishing e-mails, baiting, feelings of urgency, imminent danger or advantage). The results, while by their very nature highly variable depending on the personality of the victims, are generally very good for the attacker and hard to counter.
=> The solution: inform all users of the risks involved, although there are no guarantees since the target is “consenting”.

Data leakage.

Do you know what links the following companies (non-exhaustive list): YAHOO - ALIBABA - FACEBOOK - DROPBOX - TWITTER - DEEZER - CANVA - DAILYMOTION - AUDI - ADECCO - DOMINO'S PIZZA - FORBES - LEDGER - LINKEDIN - NVIDIA - SEPHORA - SNAPCHAT - ZOOM? As well as being household names, they belong to the very long list of thousands of companies that have been targeted by hackers, who have recovered terabytes of confidential data. The Have I Been Pwned website (https://haveibeenpwned.com/PwnedWebsites) lists the most important ones. Of course, if your password is part of the leaked data, it can be as strong as you like: in practice it will be no more difficult for a hacker to find than 123456. Worse than predictable, it is now known and listed in a dictionary, as confirmed by the 8.4 billion passwords from a compilation of leaks posted on a hacker forum.
=> Prevention: keep abreast of leaks (Have I Been Pwned website), don't use the same password for several accounts, use multi-factor authentication (MFA for Multi Factor Authentication or 2FA for Two Factor Authentication).
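Keeping abreast of leaks can be partly automated. The sketch below assumes the Pwned Passwords "range" endpoint published by Have I Been Pwned, which takes only the first 5 characters of the password's SHA-1 digest and returns the matching suffixes with their leak counts. The function names and the User-Agent string are hypothetical, and the network call is shown but not executed here.

```python
import hashlib
import urllib.request

RANGE_URL = "https://api.pwnedpasswords.com/range/"


def sha1_prefix_and_suffix(password: str):
    """Split the upper-case SHA-1 digest into the 5-character prefix and the rest."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range_body(suffix: str, body: str) -> int:
    """Parse 'SUFFIX:COUNT' lines and return the count for our suffix (0 if absent)."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix:
            return int(count)
    return 0


def leak_count(password: str) -> int:
    """Ask the service how many times this password appears in known leaks."""
    prefix, suffix = sha1_prefix_and_suffix(password)
    request = urllib.request.Request(
        RANGE_URL + prefix, headers={"User-Agent": "password-check-sketch"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_range_body(suffix, body)


# Example (requires Internet access):
# print(leak_count("123456"))
```

Only the 5-character prefix leaves your machine, so neither the password nor its full digest is disclosed to the service.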

Dictionaries and Rainbow Tables.

Dictionaries are databases that compile words, names and common expressions together with their variations, which the attacker tests one after the other. The effectiveness of this attack is due to the fact that many people use passwords that they can remember, and which are therefore common and potentially present in these dictionaries (first names, ages, dates of birth, places, etc.). Rainbow Tables are precomputed tables that link hash values back to the passwords that produce them. They are used in much the same way as a dictionary.
=> Prevention: strong, randomly-created passwords using a tried-and-tested algorithm; use multi-factor authentication.

The brute-force attack.

This involves testing all possible password combinations, one after the other. The result depends largely on the means used by the attacker, and the strength of the password.
=> Prevention: strong passwords, account blocking after a limited number of failed authentications, multi-factor authentication.
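A brute-force attack can be sketched on the smallest case mentioned earlier, a 4-digit PIN. The example assumes an offline scenario in which the attacker holds an unsalted SHA-256 digest of the PIN; the function name and that scenario are assumptions made for the illustration.

```python
import hashlib
import itertools
import string


def brute_force_pin(target_digest: str, length: int = 4):
    """Try every digit combination in order; return (pin, attempts) or None."""
    combinations = itertools.product(string.digits, repeat=length)
    for attempt_number, digits in enumerate(combinations, start=1):
        candidate = "".join(digits)
        if hashlib.sha256(candidate.encode("utf-8")).hexdigest() == target_digest:
            return candidate, attempt_number
    return None


# Assumption: the attacker has obtained this digest from a leaked database.
stolen_digest = hashlib.sha256(b"7391").hexdigest()
print(brute_force_pin(stolen_digest))  # ('7391', 7392)
```

With only 10,000 possibilities the search finishes almost instantly, which is why a PIN is only acceptable when the number of attempts is strictly limited.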

Summary:

Type of attack => Countermeasure
Brute-force attack => [Strong authentication / MFA]
Dictionaries => [Strong authentication / MFA]
Data leakage => [MFA]
Social engineering => [Critical thinking]

The 2 previous tables provided by Hive Security, which show the time needed to find a password according to how it was built, illustrate the notion of strong authentication, of which the strong password is the practical implementation. Although essential, it can be compromised by certain types of attack. In those cases, you can put on “belt and suspenders” and remain secure whatever happens by deploying multi-factor authentication in addition to strong authentication. Strong authentication alone remains sufficient against attacks that involve neither social engineering (i.e., when you give an attacker the information he asks for) nor a data leak.

If you think that implementing these measures is a little paranoid or not worth the effort, consider some figures on the consequences of password hacking: 210,000 online identity thefts and €1.2 billion in bank fraud every year in France.

Multi-factor authentication.

We've just seen that multi-factor authentication is a big “plus” for password security, because it requires an additional factor on top of the password. This can be a possession factor (such as a hardware token, mobile authentication, a push notification, or a temporary code received by SMS/email or in a dedicated application) or an inherent factor (fingerprint, facial or retinal recognition). The factors used (the 1st is the password = what I know) must not belong to the same category (the 2nd therefore cannot be another knowledge factor, such as a security question). As you can see, 2FA is just one form of MFA. A good example is a cash withdrawal from a bank machine: you use the bank card (what you have) to identify yourself, and enter the PIN (what you know) to validate.

This feature is increasingly available on more sensitive sites, so it's a must. The implementation of multi-factor authentication has, moreover, been imposed by a European directive on payment services (PSD2, known as DSP2 in French) since September 2019, and for payments over €30 since mid-2021.

The sites that offer this essential feature rarely allow you to choose between different factors, instead defaulting to the solution they feel is most suitable for the greatest number of their users. But you guessed it, they're not all the same, and ease of use doesn't always rhyme with security...

1 - OTP technology.

In simple terms, this uses the principle of symmetric authentication based on a one-time code. The code is produced on the user's side, then compared with the code expected by the site to authorize or deny access.

MFA by SMS (Short Message Service) or email. You log in with your login/password pair, and the server sends you a one-time code (OTP for One Time Password), sometimes of limited duration (TOTP for Time-based One Time Password), which you must enter on the site to confirm that it's really you. This method should be avoided, as the transmission channel used (SMS or email) is not secure. It has also been discouraged by the National Institute of Standards and Technology (NIST) since 2016, as it does not prove possession of a specific device.
=> Weakness: the SMS can be captured by an attacker who has successfully carried out a SIM swapping attack. The attacker gains control of the victim's telephone number by impersonating the victim to the telephone operator, then asking the operator to transfer the number to a SIM card in the attacker's possession (on the pretext of a lost smartphone, for example). Email can also be easily compromised (emails are not encrypted).

MFA by authentication application. A secret key is shared between the site on which you identify yourself and an application on your smartphone. This usually takes the form of a QR code or a string of characters (the key) displayed on the site, which is then stored in the smartphone's authentication app. A hashing algorithm then combines this key with the current time to generate a time-based code (TOTP) at regular intervals, for example every 30s, and truncates the result to obtain a 6-digit code. The code generated by the smartphone app is entered on the website. The site's server compares this code with the one it generates using the same algorithm. If the codes are identical, access is granted.
=> Disadvantage: Can be perceived as complex by the user, can prevent access to the site if the smartphone is lost or if the application is not synchronized via a cloud. Is it really a 2nd factor if it's present on the smartphone used to log in? The default algorithm used (SHA-1) is obsolete. The presence of the “shared secret” on a database poses a problem in the event of a data leak.
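The time-based code described for authentication applications can be sketched with the standard library alone. This sketch follows the published TOTP scheme (HMAC-SHA-1 over a 30-second time counter, then truncation to 6 digits); the function names and the secret are assumptions for the illustration, and a real deployment would rely on a maintained library.

```python
import hashlib
import hmac
import struct
import time


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC the counter with the shared secret, then truncate to a short decimal code."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # the last 4 bits select where to read 4 bytes
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)


def totp(secret: bytes, timestamp: float = None, step: int = 30, digits: int = 6) -> str:
    """The counter is the number of `step`-second intervals elapsed since the Unix epoch."""
    if timestamp is None:
        timestamp = time.time()
    return hotp(secret, int(timestamp) // step, digits)


def server_accepts(secret: bytes, submitted_code: str, timestamp: float = None) -> bool:
    """The server recomputes the code from the same secret and compares."""
    return hmac.compare_digest(totp(secret, timestamp), submitted_code)


shared_secret = b"12345678901234567890"  # hypothetical secret, normally conveyed by the QR code
print(totp(shared_secret, timestamp=59))                      # 287082
print(server_accepts(shared_secret, "287082", timestamp=59))  # True
```

Both sides need only the shared secret and a roughly synchronized clock, which is also why the secret stored on the server becomes a target in a data leak.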

2 - Public key infrastructure technology.

It uses the principle of asymmetric authentication with a public/private key pair.

MFA by push notification. A pair of cryptographic keys is generated when you enrol in the server's authentication feature. The private key is stored on the smartphone and the public key is sent to the server. When an authentication request is made, a challenge-response exchange takes place between the server and the smartphone. A successful challenge validates authentication. This additional factor is easy to use because, although it requires the installation of an application, that application is not dedicated to authentication. Nor is there any code to copy or remember.
=> Disadvantage: requires access to the application and a clear validation path. Attacks that flood the target with notifications (after credential stuffing, for example) can lead them to approve one out of weariness. The 2nd factor depends on the smartphone on which authentication is launched, which is problematic if it is lost.

MFA by FIDO U2F (Fast Identity Online and Universal 2nd Factor). A cryptographic key pair is generated by the hardware token (which can take the form of a key such as a YubiKey, or a card), and the private key stays secured on the hardware device. The public key is transmitted to the site's server when enrolling for this feature. When you connect to the site, the server sends a “challenge” to the token. The token returns a digitally signed response which, if the signature is valid, authorizes access to the server.
=> Disadvantage: cost and need for a back-up device, otherwise in the event of loss, access to servers on which this functionality is activated will be impossible.

Complementary: adaptive MFA with contextual information. Authentication to a server will require an additional factor only after:

  • A certain number of failed connection attempts.
  • A different ip address.
  • A new geolocation.
  • A new device or OS.
  • etc.
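The contextual triggers listed above can be expressed as a simple decision function. This is a hypothetical sketch: the class names, the threshold of 3 failed attempts and the sample values are all assumptions, and real adaptive MFA products use richer risk scoring.

```python
from dataclasses import dataclass

MAX_FAILED_ATTEMPTS = 3  # hypothetical threshold


@dataclass
class KnownProfile:
    """What the server has already seen for this account."""
    ip_addresses: set
    countries: set
    devices: set


@dataclass
class LoginAttempt:
    failed_attempts: int
    ip_address: str
    country: str
    device: str


def second_factor_reasons(attempt: LoginAttempt, profile: KnownProfile) -> list:
    """Return the reasons for requiring a second factor (empty list = none needed)."""
    reasons = []
    if attempt.failed_attempts >= MAX_FAILED_ATTEMPTS:
        reasons.append("too many failed attempts")
    if attempt.ip_address not in profile.ip_addresses:
        reasons.append("new IP address")
    if attempt.country not in profile.countries:
        reasons.append("new geolocation")
    if attempt.device not in profile.devices:
        reasons.append("new device or OS")
    return reasons


profile = KnownProfile({"203.0.113.7"}, {"FR"}, {"laptop-linux"})
usual = LoginAttempt(0, "203.0.113.7", "FR", "laptop-linux")
unusual = LoginAttempt(0, "198.51.100.23", "FR", "phone-android")
print(second_factor_reasons(usual, profile))    # []
print(second_factor_reasons(unusual, profile))  # ['new IP address', 'new device or OS']
```

The user is only bothered with a second factor when something about the login looks out of the ordinary.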

Taking a look at the various 2nd factors mentioned, we can see that they're all primarily based on the possession factor. But what about the inherent factor mentioned above? Please note that we're not talking about using the inherent factor to prove that you're the owner of your terminal (in that case, we're talking about the possession factor). Rather, it's a matter of comparing the recording made locally on your smartphone with the one present on a remote server. The inherent factor, although robust in terms of security, leads to other problems linked to the storage of the very specific data that are our biometric prints (fingerprints, facial or retinal prints). It is therefore not used in consumer applications, and is reserved for very specific sectors.

MFA authentication security:

Factor => Robustness
OTP by SMS/email => Obsolete
Push notification => Good level
TOTP => Good level
FIDO U2F => Top

Once you've digested all this information, you might be thinking: that's all well and good, but password security isn't for me, it's too complicated! It's impossible to remember a long, complex password for each of my sites! And on top of that, I have to add multi-factor authentication!

Presented like that, it doesn't sound very appealing, and yet we're going to see that it's easy to implement. Read on.