Basic Encryption & Hashing in Node.js
About two years ago, I became interested in K-pop music. When I told a friend that Girls’ Generation was my favorite, he disagreed and recommended a girl idol group called Apink. I have not yet decided which is better, Girls’ Generation or Apink (and I have no intention of judging), but among Apink’s songs the one I like most is Hush. That song obviously has nothing to do with this article, except that it is “hash” with a “u” in place of the “a”. In this article I am going to examine the simple use of encryption in Node.js, a term that is sometimes used interchangeably with hashing.
Encryption or Hash? Or Cipher? Wait, is it Salt?
First things first: crypt and crypto are different terms. The Oxford English Dictionary defines crypt as an underground room beneath a church, while crypto is shorthand for cryptography, or means concealed or secret. Therefore, when we talk about cryptography, we should never say “crypt” unless we actually want to talk about a chamber.
Both encryption and hashing are types of crypto, but they play different roles and serve different purposes.
Encryption (Cipher)
Encryption is a reversible cryptographic technique, which means that you can turn encrypted data back into the original text or data. Encrypted text is called ciphertext, and the algorithm that produces it is called a cipher. “Cipher” can also be used as a verb, so encrypting can also be called ciphering. If “encrypt” were defined as “to convert a text into one that nobody can understand by means of cryptography,” hashing would also be a type of encryption. Yet it seems that “encryption” is used only when you both encrypt and decrypt, not when you mean a hash. To avoid confusion, it is probably better to say “cipher” rather than “encryption.”
The reason to encrypt a message is to prevent someone from reading it during delivery. The best-known example of encryption is Secure Sockets Layer, or SSL for short (its modern successor is called TLS). SSL lets the server and the browser communicate privately by encrypting all data exchanged between them. Imagine you found nice shoes at an online store and decided to buy them. During the purchase you typically have to submit your credit card information. To keep anyone from stealing that information, your browser encrypts it, and the online store decrypts it on its server back into the original numbers so that it can verify the payment. So far so good. Emails are usually also exchanged over an encrypted connection using TLS or SSL.
Ideally, you should not submit any information, especially your credit card details, to websites that do not use SSL, because it might be stolen and abused.
A key is used by the sender and the recipient in the process of encryption and decryption. Depending on the number of keys involved, encryption comes in two kinds: symmetric and asymmetric. Symmetric encryption uses a single key for both encrypting and decrypting, and both the server and the client use that same key. This single key is called a pre-shared key. It must be known only to the two parties, so they have to agree on it beforehand in order to exchange messages or data in secret.
In asymmetric encryption, two different keys are used: a public key, which the sender uses to encrypt, and a private key, which the recipient uses to decrypt. The public key can be given to anyone, but the private key is held only by the recipient who decrypts the ciphertext.
SSL, the example mentioned above, uses a hybrid of symmetric and asymmetric encryption. In short, the server and the client first use asymmetric encryption to agree on a temporary shared key, and once that key exists they communicate symmetrically. If you want to know more about symmetric and asymmetric encryption, please check out Symmetric vs. Asymmetric Encryption or Behind the Scenes of SSL Cryptography.
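The hybrid idea can be sketched with Node's built-in crypto module. This is a simplified illustration, not how SSL/TLS is actually implemented; the helper names encryptMessage and decryptMessage, the RSA key size, and the choice of aes-256-cbc are assumptions made for the example, and it needs a much newer Node.js than v6 because it relies on crypto.generateKeyPairSync.

```javascript
const crypto = require('crypto');

// Recipient (e.g. the server) creates an RSA key pair. The public key can be
// handed to anyone; the private key never leaves the recipient.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
});

// Sender (e.g. the browser) invents a random one-time symmetric key and
// encrypts it with the recipient's public key (the asymmetric step).
const sessionKey = crypto.randomBytes(32); // 256-bit AES key
const wrappedKey = crypto.publicEncrypt(publicKey, sessionKey);

// Only the holder of the private key can recover the session key.
const unwrappedKey = crypto.privateDecrypt(privateKey, wrappedKey);

// From here on both sides hold the same key and switch to symmetric AES.
function encryptMessage(key, plainText) {
  const iv = crypto.randomBytes(16); // fresh initialization vector per message
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const data = Buffer.concat([cipher.update(plainText, 'utf8'), cipher.final()]);
  return { iv, data };
}

function decryptMessage(key, message) {
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, message.iv);
  const plain = Buffer.concat([decipher.update(message.data), decipher.final()]);
  return plain.toString('utf8');
}

const message = encryptMessage(sessionKey, 'my credit card information');
console.log('session keys match: ' + sessionKey.equals(unwrappedKey));
console.log('decrypted: ' + decryptMessage(unwrappedKey, message));
```

The slow asymmetric step is used only once, to move a small key; the bulk of the data goes through the much faster symmetric cipher.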
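In encryption, the length of the ciphertext is not fixed. The longer your plain text, the longer the ciphertext. Given the same plain text, the same key, the same algorithm and, where the mode uses one, the same initialization vector, the ciphertext will always be the same. What makes encryption reversible is the key: whoever holds it can turn the ciphertext back into the plain text. Common encryption algorithms are AES (symmetric) and RSA (asymmetric). PGP, which is often mentioned alongside them, is an encryption program built on such algorithms rather than an algorithm itself.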
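A small experiment makes the length and repeatability properties visible. The sketch below assumes AES-192 in CBC mode with a fixed key and a fixed initialization vector (IV); the constant key and IV and the helper name encryptHex are placeholders for demonstration only, and real code should use a secret key and a random IV.

```javascript
const crypto = require('crypto');

// Hypothetical fixed key and IV, for demonstration only.
const key = Buffer.alloc(24, 1); // aes-192 needs a 24-byte key
const iv = Buffer.alloc(16, 2);  // CBC mode needs a 16-byte IV

function encryptHex(plainText, key, iv) {
  const cipher = crypto.createCipheriv('aes-192-cbc', key, iv);
  return cipher.update(plainText, 'utf8', 'hex') + cipher.final('hex');
}

const shortCipher = encryptHex('hi', key, iv);
const longCipher = encryptHex('a much longer piece of plain text', key, iv);

// Ciphertext grows with the plain text, in steps of one 16-byte block.
console.log(shortCipher.length, longCipher.length); // 32 96

// Same plain text, key, algorithm and IV: same ciphertext.
console.log(encryptHex('hi', key, iv) === shortCipher); // true

// Changing only the IV changes the ciphertext.
console.log(encryptHex('hi', key, Buffer.alloc(16, 3)) === shortCipher); // false
```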
Hash
Unlike encryption or ciphering, hashing is irreversible. This means that once you’ve hashed a text, it is very hard, almost impossible, to turn it back into the original. In short, it’s a one-way process. Two hashed texts will be completely different even when the input texts are very similar, even if they differ by a single letter.
The term “hash” is a bit more confusing than “encryption,” because even if you open the Oxford English Dictionary you can’t find a definition that matches the meaning used in computer science. In practice, we use “hash” as both a noun and a verb, and we say “hashing” when we mean the process.
Hashing is widely used for storing passwords in a database. Service providers hash your password in case someone hacks into the database where your password and other information are stored and steals your log-in information. Since hashing is a one-way process, even the service provider doesn’t know your password. Only you know it, unless you tell somebody else. That is why, when you lose your password, they have to issue a new one. It’s not that they don’t want to tell you (or someone pretending to be you) your password; it’s that they can’t.
Just as encryption uses a key to generate a ciphertext, a hash can use a salt. A salt is a random string of letters and numbers that is added to the input text before hashing. Unlike a key, a salt does not have to be secret. Usually a new salt is generated at random for each input and then combined with that input during hashing. So why do we need a salt?
Imagine that some passwords are very commonly used, like birthdays. For instance, I was born on June 3rd 1990, so I sign up for a service with the password “19900603.” Meanwhile, hackers who hold a list of hashed birthday passwords break into the database of that service and look for a hashed password that matches one on their list. Once they find mine, they can log in to my account. Ouch. To prevent that from happening, you need a salt.
A salt also helps when one user signs up for multiple services with the same password. If you use the same password for, say, Facebook and Twitter without a salt, someone who managed to hack into both databases would at least know that you use the same password in both places, which is likely to lead to a security vulnerability.
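The effect of a salt can be sketched with the built-in crypto module alone. The function names unsaltedHash, saltedHash and verify are made up for this illustration, and the use of SHA-256 and scrypt here is an assumption for the demo rather than what any particular service does; crypto.scryptSync also requires a newer Node.js than v6.

```javascript
const crypto = require('crypto');

// Unsalted: the same password always produces the same hash, so a
// precomputed list of common-password hashes can be matched against it.
function unsaltedHash(password) {
  return crypto.createHash('sha256').update(password).digest('hex');
}

// Salted: a random salt is generated per password and stored next to the hash.
function saltedHash(password, salt = crypto.randomBytes(16).toString('hex')) {
  const hash = crypto.scryptSync(password, salt, 32).toString('hex');
  return { salt, hash };
}

// Verification re-hashes the candidate password with the stored salt.
function verify(password, stored) {
  const candidate = saltedHash(password, stored.salt).hash;
  return crypto.timingSafeEqual(
    Buffer.from(candidate, 'hex'),
    Buffer.from(stored.hash, 'hex')
  );
}

// The same password stored by two different services.
const recordA = saltedHash('19900603');
const recordB = saltedHash('19900603');

console.log(unsaltedHash('19900603') === unsaltedHash('19900603')); // true
console.log(recordA.hash === recordB.hash); // false, because the salts differ
console.log(verify('19900603', recordA));   // true
console.log(verify('19900604', recordA));   // false
```

Because each record carries its own salt, a precomputed list of birthday hashes no longer matches, and two records with the same password no longer look alike.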
The result of hashing, which is called a hash or sometimes a message digest, has a fixed length for a given algorithm, no matter how long the input is. The output length depends only on which algorithm is used. Common hash algorithms are the SHA family and MD5, although MD5 is no longer considered safe for security purposes.
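The fixed output length is easy to check. The sketch below hashes a one-character input and a 10,000-character input with a few algorithms and prints the digest lengths; the helper name digestHex is made up for the example.

```javascript
const crypto = require('crypto');

function digestHex(algorithm, text) {
  return crypto.createHash(algorithm).update(text).digest('hex');
}

for (const algorithm of ['md5', 'sha1', 'sha256', 'sha512']) {
  const shortDigest = digestHex(algorithm, 'a');
  const longDigest = digestHex(algorithm, 'a'.repeat(10000));
  // Both lengths are the same for a given algorithm:
  // md5 32, sha1 40, sha256 64, sha512 128 hex characters.
  console.log(algorithm + ': ' + shortDigest.length + ' / ' + longDigest.length);
}
```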
Prerequisite
Before proceeding, you must have Node.js and NPM installed on your computer. As of this writing, I use Node.js v6.11.0 and NPM v3.10.10. I haven’t fully checked whether the code below is compatible with earlier versions of Node.js, but I think it is. If it isn’t, please work out which function has changed.
Also, you may want to run the following commands.
$ mkdir crypto && cd crypto
$ touch crypto.js bcrypt.js
$ yarn add bcrypt
# Or
$ npm install bcrypt
Installing the bcrypt module may require you to install Xcode beforehand. For installation instructions, please refer to bcrypt’s GitHub repository.
Encryption & Hash with Crypto
crypto is a built-in cryptography API for Node.js, and I’m going to use it for testing. First, please add the following line to crypto.js.
const crypto = require('crypto');
Please keep adding the following code to crypto.js.
Encryption
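// createCipher derives the key from the password (deprecated since Node.js 10).
const cipher = crypto.createCipher('aes192', 'a password');
// update() returns the blocks that are already complete.
let encrypted = cipher.update('some clear text data', 'utf8', 'hex');
console.log('encrypted: ' + encrypted);
// final() flushes the last, padded block.
encrypted += cipher.final('hex');
console.log('encrypted final: ' + encrypted);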
Decryption
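const decipher = crypto.createDecipher('aes192', 'a password');
let decrypted = decipher.update(encrypted, 'hex', 'utf8');
console.log('decrypted: ' + decrypted);
// final() returns the remaining text from the last block.
decrypted += decipher.final('utf8');
console.log('decrypted final: ' + decrypted);
Make sure that you are in the working directory, then execute the $ node crypto.js command. The code above outputs: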
encrypted: ca981be48e90867604588e75d04feabb
encrypted final: ca981be48e90867604588e75d04feabb63cc007a8f8ad89b10616ed84d815504
decrypted: some clear text
decrypted final: some clear text data
Now if you replace “some clear text data” with “some clear text file,” the output of the encryption will be:
encrypted: ca981be48e90867604588e75d04feabb
encrypted final: ca981be48e90867604588e75d04feabb637ebc38b8ae443fc4b8e8375d293537
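You can see that the two results are identical up to “…bb63” and only the text after that has changed. That is because the first 16 bytes of the plain text, which fill the first cipher block, are the same in both cases. Next, let’s try hashing.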
const hash = crypto.createHash('sha256');
hash.update('some data to hash');
console.log('digest: ' + hash.digest('hex'));
The output of this code will be:
digest: 6a2da20943931e9834fc12cfe5bb47bbd9ae43489a30726962b576f4e3993e50
Let’s try hashing one more time, replacing the text with “some date to hash.” This time, the output will look like the following.
digest: 9fce7c79c3937f5fbca3b6972c84ab470ecc64a775666bef9777ea2083b05db1
Completely different despite the similarity of the two input texts.
If you want to know which algorithms are available, add the following lines and run $ node crypto.js.
const ciphers = crypto.getCiphers();
const hashes = crypto.getHashes();
console.log('ciphers: ' + ciphers);
console.log('hashes: ' + hashes);
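The encryption example earlier in this section relies on crypto.createCipher, which has been deprecated since Node.js 10 and has since been removed from newer releases. On a newer Node.js, a roughly equivalent sketch uses crypto.createCipheriv with an explicit key and initialization vector. The salt string, the encrypt and decrypt helper names, and the "iv:ciphertext" payload layout below are assumptions for illustration, and the output will not match the hex values shown above because the key and IV are derived differently.

```javascript
const crypto = require('crypto');

const algorithm = 'aes-192-cbc';
// Derive a 24-byte key from the password; the salt value is a placeholder.
const key = crypto.scryptSync('a password', 'example salt', 24);

function encrypt(plainText) {
  const iv = crypto.randomBytes(16); // new random IV for every message
  const cipher = crypto.createCipheriv(algorithm, key, iv);
  const encrypted = cipher.update(plainText, 'utf8', 'hex') + cipher.final('hex');
  // The IV is not secret, so it travels alongside the ciphertext.
  return iv.toString('hex') + ':' + encrypted;
}

function decrypt(payload) {
  const [ivHex, encrypted] = payload.split(':');
  const decipher = crypto.createDecipheriv(algorithm, key, Buffer.from(ivHex, 'hex'));
  return decipher.update(encrypted, 'hex', 'utf8') + decipher.final('utf8');
}

const payload = encrypt('some clear text data');
console.log('encrypted: ' + payload);
console.log('decrypted: ' + decrypt(payload));
```

With a random IV, encrypting the same text twice gives two different payloads, yet both decrypt to the same plain text.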
Hash with bcrypt
The “b” stands for the Blowfish cipher, and bcrypt is a hash function based on that cipher. There are implementations of bcrypt for many major programming languages, such as PHP, Java, Python, and Ruby. I’m not sure whether bcrypt should be written with a lowercase “b” even when it is used as a proper noun. Nevertheless, everyone seems to follow the traditional “bcrypt,” so I decided to do the same here.
First, let’s add the next lines of code at the top of bcrypt.js.
const bcrypt = require('bcrypt');
const saltRounds = 10;
const myPlaintextPassword = 's0/\/\P4$$w0rD';
const someOtherPlaintextPassword = 'not_bacon';
There are basically two ways to generate a hash. The first is to generate the salt and the hash in separate function calls. The other is to do both in a single call. Please add the following code to bcrypt.js.
Separate functions
bcrypt.genSalt(saltRounds, (err, salt) => {
  console.log('genSalt salt: ' + salt);
  bcrypt.hash(myPlaintextPassword, salt, (err, hash) => {
    console.log('genSalt hash: ' + hash);
  });
});
One function
bcrypt.hash(myPlaintextPassword, saltRounds, (err, hash) => {
  console.log('auto-gen hash: ' + hash);
});
Now let’s run $ node bcrypt.js, and it outputs something similar to:
genSalt salt: $2a$10$UI5I4dexbIF2rWY5ZpD9fu
genSalt hash: $2a$10$UI5I4dexbIF2rWY5ZpD9fuSZHbhZUR.G/yiKX0SYGXOzXjvREREfy
auto-gen hash: $2a$10$zB.S0AXbsCWOloI5CPPPneV4uYam6SIVHfd5oJZbpXA0237AObTwW
Since the code above is asynchronous, the auto-gen hash result may be displayed before the genSalt hash result.
When hashing with a salt, the output is different every time even if the input text is the same, because a new random salt is generated on each call. The prefix $2a$ indicates that the string is a bcrypt hash, and the “10” after the second $ is the rounds value used in hashing. For example, if you change saltRounds to 12, the output will begin with “$2a$12.” This rounds value is also known as the cost. With the bcrypt Node.js module, the resulting hashes are 60 characters long.
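The 60 characters can be taken apart with plain string handling, no bcrypt module needed. The sketch below assumes the usual bcrypt layout of $version$cost$ followed by a 22-character salt and a 31-character hash, and uses the sample hash printed above; parseBcryptHash is a name made up for this example.

```javascript
// Split a bcrypt hash string into its parts. The layout assumed here is
// $<version>$<cost>$<22-character salt><31-character hash>.
function parseBcryptHash(hashString) {
  const [, version, cost, rest] = hashString.split('$');
  return {
    version,
    cost: Number(cost),
    salt: rest.slice(0, 22),
    hash: rest.slice(22),
  };
}

const sample = '$2a$10$UI5I4dexbIF2rWY5ZpD9fuSZHbhZUR.G/yiKX0SYGXOzXjvREREfy';
console.log(sample.length); // 60
console.log(parseBcryptHash(sample));
```

The salt part matches the value printed by genSalt above, which shows that the salt is stored inside the hash string itself and does not need a separate database column.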
bcrypt also comes with a function to check a password.
bcrypt.hash(myPlaintextPassword, saltRounds, (err, hash) => {
  // Compare a plain-text password with the stored hash.
  bcrypt.compare(myPlaintextPassword, hash, (err, res) => {
    // res == true
  });
  bcrypt.compare(someOtherPlaintextPassword, hash, (err, res) => {
    // res == false
  });
});
In practice, myPlaintextPassword or someOtherPlaintextPassword would be the user’s input, and the hash would be loaded from the database. For more details on how to use bcrypt, please check out its NPM page.
bcrypt-nodejs
There is an alternative to node.bcrypt.js called bcrypt-nodejs, distributed by different developers. Please, please, please, no more naming confusion. The biggest difference between node.bcrypt.js and bcrypt-nodejs is that bcrypt-nodejs doesn’t require you to install Xcode or other SDKs to use it, because it is written in pure JavaScript. I didn’t look closely, but the usage is almost the same, except that bcrypt-nodejs’ .hash() accepts a salt rather than the saltRounds value.
Comparing the download numbers of the two modules, node.bcrypt.js surpasses bcrypt-nodejs by roughly 300 percent. Hence, node.bcrypt.js still seems to be the more widely used of the two.
Conclusion
I have shown the basic usage of encryption and hashing in Node.js. The next step is to use this code in an application, for instance hashing in a sign-up process. Production use requires a deeper level of understanding, but I believe this basic knowledge is crucial for handling those situations.
- Encryption: reversible, for exchanging credit card information or other data that needs to be shared between two participants.
- Hash: irreversible, for storing passwords or other data that must be known by only one participant.
Generally speaking, whenever the system does not need to recover the original data, a hash should be used rather than encryption, because there is nothing to decrypt if the stored data leaks.