Note: we wrote this example/tutorial
to understand how to do field-level encryption
from first principles.
Once we solved the problem,
we built a library to streamline it: fields.
We still recommend going through this example,
but if you just want
to get on with building your Phoenix App,
use fields.
- Phoenix Ecto Encryption Example
- Why?
- What?
- Who?
- How?
- Conclusion
- Useful Links, FAQ & Background Reading
- Stuck / Need Help?
- Credits
Encrypting User/Personal data stored by your Web App is essential for security/privacy.
If your app offers any personalised content or interaction that depends on "login", it is storing personal data (by definition). You might be tempted to think that data is "safe" in a database, but it's not. There is an entire ("dark") army/industry of people (cybercriminals) who target websites/apps attempting to "steal" data by compromising databases. All the time you spend building your app, they spend trying to "break" apps like yours. Don't let the people using your app be the victims of identity theft, protect their personal data! (it's both the "right" thing to do and the law ...)
This example/tutorial is intended as a comprehensive answer to the question:
We are not "re-inventing encryption"
or using our "own algorithm";
everyone knows that's a "bad idea":
https://security.stackexchange.com/questions/18197/why-shouldnt-we-roll-our-own
We are following a battle-tested industry-standard approach
and applying it to our Elixir/Phoenix App.
We are using:
- Advanced Encryption Standard (AES) to encrypt sensitive data.
- Galois/Counter Mode for symmetric key cryptographic block ciphers: https://en.wikipedia.org/wiki/Galois/Counter_Mode recommended by many security and cryptography practitioners including Matthew Green, Niels Ferguson and Bruce Schneier
- "Under the hood" we are using Erlang's
crypto library
specifically AES with 256 bit keys
(the same as AWS or Google's KMS service) see: https://erlang.org/doc/man/crypto.html#block_encrypt-4
- Password "hashing" using the Argon2
key derivation function (KDF): https://en.wikipedia.org/wiki/Argon2
specifically the Elixir implementation of argon2
written by David Whitlock: https://github.com/riverrun/argon2_elixir which in turn uses the C "reference implementation" as a "Git Submodule".
¯\_(ツ)_/¯...?
Don't be "put off" if any of these terms/algorithms are unfamiliar to you;
this example is "step-by-step" and we are happy to answer/clarify any (relevant and specific) questions you have!
This example/tutorial follows the Open Web Application Security Project (OWASP) Cryptographic and Password rules:
- Use "strong approved Authenticated Encryption"
based on an AES algorithm.
- Use GCM mode of operation for symmetric key cryptographic block ciphers.
- Keys used for encryption must be rotated at least annually.
- Only use approved public algorithm SHA-256 or better for hashing.
- Argon2 is the winner of the password hashing competition and should be your first choice for new applications.
See:
- https://www.owasp.org/index.php/Cryptographic_Storage_Cheat_Sheet
- https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet
This example/tutorial is for any developer
(or technical decision maker / "application architect")
who takes personal data protection seriously
and wants a robust/reliable and "transparent" way
of encrypting data before
storing it,
and decrypting when it is queried.
- Basic Elixir syntax knowledge: https://github.com/dwyl/learn-elixir
- Familiarity with the Phoenix framework: https://github.com/dwyl/learn-phoenix-framework
- Basic understanding of Ecto (the module used to interface with databases in elixir/phoenix)

If you are totally new to (or "rusty" on) Elixir, Phoenix or Ecto, we recommend going through our Phoenix Chat Example (Beginner's Tutorial) first: https://github.com/dwyl/phoenix-chat-example
You will not need any "advanced" mathematical knowledge;
we are not "inventing" our own encryption or
going into the "internals" of any cyphers/algorithms/schemes.
You do not need to understand
how the encryption/hashing algorithms work,
but it is useful to know the difference between
encryption
vs.
hashing
and
plaintext
vs.
ciphertext.
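To make the two distinctions concrete, here is a minimal sketch using only Erlang's crypto library. The key, IV and the "demo" associated-data value are throwaway values invented for this illustration; they are not part of the tutorial's code.

```elixir
# Hashing is one-way and deterministic: the same input always gives the
# same digest, and there is no function to turn the digest back into "hello".
digest1 = :crypto.hash(:sha256, "hello")
digest2 = :crypto.hash(:sha256, "hello")

# Encryption is two-way: anyone holding the key (plus the IV and tag)
# can turn the ciphertext back into the plaintext.
key = :crypto.strong_rand_bytes(32) # throwaway 256 bit key (illustration only)
iv = :crypto.strong_rand_bytes(16)  # throwaway initialisation vector
aad = "demo"                        # hypothetical associated data

{ciphertext, tag} =
  :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, "hello", aad, true)

plaintext =
  :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, ciphertext, aad, tag, false)

IO.inspect(digest1 == digest2, label: "same digest each time")
IO.inspect(byte_size(ciphertext), label: "ciphertext size in bytes")
IO.inspect(plaintext, label: "decrypted plaintext")
```

In short: plaintext is what goes in, ciphertext is what comes out of encryption (and can be reversed with the key), while a hash digest cannot be reversed at all.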
The fact that the example/tutorial follows all OWASP crypto/hashing rules
(see:
"OWASP Cryptographic Rules?"
section above),
should be "enough" for most people who just want to focus
on building their app and don't want to
"go down the rabbit hole".
However ... We have included 30+ links in the "Useful Links" section at the end of this readme. The list includes several common questions (and answers) so if you are curious, you can learn.
Note: in the @dwyl Library we have https://www.schneier.com/books/applied_cryptography So, if you're really curious let us know!
Simply reading ("skimming") through this example will
only take 15 minutes.
Following the examples on your computer (to fully understand it)
will take around 1 hour
(including reading a few of the links).
Invest the time up-front to avoid the embarrassment and fines of a data breach.
These are "step-by-step" instructions, don't skip any step(s).
In your Terminal,
create a new
Phoenix application called "encryption":
mix phx.new encryption
When you see Fetch and install dependencies? [Yn]
,
type y
and press the [Enter]
key
to download and install the dependencies.
You should see the following in your terminal:
* running mix deps.get
* running mix deps.compile
* running cd assets && npm install && node node_modules/webpack/bin/webpack.js --mode development
We are almost there! The following steps are missing:
$ cd encryption
Then configure your database in config/dev.exs and run:
$ mix ecto.create
Start your Phoenix app with:
$ mix phx.server
You can also run your app inside IEx (Interactive Elixir) as:
$ iex -S mix phx.server
Follow the first instruction and change into the encryption directory:
cd encryption
Next create the database for the App using the command:
mix ecto.create
You should see the following output:
Compiling 13 files (.ex)
Generated encryption app
The database for Encryption.Repo has been created
In our example user
database table,
we are going to store 3 (primary) pieces of data.
- name: the person's name (encrypted)
- email: their email address (encrypted)
- password_hash: the hashed password (so the person can login)
In addition to the 3 "primary" fields, we need one more field to store "metadata":
email_hash
: so we can check ("lookup") if an email address is in the database without having to decrypt the email(s) stored in the DB.
Create the user
schema using the following generator command:
mix phx.gen.schema User users email:binary email_hash:binary name:binary password_hash:binary
The reason we are creating the encrypted/hashed fields as :binary
is that the data stored in them will be encrypted
and :binary
is the most efficient Ecto/SQL data type
for storing encrypted data;
storing it as a String
would take up more bytes
for the same data.
i.e. wasteful without any benefit to security or performance.
see:
https://dba.stackexchange.com/questions/56934/what-is-the-best-way-to-store-a-lot-of-user-encrypted-data
and: https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html
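As a rough illustration of the size difference, the sketch below compares a raw binary with two common text encodings of the same bytes. The 48 random bytes are just a stand-in for real ciphertext, and the choice of Base64 and hex as the "String" representations is our assumption for the comparison.

```elixir
# 48 random bytes standing in for an encrypted value.
raw = :crypto.strong_rand_bytes(48)

# To store the same bytes in a text column they must be encoded first.
as_base64 = Base.encode64(raw)
as_hex = Base.encode16(raw)

IO.inspect(byte_size(raw), label: "raw :binary bytes")
IO.inspect(byte_size(as_base64), label: "Base64 text bytes")
IO.inspect(byte_size(as_hex), label: "hex text bytes")
```

Base64 adds roughly a third to the size and hex doubles it, which is why the fields are declared as :binary.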
Next we need to update our newly created migration file. Open
priv/repo/migrations/{timestamp}_create_users.exs
.
Your migration file will have a slightly different name to ours as migration files are named with a timestamp when they are created but it will be in the same location.
Update the file from:
defmodule Encryption.Repo.Migrations.CreateUsers do
use Ecto.Migration
def change do
create table(:users) do
add(:email, :binary)
add(:email_hash, :binary)
add(:name, :binary)
add(:password_hash, :binary)
timestamps()
end
end
end
To
defmodule Encryption.Repo.Migrations.CreateUsers do
use Ecto.Migration
def change do
create table(:users) do
add(:email, :binary)
add(:email_hash, :binary)
add(:name, :binary)
add(:password_hash, :binary)
timestamps()
end
create(unique_index(:users, [:email_hash]))
end
end
The newly added line ensures that we will never be allowed to enter duplicate
email_hash
values into our database.
Run the "migration" task to create the tables in the Database:
mix ecto.migrate
Running the mix ecto.migrate
command will create the
users
table in your encryption_dev
database.
You can view this (empty) table in a PostgreSQL GUI. Here is a screenshot
from pgAdmin:
We need 6 functions for encrypting, decrypting, hashing and verifying the data we will be storing:
- Encrypt - to encrypt any personal data we want to store in the database.
- Decrypt - decrypt any data that needs to be viewed.
- Get Key - get the latest encryption/decryption key (or a specific older key where data was encrypted with a different key)
- Hash Email (deterministic & fast) - so that we can "lookup" an email without "decrypting". The hash of an email address should always be the same.
- Hash Password (pseudorandom & slow) - the output of the hash should always be different and relatively slow to compute.
- Verify Password - check a password against the stored
password_hash
to confirm that the person "logging-in" has the correct password.
The next 6 sections of the example/tutorial will walk through the creation of (and testing) these functions.
Note: If you have any questions on these functions, please ask:
github.com/dwyl/phoenix-ecto-encryption-example/issues
Create a file called lib/encryption/aes.ex
and copy-paste (or hand-write)
the following code:
defmodule Encryption.AES do
@aad "AES256GCM" # Use AES 256 Bit Keys for Encryption.
def encrypt(plaintext) do
iv = :crypto.strong_rand_bytes(16) # create random Initialisation Vector
key = get_key() # get the *latest* key in the list of encryption keys
{ciphertext, tag} =
:crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, to_string(plaintext), @aad, true)
iv <> tag <> ciphertext # "return" iv with the cipher tag & ciphertext
end
defp get_key do # this is a "dummy function" we will update it in step 3.3
<<109, 182, 30, 109, 203, 207, 35, 144, 228, 164, 106, 244, 38, 242,
106, 19, 58, 59, 238, 69, 2, 20, 34, 252, 122, 232, 110, 145, 54,
241, 65, 16>> # return a random 32 Byte / 256 bit binary to use as key.
end
end
The encrypt/1
function for encrypting plaintext
into ciphertext
is quite simple; (the "body" is only 4 lines).
Let's "step through" these lines one at a time:
- encrypt/1 accepts one argument; the plaintext to be encrypted.
- First we create a "strong" random initialization vector (IV) of 16 bytes (128 bits) using the Erlang crypto library's strong_rand_bytes function: https://erlang.org/doc/man/crypto.html#strong_rand_bytes-1 The "IV" ensures that each time a string/block of text/data is encrypted, the ciphertext is different.

Having different ciphertext each time plaintext is encrypted is essential for "semantic security" whereby repeated use of the same encryption key and algorithm does not allow an "attacker" to infer relationships between segments of the encrypted message. Cryptanalysis techniques are well "beyond scope" for this example/tutorial, but we highly encourage you to check out the "Background Reading" links at the end and read up on the subject for deeper understanding.

- Next we use the get_key/0 function to retrieve the latest encryption key so we can use it to encrypt the plaintext (the "real" get_key/0 is defined below in section 3.3).
- Then we use the Erlang crypto_one_time_aead/6 function to encrypt the plaintext, using :aes_256_gcm ("Advanced Encryption Standard Galois Counter Mode" with a 256 bit key). @aad is a "module attribute" (Elixir's equivalent of a "constant") defined in aes.ex as @aad "AES256GCM". It is passed to the function as the "additional authenticated data" and simply names the encryption mode we are using, which breaks down into 3 parts:
  - AES = Advanced Encryption Standard.
  - 256 = "256 Bit Key"
  - GCM = "Galois Counter Mode"
- Finally we "return" the iv with the cipher tag & ciphertext; this is what we store in the database. Including the IV and cipher tag is essential for allowing decryption; without these two pieces of data, we would not be able to "reverse" the process.
Note: in addition to this encrypt/1 function, we have defined an encrypt/2 "sister" function which accepts a specific (encryption) key_id
so that we can use the desired encryption key for encrypting a block of text. For the purposes of this example/tutorial, it's not strictly necessary, but it is included for "completeness".
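The effect of the random IV and the layout of the returned value can be checked with a small standalone sketch. IvDemo is a hypothetical module written for this illustration; it has the same shape as encrypt/1 but takes the key as an argument instead of calling get_key/0.

```elixir
defmodule IvDemo do
  @aad "AES256GCM"

  # Same steps as encrypt/1, with the key passed in explicitly.
  def encrypt(plaintext, key) do
    iv = :crypto.strong_rand_bytes(16) # fresh random IV on every call

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, to_string(plaintext), @aad, true)

    iv <> tag <> ciphertext
  end
end

key = :crypto.strong_rand_bytes(32) # throwaway 256 bit key (illustration only)

# Encrypt the same plaintext twice with the same key.
blob_a = IvDemo.encrypt("hello", key)
blob_b = IvDemo.encrypt("hello", key)

IO.inspect(blob_a != blob_b, label: "different output for the same plaintext")
# 16 bytes of IV + 16 bytes of tag + 5 bytes of ciphertext for "hello"
IO.inspect(byte_size(blob_a), label: "total size in bytes")
```

Because the IV and tag have a fixed size (16 bytes each), they can later be split off the front of the stored value with a binary pattern match.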
Create a file called test/lib/aes_test.exs
and copy-paste
the following code into it:
defmodule Encryption.AESTest do
use ExUnit.Case
alias Encryption.AES
test ".encrypt includes the random IV in the value" do
<<iv::binary-16, ciphertext::binary>> = AES.encrypt("hello")
assert String.length(iv) != 0
assert String.length(ciphertext) != 0
assert is_binary(ciphertext)
end
test ".encrypt does not produce the same ciphertext twice" do
assert AES.encrypt("hello") != AES.encrypt("hello")
end
end
Run these two tests by running the following command:
mix test test/lib/aes_test.exs
The full function definitions for AES encrypt/1 & encrypt/2 are in: lib/encryption/aes.ex
And tests are in: test/lib/aes_test.exs
The decrypt function reverses the work done by encrypt;
it accepts a "blob" of ciphertext
(which, as you may recall, has the IV and cipher tag prepended to it)
and returns the original plaintext.
In the lib/encryption/aes.ex
file, copy-paste (or hand-write)
the following decrypt/1
function definition:
def decrypt(ciphertext) do
<<iv::binary-16, tag::binary-16, ciphertext::binary>> =
ciphertext
:crypto.crypto_one_time_aead(:aes_256_gcm, get_key(), iv, ciphertext, @aad, tag, false)
end
The first step (line) is to "split" the IV and the tag from the ciphertext
using Elixir's binary pattern matching.
If you are unfamiliar with Elixir binary pattern matching syntax:
<<iv::binary-16, tag::binary-16, ciphertext::binary>>
read the following guide: https://elixir-lang.org/getting-started/binaries-strings-and-char-lists.html
The
:crypto.crypto_one_time_aead(:aes_256_gcm, get_key(), iv, ciphertext, @aad, tag, false)
line is very similar to the encrypt function.
The ciphertext is decrypted using crypto_one_time_aead/7,
passing in the following parameters:
- :aes_256_gcm = the encryption algorithm
- get_key() = get the encryption key used to encrypt the plaintext
- iv = the original Initialisation Vector used to encrypt the plaintext
- ciphertext = the encrypted data, with the IV and tag already split off
- @aad = the same "additional authenticated data" value that was passed to encrypt
- tag = the cipher tag that was produced when the plaintext was encrypted
- false = tells the function to decrypt (true means encrypt)

Finally return just the original plaintext.
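One consequence of storing the tag is that decryption also checks integrity. The sketch below uses a hypothetical TamperDemo module (key passed in explicitly, names invented for this example) to show that Erlang's crypto_one_time_aead/7 returns the atom :error instead of plaintext when a single byte of the stored value has been changed.

```elixir
defmodule TamperDemo do
  @aad "AES256GCM"

  def encrypt(plaintext, key) do
    iv = :crypto.strong_rand_bytes(16)

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, to_string(plaintext), @aad, true)

    iv <> tag <> ciphertext
  end

  def decrypt(blob, key) do
    # split the fixed-size IV and tag off the front of the stored value
    <<iv::binary-16, tag::binary-16, ciphertext::binary>> = blob
    :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, ciphertext, @aad, tag, false)
  end
end

key = :crypto.strong_rand_bytes(32) # throwaway key (illustration only)
blob = TamperDemo.encrypt("hello", key)

# Flip one bit in the last byte of the stored value.
prefix_size = byte_size(blob) - 1
<<prefix::binary-size(prefix_size), last>> = blob
tampered = prefix <> <<:erlang.bxor(last, 1)>>

IO.inspect(TamperDemo.decrypt(blob, key), label: "untouched value")
IO.inspect(TamperDemo.decrypt(tampered, key), label: "tampered value")
```

Code that calls decrypt may therefore want to handle the :error case explicitly rather than assume a binary is always returned.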
Note: as above with the encrypt/2 function, we have defined a decrypt/2 "sister" function which accepts a specific (encryption) key_id
so that we can use the desired encryption key for decrypting the ciphertext. For the purposes of this example/tutorial, it's not strictly necessary, but it is included for "completeness".
In the test/lib/aes_test.exs
add the following test:
test "decrypt/1 ciphertext that was encrypted with default key" do
plaintext = "hello" |> AES.encrypt |> AES.decrypt()
assert plaintext == "hello"
end
Re-run the tests mix test test/lib/aes_test.exs
and confirm they pass.
The full encrypt & decrypt function definitions with @doc comments are in: lib/encryption/aes.ex
And tests are in: [`test/lib/aes_test.exs`](https://github.com/dwyl/phoenix-ecto-encryption-example/blob/master/test/lib/aes_test.exs)
Key rotation is a "best practice" that limits the amount of data an "attacker" can decrypt if the database were ever "compromised" (provided we keep the encryption keys safe that is!) A really good guide to this is: https://cloud.google.com/kms/docs/key-rotation.
For this reason we want to 'store' a key_id.
The key_id indicates which encryption key was used to encrypt the data. Besides the IV and cipher tag, the key_id is also essential for allowing decryption, so we change the encrypt/1 function to preserve the key_id as well:
defmodule Encryption.AES do
@aad "AES256GCM" # Use AES 256 Bit Keys for Encryption.
def encrypt(plaintext) do
iv = :crypto.strong_rand_bytes(16)
# get latest key
key = get_key()
# get latest ID;
key_id = get_key_id()
{ciphertext, tag} =
:crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, to_string(plaintext), @aad, true)
iv <> tag <> <<key_id::unsigned-big-integer-32>> <> ciphertext
end
defp get_key do
get_key_id() |> get_key
end
defp get_key(key_id) do
encryption_keys() |> Enum.at(key_id)
end
defp get_key_id do
Enum.count(encryption_keys()) - 1
end
defp encryption_keys do
Application.get_env(:encryption, Encryption.AES)[:keys]
end
end
For the complete file containing these functions see:
lib/encryption/aes.ex
For this example/demo we are using two encryption keys which are kept as an application environment variable. The values of the encryption keys are stored under the key Encryption.AES. When encrypting, we always use the latest (most recent) encryption key by default (get_key/0); the corresponding key_id is fetched by get_key_id/0 and becomes part of the ciphertext.
With decrypting we now pattern match the associated key_id from the ciphertext in order to be able to decrypt with the correct encryption key.
def decrypt(ciphertext) do
<<iv::binary-16, tag::binary-16, key_id::unsigned-big-integer-32, ciphertext::binary>> =
ciphertext
:crypto.crypto_one_time_aead(:aes_256_gcm, get_key(key_id), iv, ciphertext, @aad, tag, false)
end
So we defined the get_key
twice in lib/encryption/aes.ex
as per Erlang/Elixir standard,
once for each "arity"
or number of "arguments".
In the first case get_key/0
assumes you want the latest Encryption Key.
The second case get_key/1
lets you supply the key_id
to be "looked up":
Both versions of get_key
use encryption_keys/0 function to call the Application.get_env
function:
Application.get_env(:encryption, Encryption.AES)[:keys]
specifically.
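The interplay between the key list, the key_id and the stored value can be tried out in isolation. The following is a minimal sketch, not the tutorial's module: RotationDemo is a hypothetical name, and the list of keys is passed in as an argument instead of being read with Application.get_env.

```elixir
defmodule RotationDemo do
  @aad "AES256GCM"

  # Always encrypt with the last (most recent) key in the list
  # and record its position as a 32 bit key_id.
  def encrypt(plaintext, keys) do
    key_id = length(keys) - 1
    key = Enum.at(keys, key_id)
    iv = :crypto.strong_rand_bytes(16)

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, to_string(plaintext), @aad, true)

    iv <> tag <> <<key_id::unsigned-big-integer-32>> <> ciphertext
  end

  # Read the key_id back out of the stored value to pick the right key.
  def decrypt(blob, keys) do
    <<iv::binary-16, tag::binary-16, key_id::unsigned-big-integer-32, ciphertext::binary>> = blob
    key = Enum.at(keys, key_id)
    :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, ciphertext, @aad, tag, false)
  end
end

old_keys = [:crypto.strong_rand_bytes(32)]             # one key to begin with
old_blob = RotationDemo.encrypt("hello", old_keys)      # encrypted with key_id 0

new_keys = old_keys ++ [:crypto.strong_rand_bytes(32)]  # "rotate": append a new key
new_blob = RotationDemo.encrypt("hello", new_keys)      # encrypted with key_id 1

# Pull the key_id out of the new value to see which key was used.
<<_iv_and_tag::binary-32, new_key_id::unsigned-big-integer-32, _rest::binary>> = new_blob

IO.inspect(RotationDemo.decrypt(old_blob, new_keys), label: "old data, new key list")
IO.inspect(new_key_id, label: "key_id used after rotation")
```

Note that this scheme relies on new keys only ever being appended to the end of the list; reordering or removing keys would change which key a stored key_id points to.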
For this to work we need to define the keys as an Environment Variable
and make it available to our App in config.exs
.
In order for our get_key/0
and get_key/1
functions to work,
they need to be able to "read" the encryption keys.
We need to "export" an Environment Variable containing a (comma-separated) list of (one or more) encryption key(s).
Copy-paste (and run) the following command in your terminal:
echo "export ENCRYPTION_KEYS='nMdayQpR0aoasLaq1g94FLba+A+wB44JLko47sVQXMg=,L+ZVX8iheoqgqb22mUpATmMDsvVGtafoAeb0KN5uWf0='" >> .env && echo ".env" >> .gitignore
For now, copy paste this command exactly as it is.
When you are deploying your own App, generate your own AES encryption key(s) see: How To Generate AES Encryption Keys? section below for how to do this.
Note: there are two encryption keys separated by a comma. This is to demonstrate that it's possible to use multiple keys.
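For reference, one possible way to produce a key in the same format as the two sample keys (32 random bytes, Base64 encoded) is sketched below. This is an assumption about a reasonable approach using only built-in functions, not necessarily the method described later in this readme.

```elixir
# 32 random bytes = a 256 bit key, suitable for AES-256.
key = :crypto.strong_rand_bytes(32)

# Base64-encode it so it can be stored as text in an environment variable.
encoded = Base.encode64(key)

IO.puts(encoded)
```

The printed value is 44 characters long, like each of the sample keys in the command above.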
We prefer to store our Encryption Keys as Environment Variables this is consistent with the "12 Factor App" best practice: https://en.wikipedia.org/wiki/Twelve-Factor_App_methodology
Update the config/config.exs
to load the environment variables from the .env
file into the application. Add the following code to your config file just above
import_config "#{Mix.env()}.exs"
:
# run shell command to "source .env" to load the environment variables.
try do # wrap in "try do"
File.stream!("./.env") # in case .env file does not exist.
|> Stream.map(&String.trim_trailing/1) # remove excess whitespace
|> Enum.each(fn line -> line # loop through each line
|> String.replace("export ", "") # remove "export" from line
|> String.split("=", parts: 2) # split on *first* "=" (equals sign)
|> Enum.reduce(fn(value, key) -> # stackoverflow.com/q/33055834/1148249
System.put_env(key, value) # set each environment variable
end)
end)
rescue
_ -> IO.puts "no .env file found!"
end
# Set the Encryption Keys as an "Application Variable" accessible in aes.ex
config :encryption, Encryption.AES,
keys: System.get_env("ENCRYPTION_KEYS") # get the ENCRYPTION_KEYS env variable
|> String.replace("'", "") # remove single-quotes around key list in .env
|> String.split(",") # split the CSV list of keys
|> Enum.map(fn key -> :base64.decode(key) end) # decode the key.
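To see what these transformations produce, the same steps can be run on the sample line by hand. In this sketch the .env line is inlined as a string (rather than read from the file) and the variable names are ours.

```elixir
# The line that the earlier echo command wrote to .env
line =
  "export ENCRYPTION_KEYS='nMdayQpR0aoasLaq1g94FLba+A+wB44JLko47sVQXMg=,L+ZVX8iheoqgqb22mUpATmMDsvVGtafoAeb0KN5uWf0='"

# Splitting on the *first* "=" only keeps the "=" padding inside the keys intact.
[name, value] =
  line
  |> String.replace("export ", "")
  |> String.split("=", parts: 2)

# Same steps as the config: strip quotes, split the CSV list, decode each key.
keys =
  value
  |> String.replace("'", "")
  |> String.split(",")
  |> Enum.map(&:base64.decode/1)

IO.inspect(name, label: "variable name")
IO.inspect(length(keys), label: "number of keys")
IO.inspect(Enum.map(keys, &byte_size/1), label: "key sizes in bytes")
```

Each decoded key is 32 bytes, which is the size AES with 256 bit keys expects.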
Given that get_key/0
and get_key/1
are both defp
(i.e. "private")
they are not "exported" with the AES module and therefore cannot be invoked
outside of the AES module.
The get_key/0
and get_key/1
are invoked by encrypt/1
and decrypt/1
and thus provided these (public) latter functions
are tested adequately, the "private" functions will be too.
Re-run the tests mix test test/lib/aes_test.exs
and confirm they still pass.
We also define a test in order to verify the working of key rotation. We add a new encryption key and assert (and make sure) that an encrypted value with an older encryption key will still be decrypted correctly.
test "can still decrypt the value after adding a new encryption key" do
encrypted_value = "hello" |> AES.encrypt()
original_keys = Application.get_env(:encryption, Encryption.AES)[:keys]
# add a new key
Application.put_env(:encryption, Encryption.AES,
keys: original_keys ++ [:crypto.strong_rand_bytes(32)]
)
assert "hello" == encrypted_value |> AES.decrypt()
# rollback to the original keys
Application.put_env(:encryption, Encryption.AES, keys: original_keys)
end
The full encrypt & decrypt function definitions with @doc comments are in: lib/encryption/aes.ex
And tests are in: test/lib/aes_test.exs
The idea behind hashing email addresses is to allow us to perform a lookup (in the database) to check if the email has already been registered/used for app/system.
Imagine that alex@example.com
has previously used your app.
The SHA256
hash (encoded as base64)
is: "bbYebcvPI5DkpGr0JvJqEzo77kUCFCL8euhukTbxQRA="
try
it for yourself in iex
:
iex(1)> email = "alex@example.com"
"alex@example.com"
iex(2)> email_hash = :crypto.hash(:sha256, email) |> Base.encode64
"bbYebcvPI5DkpGr0JvJqEzo77kUCFCL8euhukTbxQRA="
If we store the email_hash
in the database,
when Alex
wants to log-in to the App/System,
we simply perform a "lookup" in the users
table:
hash = :crypto.hash(:sha256, email) |> Base.encode64
query = "SELECT * FROM users WHERE email_hash = $1"
user = Ecto.Adapters.SQL.query!(Encryption.Repo, query, [hash])
Note: there's a "built-in" Ecto
get_by
function to perform this type of
"SELECT ... WHERE field = value"
query effortlessly
All Phoenix apps have a secret_key_base
for sessions.
see: https://hexdocs.pm/plug/1.13.6/Plug.Session.COOKIE.html
Run the following command to generate a new phoenix secret key:
mix phx.gen.secret
copy-paste the output (a 64 character String)
into your .env
file after the "equals sign" on the line for SECRET_KEY_BASE
:
export SECRET_KEY_BASE={YourSecreteKeyBaseGeneratedUsing-mix_phx.gen.secret}
Your .env
file should look similar to:
.env_sample
Load the secret key into your environment by typing into your terminal:
source .env
Note: We are using an
.env
file, but if you are using a "Cloud Platform" to deploy your app,
you could consider using their "Key Management Service" for managing encryption keys. eg:
- Heroku: /~https://github.com/dwyl/learn-environment-variables#environment-variables-on-heroku
- AWS: https://aws.amazon.com/kms/
- Google Cloud: https://cloud.google.com/kms/
We now need to update our config files again. Open your config.exs
file and
change the following:
from
secret_key_base: "3PXN/6k6qoxqQjWFskGew4r74yp7oJ1UNF6wjvJSHjC5Y5LLIrDpWxrJ84UBphJn",
# your secret_key_base will be different but that is fine.
To
secret_key_base: System.get_env("SECRET_KEY_BASE"),
As mentioned above, all Phoenix applications come with a secret_key_base
.
Instead of using this default one, we have told our application to use the new
one that we added to our .env
file.
Now we need to edit our config/test.exs
file. Change the following:
from
config :encryption, EncryptionWeb.Endpoint,
http: [port: 4001],
server: false
To
config :encryption, EncryptionWeb.Endpoint,
http: [port: 4001],
server: false,
secret_key_base: System.get_env("SECRET_KEY_BASE")
By adding the previous code block we will now have a secret_key_base
which
we will be able to use for testing.
When we first created the Ecto Schema for our "user" in Step 2 (above),
this created the lib/encryption/user.ex
file with the following schema:
schema "users" do
field :email, :binary
field :email_hash, :binary
field :name, :binary
field :password_hash, :binary
timestamps()
end
The default Ecto field types (:binary
) are a good start.
But we can do so much better if we define custom Ecto Types!
Ecto Custom Types are a way of automatically "pre-processing" data before inserting it into (and reading from) a database. Examples of "pre-processing" include:
- Custom Validation e.g: phone number or address format.
- Encrypting / Decrypting
- Hashing
A custom type expects 6 callback functions to be implemented in the file:
- type/0 - define the Ecto Type we want Ecto to use to store the data for our Custom Type. e.g: :integer or :binary
- cast/1 - "typecasts" (converts) the given data to the desired type e.g: Integer to String.
- dump/1 - performs the "processing" on the raw data before it gets "dumped" into the Ecto Native Type.
- load/1 - called when loading data from the database and receives an Ecto native type.
- embed_as/1 - the return value (:self or :dump) determines how the type is treated inside embeds (not used here).
- equal?/2 - invoked to determine if changing a type's field value changes the corresponding database record.
Create a file called lib/encryption/hash_field.ex
and add the following:
defmodule Encryption.HashField do
@behaviour Ecto.Type
def type, do: :binary
def cast(value) do
{:ok, to_string(value)}
end
def dump(value) do
{:ok, hash(value)}
end
def load(value) do
{:ok, value}
end
def embed_as(_), do: :self
def equal?(value1, value2), do: value1 == value2
def hash(value) do
:crypto.hash(:sha256, value <> get_salt(value))
end
# Get/use Phoenix secret_key_base as "salt" for one-way hashing Email address
# use the *value* to create a *unique* "salt" for each value that is hashed:
defp get_salt(value) do
secret_key_base =
Application.get_env(:encryption, EncryptionWeb.Endpoint)[:secret_key_base]
:crypto.hash(:sha256, value <> secret_key_base)
end
end
Let's step through each of these
The best data type for storing encrypted data is :binary
(it uses half the "space" of a :string
for the same ciphertext).
Cast any data type to_string
before encrypting it.
(the encrypted data "ciphertext" will be of :binary
type)
The hash/1 function uses Erlang's crypto library hash/2 function.
- First we tell the hash/2 function that we want to use :sha256. "SHA 256" is the most widely used/recommended hash; it's both fast and "secure".
- We then concatenate the value passed in to the hash/1 function (we defined) with a "salt" and hash the result. The salt comes from the get_salt/1 function, which retrieves the secret_key_base from the application config and computes a unique "salt" using the value.
We use the SHA256
one-way hash for speed.
We "salt" the email address so that
the hash has some level of "obfuscation",
in case the DB is ever "compromised"
the "attacker" still has to "compute"
a "rainbow table" from scratch.
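The properties described here can be checked with a small standalone sketch. SaltedHashDemo is a hypothetical module that mirrors hash/1 and get_salt/1, except that the secret is passed in as an argument; both secret strings below are made-up values.

```elixir
defmodule SaltedHashDemo do
  # Same construction as HashField.hash/1: sha256(value <> salt)
  def hash(value, secret_key_base) do
    :crypto.hash(:sha256, value <> salt(value, secret_key_base))
  end

  # The salt is derived from the value and the secret: sha256(value <> secret)
  defp salt(value, secret_key_base) do
    :crypto.hash(:sha256, value <> secret_key_base)
  end
end

email = "alex@example.com"
secret = "hypothetical-secret-key-base" # stand-in for the real SECRET_KEY_BASE

salted_1 = SaltedHashDemo.hash(email, secret)
salted_2 = SaltedHashDemo.hash(email, secret)
other_secret = SaltedHashDemo.hash(email, "a-different-secret")
unsalted = :crypto.hash(:sha256, email)

IO.inspect(salted_1 == salted_2, label: "deterministic (usable for lookups)")
IO.inspect(salted_1 != unsalted, label: "differs from a plain SHA256 of the email")
IO.inspect(salted_1 != other_secret, label: "depends on the secret")
```

The hash stays deterministic, so it still works for lookups, but a precomputed table of plain SHA256 email hashes would not match what is stored.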
Return the hash value as it is read from the database.
This callback is only of importance when the type is part of an embed. It's not used here,
but required for modules adopting the Ecto.Type
behaviour as of Ecto 3.2.
This callback is invoked when we cast changes into a changeset and want to
determine whether the database record needs to be updated. We use a simple
equality comparison (==
) to compare the current value to the requested
update. If both values are equal, there's no need to update the record.
Note: Don't forget to export your
SECRET_KEY_BASE
environment variable (see instructions above)
The full file containing these two functions is: lib/encryption/hash_field.ex
And the tests for the functions are: test/lib/hash_field_test.exs
First add the alias
for HashField
near the top
of the lib/encryption/user.ex
file. e.g:
alias Encryption.HashField
Next, in the lib/encryption/user.ex
file,
update the lines for email_hash
in the users schema
from:
schema "users" do
field :email, :binary
field :email_hash, :binary
field :name, :binary
field :password_hash, :binary
timestamps()
end
To:
schema "users" do
field :email, :binary
field :email_hash, HashField
field :name, :binary
field :password_hash, :binary
timestamps()
end
def changeset(%User{} = user, attrs \\ %{}) do
user
|> cast(attrs, [:name, :email])
|> validate_required([:email])
|> add_email_hash
|> unique_constraint(:email_hash)
end
defp add_email_hash(changeset) do
if Map.has_key?(changeset.changes, :email) do
changeset |> put_change(:email_hash, changeset.changes.email)
else
changeset
end
end
We should test this new functionality. Create the file
test/lib/user_test.exs
and add the following:
defmodule Encryption.UserTest do
use Encryption.DataCase
alias Encryption.User
@valid_attrs %{
name: "Max",
email: "max@example.com",
password: "NoCarbsBeforeMarbs"
}
@invalid_attrs %{}
describe "Verify correct working of hashing" do
setup do
user = Repo.insert!(User.changeset(%User{}, @valid_attrs))
{:ok, user: user, email: @valid_attrs.email}
end
test "inserting a user sets the :email_hash field", %{user: user} do
assert user.email_hash == user.email
end
test ":email_hash field is the encrypted hash of the email", %{user: user} do
user_from_db = User |> Repo.one()
assert user_from_db.email_hash == Encryption.HashField.hash(user.email)
end
end
end
For the full user tests please see:
test/user/user_test.exs
When hashing passwords, we want to use the strongest hashing algorithm
and we also want the hashed value (or "digest") to be different
each time the same plaintext
is hashed
(unlike when hashing the email address
where we want a deterministic digest).
Using argon2
makes "cracking" a password
(in the event of the database being "compromised")
far less likely as it uses both a CPU-bound "work-factor"
and a "Memory-hard" algorithm which will significantly
"slow down" the attacker.
In order to use argon2
we must add it to our mix.exs
file:
in the defp deps do
(dependencies) section, add the following line:
{:argon2_elixir, "~> 1.3"}, # securely hashing & verifying passwords
You will need to run mix deps.get
to install the dependency.
Create a file called lib/encryption/password_field.ex
in your project.
The first function we need is hash_password/1
:
defmodule Encryption.PasswordField do
def hash_password(value) do
Argon2.Base.hash_password(to_string(value),
Argon2.Base.gen_salt(), [{:argon2_type, 2}])
end
end
hash_password/1
accepts a password to be hashed and invokes
Argon2.Base.hash_password/3
passing in 3 arguments:
- value - the value (password) to be hashed.
- Argon2.Base.gen_salt/1 - the salt used to initialise the hash function. Note: "behind the scenes" this is just :crypto.strong_rand_bytes(16) as we saw before in the encrypt function; again, 128 bits is considered "secure" as a hash salt or initialization vector.
- [{:argon2_type, 2}] - this corresponds to argon2id
see:
In order to test the PasswordField.hash_password/1
function
we use the Argon2.verify_pass
function to verify a password hash.
Create a file called test/lib/password_field_test.exs
and copy-paste (or hand-type) the following test:
defmodule Encryption.PasswordFieldTest do
use ExUnit.Case
alias Encryption.PasswordField, as: Field
test ".verify_password checks the password against the Argon2id Hash" do
password = "EverythingisAwesome"
hash = Field.hash_password(password)
verified = Argon2.verify_pass(password, hash)
assert verified
end
end
Run the test using the command:
mix test test/lib/password_field_test.exs
The test should pass; if not, please re-trace the steps.
The corresponding function to check (or "verify")
the password is verify_password/2
.
We need to supply both the password
and stored_hash
(the hash that was previously stored in the database
when the person registered or updated their password)
It then runs Argon2.verify_pass
which does the checking.
def verify_password(password, stored_hash) do
Argon2.verify_pass(password, stored_hash)
end
The hash_password/1 and verify_password/2 functions are defined in:
lib/encryption/password_field.ex
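The hash-then-verify flow above can be sketched without installing argon2_elixir. The block below is only an illustration under stated assumptions: SaltedHashSketch is a hypothetical module name, and it swaps Argon2id for PBKDF2-HMAC-SHA256 from Erlang's :crypto (assuming OTP 24.2 or later) purely so that it runs with no dependencies. The iteration count and the "salt followed by hash" layout are arbitrary choices for the sketch, and it is not a replacement for Encryption.PasswordField.

```elixir
defmodule SaltedHashSketch do
  # Hypothetical stand-in for Encryption.PasswordField.
  # PBKDF2 is used here only to keep the sketch dependency-free; it is NOT Argon2id.
  @iterations 10_000
  @key_length 32

  # Generate a fresh 16-byte (128-bit) salt and store it in front of the derived hash.
  def hash_password(password) do
    salt = :crypto.strong_rand_bytes(16)
    salt <> derive(password, salt)
  end

  # Split the stored value back into salt and hash, then re-derive and compare.
  # A plain == is enough for a sketch; it is not a constant-time comparison.
  def verify_password(password, <<salt::binary-size(16), stored::binary>>) do
    derive(password, salt) == stored
  end

  defp derive(password, salt) do
    :crypto.pbkdf2_hmac(:sha256, to_string(password), salt, @iterations, @key_length)
  end
end

hash = SaltedHashSketch.hash_password("EverythingisAwesome")
IO.inspect(SaltedHashSketch.verify_password("EverythingisAwesome", hash), label: "correct password")
IO.inspect(SaltedHashSketch.verify_password("LordBusiness", hash), label: "wrong password")
```

Because a fresh salt is generated on every call, hashing the same password twice gives two different values. That is why verification re-derives the hash using the stored salt instead of comparing two freshly computed hashes.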
To test that our verify_password/2
function works as expected,
open the file: test/lib/password_field_test.exs
and add the following code to it:
test ".verify_password fails if password does NOT match hash" do
password = "EverythingisAwesome"
hash = Field.hash_password(password)
verified = Field.verify_password("LordBusiness", hash)
assert !verified
end
Run the tests: mix test test/lib/password_field_test.exs
and confirm they pass.
If you get stuck, see:
test/lib/password_field_test.exs
Define the other Ecto.Type behaviour functions:
defmodule Encryption.PasswordField do
@behaviour Ecto.Type
def type, do: :binary
def cast(value) do
{:ok, to_string(value)}
end
def dump(value) do
{:ok, hash_password(value)}
end
def load(value) do
{:ok, value}
end
def embed_as(_), do: :self
def equal?(value1, value2), do: value1 == value2
def hash_password(value) do
Argon2.Base.hash_password(to_string(value),
Argon2.Base.gen_salt(), [{:argon2_type, 2}])
end
def verify_password(password, stored_hash) do
Argon2.verify_pass(password, stored_hash)
end
end
alias Encryption.{HashField, PasswordField, User}
Update the line for :password_hash
in the schema
from:
schema "users" do
field :email, :binary
field :email_hash, HashField
field :name, :binary
field :password_hash, :binary
timestamps()
end
To:
schema "users" do
field :email, :binary
field :email_hash, HashField
field :name, :binary
field :password_hash, PasswordField
timestamps()
end
Create a file called lib/encryption/encrypted_field.ex
and add the following:
defmodule Encryption.EncryptedField do
alias Encryption.AES # alias our AES encrypt & decrypt functions (3.1 & 3.2)
@behaviour Ecto.Type # Check this module conforms to the Ecto.Type behaviour.
def type, do: :binary # :binary is the data type ecto uses internally
# cast/1 simply calls to_string on the value and returns a "success" tuple
def cast(value) do
{:ok, to_string(value)}
end
# dump/1 is called when the field value is about to be written to the database
def dump(value) do
ciphertext = value |> to_string |> AES.encrypt
{:ok, ciphertext} # ciphertext is :binary data
end
# load/1 is called when the field is loaded from the database
def load(value) do
{:ok, AES.decrypt(value)} # decrypted data is :string type.
end
# embed_as/1 dictates how the type behaves when embedded (:self or :dump)
def embed_as(_), do: :self # preserve the type's higher level representation
# equal?/2 is called to determine if two field values are semantically equal
def equal?(value1, value2), do: value1 == value2
end
Let's step through each of these:
- type/0: the best data type for storing encrypted data is :binary (it uses half the "space" of a :string for the same ciphertext).
- cast/1: cast any data type to_string before encrypting it. (The encrypted data "ciphertext" will be of :binary type.)
- dump/1: calls the AES.encrypt/1 function we defined in section 3.1 (above) so data is encrypted 'automatically' before we insert into the database.
- load/1: calls the AES.decrypt/1 function so data is 'automatically' decrypted when it is read from the database. Note: the load/2 function is not required for Ecto Type compliance. Further reading: https://hexdocs.pm/ecto/Ecto.Type.html
- embed_as/1: this callback is only of importance when the type is part of an embed. It's not used here, but required for modules adopting the Ecto.Type behaviour as of Ecto 3.2.
- equal?/2: this callback is invoked when we cast changes into a changeset and want to determine whether the database record needs to be updated. We use a simple equality comparison (==) to compare the current value to the requested update. If both values are equal, there's no need to update the record.
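The dump/load round trip described above can be tried in isolation. The following is a minimal sketch, not the tutorial's Encryption.EncryptedField or Encryption.AES modules: FieldSketch is a hypothetical name, it does not depend on Ecto, it takes the key as an argument instead of reading it from configuration, and it assumes :crypto.crypto_one_time_aead is available (OTP 22 or later). The "iv, then tag, then ciphertext" layout and the @aad value are assumptions made for the example.

```elixir
defmodule FieldSketch do
  # Hypothetical stand-in that mimics the shape of dump/1 and load/1.
  # The key is passed explicitly so the sketch is self-contained.
  @aad "AES256GCM"

  # Encrypt before "writing": returns {:ok, binary} like an Ecto.Type dump.
  def dump(value, key) do
    iv = :crypto.strong_rand_bytes(16)

    {ciphertext, tag} =
      :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, to_string(value), @aad, true)

    {:ok, iv <> tag <> ciphertext}
  end

  # Decrypt after "reading": returns {:ok, plaintext} or :error if authentication fails.
  def load(<<iv::binary-size(16), tag::binary-size(16), ciphertext::binary>>, key) do
    case :crypto.crypto_one_time_aead(:aes_256_gcm, key, iv, ciphertext, @aad, tag, false) do
      :error -> :error
      plaintext -> {:ok, plaintext}
    end
  end
end

key = :crypto.strong_rand_bytes(32)
{:ok, stored} = FieldSketch.dump("alex@example.com", key)
IO.inspect(FieldSketch.load(stored, key), label: "loaded")
```

Since a new random IV is used for every call, dumping the same value twice produces different binaries. This is the reason the schema keeps a separate email_hash field for lookups.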
Your encrypted_field.ex
Custom Ecto Type should look like this:
lib/encryption/encrypted_field.ex
Try to write the tests for the callback functions.
If you get "stuck", take a look at:
test/lib/encrypted_field_test.exs
Now that we have defined a Custom Ecto Type EncryptedField,
we can use the Type in our User Schema.
Add the following line to "alias" the Type and a User
in the lib/encryption/user.ex
file:
alias Encryption.{HashField, PasswordField, EncryptedField, User}
Update the lines for :email
and :name
in the schema
from:
schema "users" do
field :email, :binary
field :email_hash, HashField
field :name, :binary
field :password_hash, PasswordField
timestamps()
end
To:
schema "users" do
field :email, EncryptedField
field :email_hash, HashField
field :name, EncryptedField
field :password_hash, PasswordField
timestamps()
end
Typically we will create a git commit
(if we don't already have one)
for the "known state" where the tests were passing
(before starting the refactor).
The commit before refactoring the example is: https://github.com/dwyl/phoenix-ecto-encryption-example/tree/3659399ec32ca4f07f45d0552b9cf25c359a2456
The corresponding Travis-CI build for this commit is: https://travis-ci.org/dwyl/phoenix-ecto-encryption-example/jobs/379887597#L833
Note: if you are new to Travis-CI see: https://github.com/dwyl/learn-travis
We have gone through how to create custom Ecto Types in order to define our own functions for handling (transforming) specific types of data.
Our hope is that you have understood the flow.
We plan to extend this tutorial to include a User Interface; please "star" the repo if you would find that useful.
Encryption keys should be the appropriate length (in bits) as required by the chosen algorithm.
An AES 128-bit key can be expressed as a hexadecimal string with 32 characters.
It will require 24 characters in base64.
An AES 256-bit key can be expressed as a hexadecimal string with 64 characters.
It will require 44 characters in base64.
see: https://security.stackexchange.com/a/45334/117318
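These character counts can be checked directly. The snippet below is a small sketch (KeyLengthSketch is a hypothetical module name) that generates a random key of a given size and measures its hexadecimal and base64 encodings. Hexadecimal uses two characters per byte, while base64 uses four characters for every three bytes, padded up to a multiple of four.

```elixir
defmodule KeyLengthSketch do
  # Returns {hex_characters, base64_characters} for a random key of `bytes` bytes.
  def encoded_lengths(bytes) do
    key = :crypto.strong_rand_bytes(bytes)
    {String.length(Base.encode16(key)), String.length(Base.encode64(key))}
  end
end

# 16 bytes = 128 bits, 32 bytes = 256 bits
IO.inspect(KeyLengthSketch.encoded_lengths(16), label: "128-bit key")
IO.inspect(KeyLengthSketch.encoded_lengths(32), label: "256-bit key")
```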
Open iex
in your Terminal and paste the following line (then press enter)
:crypto.strong_rand_bytes(32) |> :base64.encode
You should see terminal output similar to the following:
We generated 3 keys for demonstration purposes:
- "h6pUk0ZccS0pYsibHZZ4Cd+PRO339rMA7sMz7FnmcGs="
- "nMd/yQpR0aoasLaq1g94FL/a+A+wB44JLko47sVQXMg="
- "L+ZVX8iheoqgqb22mUpATmMDsvVGt/foAe/0KN5uWf0="
These two Erlang functions are described in:
- https://erlang.org/doc/man/crypto.html#strong_rand_bytes-1
- https://erlang.org/doc/man/base64.html#encode-1
Base64 encoding the bytes generated by strong_rand_bytes
will make the output human-readable
(whereas bytes are less user-friendly).
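Before a base64 key can be used for encryption it has to be decoded back into raw bytes. The sketch below shows one way to do that and to check the length; KeySketch and decode_key/1 are hypothetical names, and the key is one of the demonstration keys listed above, which should never be used for real data.

```elixir
defmodule KeySketch do
  # Hypothetical helper: turn a base64 key string back into raw key bytes.
  def decode_key(encoded) when is_binary(encoded) do
    key = Base.decode64!(encoded)

    # AES-256 needs exactly 32 bytes (256 bits) of key material.
    if byte_size(key) == 32 do
      {:ok, key}
    else
      {:error, :invalid_key_length}
    end
  end
end

# One of the demonstration keys from this tutorial (do not reuse it).
{:ok, key} = KeySketch.decode_key("h6pUk0ZccS0pYsibHZZ4Cd+PRO339rMA7sMz7FnmcGs=")
IO.inspect(byte_size(key), label: "key bytes")
```

Checking the length at this point turns a badly copied key into an explicit error instead of a harder-to-read failure inside the encryption call.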
- Bits and Bytes: https://web.stanford.edu/class/cs101/bits-bytes.html
- Thinking in Ecto - Schemas and Changesets: https://cultofmetatron.io/2017/04/22/thinking-in-ecto---schemas-and-changesets/
- Initialization Vector Length: https://stackoverflow.com/questions/4608489/how-to-pick-an-appropriate-iv-initialization-vector-for-aes-ctr-nopadding (128 bits is 16 bytes).
- What is the effect of the different AES key lengths? https://crypto.stackexchange.com/questions/3615/what-is-the-effect-of-the-different-aes-key-lengths
- How is decryption done in AES CTR mode?: https://crypto.stackexchange.com/questions/34918/how-is-decryption-done-in-aes-ctr-mode
- Block Cipher Counter (CTR) Mode: https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Counter_.28CTR.29
- Is AES-256 weaker than 192 and 128 bit versions? https://crypto.stackexchange.com/questions/5118/is-aes-256-weaker-than-192-and-128-bit-versions
- What are the practical differences between 256-bit, 192-bit, and 128-bit AES encryption? https://crypto.stackexchange.com/questions/20/what-are-the-practical-differences-between-256-bit-192-bit-and-128-bit-aes-enc
- How to Choose an Authenticated Encryption mode (by Matthew Green cryptography professor at Johns Hopkins University): https://blog.cryptographyengineering.com/2012/05/19/how-to-choose-authenticated-encryption
- How to choose an AES encryption mode (CBC ECB CTR OCB CFB)? (v. long answers, but good comparison!) https://stackoverflow.com/questions/1220751/how-to-choose-an-aes-encryption-mode-cbc-ecb-ctr-ocb-cfb
- AES GCM vs CTR+HMAC tradeoffs: https://crypto.stackexchange.com/questions/14747/gcm-vs-ctrhmac-tradeoffs
- Galois/Counter Mode for symmetric key cryptographic block ciphers: https://en.wikipedia.org/wiki/Galois/Counter_Mode
- What is the difference between CBC and GCM mode? https://crypto.stackexchange.com/questions/2310/what-is-the-difference-between-cbc-and-gcm-mode
- Ciphertext and tag size and IV transmission with AES in GCM mode: https://crypto.stackexchange.com/questions/26783/ciphertext-and-tag-size-and-iv-transmission-with-aes-in-gcm-mode
- How long (in letters) are encryption keys for AES? https://security.stackexchange.com/questions/45318/how-long-in-letters-are-encryption-keys-for-aes
- Why we can't implement AES 512 key size? https://crypto.stackexchange.com/questions/20253/why-we-cant-implement-aes-512-key-size
- Generate random alphanumeric string (used for AES keys) https://stackoverflow.com/questions/12788799/how-to-generate-a-random-alphanumeric-string-with-erlang
- Singular or Plural controller names?: https://stackoverflow.com/questions/35882394/phoenix-controllers-singular-or-plural
- What's the purpose of key-rotation? https://crypto.stackexchange.com/questions/41796/whats-the-purpose-of-key-rotation
- Postgres Data Type for storing bcrypt hashed passwords: https://stackoverflow.com/questions/33944199/bcrypt-and-postgresql-what-data-type-should-be-used >> bytea (byte)
- Do security experts recommend bcrypt? https://security.stackexchange.com/questions/4781/do-any-security-experts-recommend-bcrypt-for-password-storage/6415#6415
- Hacker News discussion thread "Don't use bcrypt": https://news.ycombinator.com/item?id=3724560
- Storing Passwords in a Highly Parallelized World: https://hynek.me/articles/storing-passwords
- Password hashing security of argon2 versus bcrypt/PBKDF2? https://crypto.stackexchange.com/questions/30785/password-hashing-security-of-argon2-versus-bcrypt-pbkdf2
- The memory-hard Argon2 password hash function (ietf proposal): https://tools.ietf.org/id/draft-irtf-cfrg-argon2-03.html unlikely to be a "standard" any time soon...
- Erlang Dirty Scheduler Overhead: https://medium.com/@jlouis666/erlang-dirty-scheduler-overhead-6e1219dcc7
- Erlang Scheduler Details and Why They Matter: https://news.ycombinator.com/item?id=11064763
- Why use argon2i or argon2d if argon2id exists? https://crypto.stackexchange.com/questions/48935/why-use-argon2i-or-argon2d-if-argon2id-exists
- Good explanation of Custom Ecto Types: https://medium.com/acutario/ecto-custom-types-a-practical-case-with-enumerize-rails-gem-b5496c2912ac
- Consider using ETS to store encryption/decryption keys: https://elixir-lang.org/getting-started/mix-otp/ets.html & https://elixirschool.com/en/lessons/specifics/ets
If you prefer to read, Ryo Nakao wrote an excellent post on understanding how AES encryption works: https://nakabonne.dev/posts/understanding-how-aes-encryption-works/
If you have the bandwidth and prefer a video, Computerphile (YouTube channel) has a great explanation:
To run a single test (e.g: while debugging), use the following syntax:
mix test test/user/user_test.exs:9
For more detail, please see: https://hexdocs.pm/phoenix/testing.html
When Ecto changeset
validation fails,
for example if there is a "unique" constraint on email address
(so that people cannot re-register with the same email address twice),
Ecto returns the changeset
with an errors
key:
#Ecto.Changeset<
action: :insert,
changes: %{
email: <<224, 124, 228, 125, 105, 102, 38, 170, 15, 199, 228, 198, 245, 189,
82, 193, 164, 14, 182, 8, 189, 19, 231, 49, 80, 223, 84, 143, 232, 92, 96,
156, 100, 4, 7, 162, 26, 2, 121, 32, 187, 65, 254, 50, 253, 101, 202>>,
email_hash: <<21, 173, 0, 16, 69, 67, 184, 120, 1, 57, 56, 254, 167, 254,
154, 78, 221, 136, 159, 193, 162, 130, 220, 43, 126, 49, 176, 236, 140,
131, 133, 130>>,
key_id: 1,
name: <<2, 215, 188, 71, 109, 131, 60, 147, 219, 168, 106, 157, 224, 120,
49, 224, 225, 181, 245, 237, 23, 68, 102, 133, 85, 62, 22, 166, 105, 51,
239, 198, 107, 247, 32>>,
password_hash: <<132, 220, 9, 85, 60, 135, 183, 155, 214, 215, 156, 180,
205, 103, 189, 137, 81, 201, 37, 214, 154, 204, 185, 253, 144, 74, 222,
80, 158, 33, 173, 254>>
},
errors: [email_hash: {"has already been taken", []}],
data: #Encryption.User<>,
valid?: false
>
The errors
part is:
[email_hash: {"has already been taken", []}]
A tuple wrapped in a keyword list.
Why this construct? A changeset can have multiple errors, so they're stored as a keyword list, where the key is the field, and the value is the error tuple.
The first item in the tuple is the error message, and the second is another keyword list, with additional information that we would use when mapping over the errors in order to make them more user-friendly (though here, it's empty).
See the Ecto docs for add_error/4
and traverse_errors/2
for more information.
So to access the error message "has already been taken"
we need some pattern-matching and list popping:
{:error, changeset} = Repo.insert User.changeset(%User{}, @valid_attrs)
{:ok, message} = Keyword.fetch(changeset.errors, :email_hash)
msg = List.first(Tuple.to_list(message))
assert "has already been taken" == msg
To see this in action run:
mix test test/user/user_test.exs:40
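The same lookup can be written with pattern matching on the tuple, which avoids converting it to a list. This is a small sketch on plain data, so it runs without Ecto; ErrorSketch and message_for/2 are hypothetical names, and the errors list is copied from the changeset output shown above.

```elixir
defmodule ErrorSketch do
  # Returns the error message for `field`, or nil when the field has no error.
  def message_for(errors, field) do
    case Keyword.fetch(errors, field) do
      {:ok, {message, _opts}} -> message
      :error -> nil
    end
  end
end

# Same shape as changeset.errors in the example above.
errors = [email_hash: {"has already been taken", []}]
IO.inspect(ErrorSketch.message_for(errors, :email_hash), label: "email_hash")
IO.inspect(ErrorSketch.message_for(errors, :name), label: "name")
```

Matching on {message, _opts} makes the two-element shape of the error tuple explicit, and returning nil for a missing field avoids a match error when the changeset has no error for that key.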
If you get "stuck", please open an issue on GitHub: https://github.com/nelsonic/phoenix-ecto-encryption-example/issues describing the issue you are facing with as much detail as you can.
Inspiration/credit/thanks for this example goes to Daniel Berkompas
@danielberkompas
for his post:
https://blog.danielberkompas.com/2015/07/03/encrypting-data-with-ecto
Daniel's post is for
Phoenix v0.14.0
which is quite "old" now ...
therefore a few changes/updates are required.
e.g: There are no more "Models" in Phoenix 1.3 or Ecto callbacks.
Also his post only includes the "sample code"
and is not a complete example
and does not explain the functions & Custom Ecto Types.
Which means anyone following the post needs
to manually copy-paste the code ...
and "figure out" the "gaps" themselves to make it work.
We prefer to include the complete "end state"
of any tutorial (not just "samples")
so that anyone can git clone
and run
the code locally to fully understand it.
Still, props to Daniel for his post, a good intro to the topic!