Encrypt a string

Sometimes it is useful to encrypt a value in a CommCare form and store the encrypted result. A typical reason is that a value is both a key in a database outside of CommCare, such as an external case ID, and contains personally identifiable information (PII). You would not want to pass the value itself through a third party, but you might be able to pass an encrypted value. For example, you might pass an encrypted case ID to third-party visualization software that is permitted to query your database for approximate location information. The visualization software cannot decrypt the case ID, but your system can decrypt it, look up the record in the database, and return the associated non-PII data.

CommCare implements only shared-secret-key encryption, so this does not hide any information from CommCare itself, which holds a copy of the key. If you need to store data in CommCare that even CommCare cannot decrypt, encrypt the data before it is entered into a form. Please note that this encryption functionality will be available after CommCare's mobile 2.51 release.

Encryption in a form

The CommCare encryption function is called 'encrypt-string' and a call to it looks like this:

encrypt-string('A simple message', 'VP1m9MQs8UZeaa2h+NkNqqbPkxBSFxYQNe9imEWl7tk=', 'AES')

All three arguments are strings. The first string is a message to be encrypted. The second string is a key suitable for the chosen encryption method. The third string is the name of the encryption method. Currently, only one encryption method named 'AES' is implemented, so it will be assumed for the rest of this documentation.

AES-GCM with 256-bit key

AES is an encryption standard that is implemented in many software libraries. It has several settings that affect how encryption and decryption work. For the encrypt-string function, CommCare uses the following settings:

  • encryption uses the GCM mode, so decryption must use the same mode

  • the function requires a 256-bit key, which you should produce with a good random number generator and store so you can decrypt messages later. The key passed to encrypt-string as the second argument should be encoded via Base64, so that it contains no unprintable characters.

  • encrypt-string chooses an initialization vector to improve security of repeated message encodings. The initialization vector is part of the encrypted output because it is needed for decryption.
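
As a quick sanity check on the key requirements above, the following sketch decodes a Base64 key and confirms that it is exactly 32 bytes (256 bits) long. The helper name check_key is hypothetical and is not part of CommCare; it uses only the Python standard library.

```python
import base64
import binascii


def check_key(encoded_key: str) -> bytes:
    """Decode a Base64 key and confirm it holds exactly 256 bits.

    Hypothetical helper for illustration; not part of CommCare.
    """
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        key_bytes = base64.b64decode(encoded_key, validate=True)
    except binascii.Error as err:
        raise ValueError("key is not valid Base64") from err
    if len(key_bytes) != 32:
        raise ValueError(f"expected 32 key bytes, got {len(key_bytes)}")
    return key_bytes


# The sample key used in this documentation decodes to 32 bytes.
key_bytes = check_key("VP1m9MQs8UZeaa2h+NkNqqbPkxBSFxYQNe9imEWl7tk=")
```

Running a check like this before pasting a key into a form can catch a truncated or mistyped key early.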

Example

We will walk through an example of using encrypt-string on a message and decrypting it. In this example, we show code that you would run outside of CommCare using the Python language, but you can use any language with an implementation of AES.

First, produce a 256-bit AES key

You will need to generate a key that is shared between your system and CommCare. Treat this key like a password because anyone who knows it can read the encrypted values.

The following Python code produces a 256-bit key, encoded in Base64, which is suitable for encrypt-string. There are many ways to produce a random 256-bit value, but the source must be cryptographically secure. The following code uses Python's secrets module, which draws 32 random bytes from the operating system's secure random number generator, and then Base64 encodes them. Do not use the general-purpose random module for keys, because its output is predictable.

import base64
import secrets

key_bytes = secrets.token_bytes(32)
encoded_key = base64.b64encode(key_bytes)

The value of encoded_key is a 'bytes' object with a value like b'VP1m9MQs8UZeaa2h+NkNqqbPkxBSFxYQNe9imEWl7tk='. Copy that value (without the initial b) and use it as the second argument to encrypt-string.

Second, encrypt a value in your form

The actual encryption will happen within your CommCare application. Within a form, you will compute a value using encrypt-string in a calculation like this:

encrypt-string(#form/input_data, #form/key_data, 'AES')

Here is a sample form in XML that calls encrypt-string and displays the result. You can upload the form to an application to try it yourself.

[Screenshot: the sample form in CommCare Web Apps]

The result of encrypt-string is a Base64-encoded string that represents a sequence of N bytes. Your form will probably just store the whole string, but decrypting it requires knowing the layout of the bytes, which is as follows:

  • Byte 0: the first byte is the length of the initialization vector. Call this IV_LEN, which will be between 1 and 255.

  • Bytes 1 to IV_LEN: the initialization vector, which will be used in the decryption code.

  • Bytes IV_LEN + 1 to N - 17: the encrypted message bytes.

  • Bytes N - 16 to N - 1: the last 16 bytes are an AES-GCM message authentication tag.

Different implementations of AES use different default initialization vector lengths, so it is necessary to encode the length as part of the result. Typical vectors are either 12 or 16 bytes long.
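
The byte layout above can be sketched as a small parser. This is a minimal illustration using only the Python standard library; the names split_encrypted_output, EncryptedParts and TAG_LEN are hypothetical and are not part of CommCare. It only separates the parts and performs no decryption.

```python
import base64
from typing import NamedTuple

TAG_LEN = 16  # length of the AES-GCM authentication tag, per the layout above


class EncryptedParts(NamedTuple):
    iv: bytes
    ciphertext: bytes
    tag: bytes


def split_encrypted_output(output_b64: str) -> EncryptedParts:
    """Split encrypt-string output into IV, ciphertext and tag.

    Hypothetical helper for illustration; not part of CommCare.
    """
    data = base64.b64decode(output_b64)
    if not data:
        raise ValueError("output is empty")
    iv_len = data[0]
    # Need the length byte, the IV, and at least the 16-byte tag.
    if iv_len == 0 or len(data) < 1 + iv_len + TAG_LEN:
        raise ValueError("output does not match the expected layout")
    iv = data[1:1 + iv_len]
    ciphertext = data[1 + iv_len:len(data) - TAG_LEN]
    tag = data[len(data) - TAG_LEN:]
    return EncryptedParts(iv, ciphertext, tag)


# The sample output used in the decryption example in this documentation.
parts = split_encrypted_output(
    "DKXDL4xxJ7FLDDJ0WkgnKOmZQN0zF+YPBmZfKjH55Cx/B23f22xbsPK7KYwa"
)
print(len(parts.iv), len(parts.ciphertext), len(parts.tag))
```

Reading the IV length from the first byte, rather than assuming 12 or 16, keeps the same parser usable for values produced by different CommCare platforms.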

Third, decrypt a value outside of CommCare

The software that decrypts values will run in your own system outside of CommCare. For example, it might run on a server that accesses your database in response to requests from a third party. That software will use the shared key to decrypt the encrypted value and recover the original. The following Python code demonstrates the steps required.

import base64
# Using the PyCryptodome library, a maintained successor to pycrypto
# that supports GCM mode.
from Crypto.Cipher import AES
 
# Real code would read the secret key from some safe storage location.
key = "VP1m9MQs8UZeaa2h+NkNqqbPkxBSFxYQNe9imEWl7tk="
key_bytes = base64.b64decode(key)
 
# The output of CommCare's encrypt-string function, in Base64 encoding.
output = "DKXDL4xxJ7FLDDJ0WkgnKOmZQN0zF+YPBmZfKjH55Cx/B23f22xbsPK7KYwa"
output_bytes = base64.b64decode(output)
 
# First byte is the length of the initialization vector.
iv_len = output_bytes[0]
 
# Next iv_len bytes are the initialization vector.
iv = output_bytes[1:iv_len+1]
 
# The rest of the bytes are the encrypted message, followed by 16 bytes
# of message authentication tag. This example ignores the tag, so it does
# not detect tampering; production code should verify the tag as well.
encrypted_message = output_bytes[(iv_len+1):-16]
 
# Initialize the decrypter with the key, the mode and the initialization vector.
cipher = AES.new(key_bytes, AES.MODE_GCM, nonce=iv)
 
# Decrypt the message received.
decrypted_bytes = cipher.decrypt(encrypted_message)
 
# Interpret the raw decrypted bytes as the type you expect.
print(decrypted_bytes.decode("utf-8"))



When using the encrypt-string function, it is important to test your decryption code with real values encrypted by CommCare. If you use both the CommCare mobile application and Web Apps, test values from each, because the cryptography implementations used in Android and CommCare Web Apps are different.
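
One way to organize such testing is a small harness that runs your decryption routine over values collected from each platform and reports which ones fail. The sketch below is only an illustration: failing_samples and the Sample tuple are hypothetical names, and decrypt stands for whatever decryption function your own system provides.

```python
from typing import Callable, Iterable, List, Tuple

# (label, encrypt-string output, expected plaintext)
Sample = Tuple[str, str, str]


def failing_samples(
    decrypt: Callable[[str], str], samples: Iterable[Sample]
) -> List[str]:
    """Return the labels of samples that do not decrypt to the expected text.

    Hypothetical helper for illustration; not part of CommCare.
    """
    failures = []
    for label, encrypted, expected in samples:
        try:
            ok = decrypt(encrypted) == expected
        except Exception:
            # Any decryption error counts as a failure for that sample.
            ok = False
        if not ok:
            failures.append(label)
    return failures
```

To use it, collect real encrypt-string outputs from a form on Android and from the same form in Web Apps, label each one with its platform, and check that the returned list is empty.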