library(sodium)
De-identify
To securely encrypt sensitive data such as a Hospital Number (HN) in R, you can use cryptographic libraries like openssl or sodium. These libraries provide robust encryption and decryption functions.
Using sodium
Steps
- Generate a Secret Key: A secret key will be used for encryption and decryption.
- Encrypt the HN: Convert the Hospital Number into a cipher text (encrypted form).
- Decrypt the HN: Retrieve the original Hospital Number with the secret key.
Example
- Generate a secret key (must be securely stored and shared)
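# sodium draws the key from libsodium's secure random source, not R's RNG, so set.seed() cannot make it reproducible
key <- sodium::keygen() # Generates a 32-byte random key
key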
[1] e9 2d 42 ac c4 2f 16 76 b6 8c 0a e2 82 25 54 72 b7 f5 62 e0 62 0c 12 78 e1
[26] 8e f6 a9 d6 46 7f 7b
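# Save the key securely, e.g., in a secure environment variable or encrypted storage.
# The RDS file written below is not itself encrypted, so restrict access to it and keep it out of version control.
saveRDS(key, here::here("data/de-iden/secret_key.rds"))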
- Function to encrypt the Hospital Number
encrypt_hn <- function(hn, key) {
  hn_raw <- charToRaw(hn) # Convert HN to raw bytes
  encrypted_hn <- sodium::data_encrypt(hn_raw, key) # Ciphertext with the nonce attached as an attribute
  encrypted_hn
}
encrypt_hn("123", key = key)
[1] 63 e8 30 ca f8 36 15 28 74 63 98 93 3e eb ea f7 f9 ce 99
attr(,"nonce")
[1] de 33 d4 b4 d8 41 5c b2 75 51 b0 37 fe 25 78 7b e0 79 70 a8 fe 70 d2 ad
- Function to decrypt the Hospital Number (with permission)
decrypt_hn <- function(encrypted_hn, key) {
  decrypted_raw <- sodium::data_decrypt(encrypted_hn, key)
  hn <- rawToChar(decrypted_raw) # Convert decrypted raw bytes back to character string
  hn
}
Example usage:
hn <- "123456" # Hospital Number
encrypted_hn <- encrypt_hn(hn, key) # Encrypt HN
print(encrypted_hn) # Display encrypted value (not readable)
[1] f4 ce eb 8b b2 2a 49 7a ad bf 0f fd 9f 8b 42 56 b5 4b 9a a0 4e 5c
attr(,"nonce")
[1] ff d1 5d c9 c3 0e 00 e2 18 f1 37 32 77 54 b7 3f 5e b9 22 fb 01 08 8e 55
# Now decrypt (ensure permission and use of the same key)
decrypted_hn <- decrypt_hn(encrypted_hn, key)
print(decrypted_hn) # Display original HN
[1] "123456"
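The encrypted value is a raw vector with the nonce attached as an attribute, which is awkward to keep in a data frame or CSV column. Below is a minimal sketch of one way around this, assuming it is acceptable to store the nonce and ciphertext together as a single hex string. The helper names `encrypt_hn_hex()` and `decrypt_hn_hex()`, the `patients` data frame, and `demo_key` are hypothetical and exist only for illustration.

```r
library(sodium)

demo_key <- sodium::keygen() # throwaway key for this illustration only

# Hypothetical helper: encrypt one HN and return "nonce:ciphertext" as hex text
encrypt_hn_hex <- function(hn, key) {
  ct <- sodium::data_encrypt(charToRaw(hn), key)
  nonce <- attr(ct, "nonce")
  attr(ct, "nonce") <- NULL # keep only the ciphertext bytes
  paste(sodium::bin2hex(nonce), sodium::bin2hex(ct), sep = ":")
}

# Hypothetical helper: reverse encrypt_hn_hex()
decrypt_hn_hex <- function(token, key) {
  parts <- strsplit(token, ":", fixed = TRUE)[[1]]
  nonce <- sodium::hex2bin(parts[1])
  ct <- sodium::hex2bin(parts[2])
  rawToChar(sodium::data_decrypt(ct, key, nonce = nonce))
}

# Made-up example data
patients <- data.frame(hn = c("123456", "654321"), stringsAsFactors = FALSE)

patients$hn_enc <- vapply(patients$hn, encrypt_hn_hex, character(1),
                          key = demo_key, USE.NAMES = FALSE)
patients$hn_dec <- vapply(patients$hn_enc, decrypt_hn_hex, character(1),
                          key = demo_key, USE.NAMES = FALSE)
```

In a real de-identified dataset only the `hn_enc` column would be kept. The `hn_dec` column is computed here just to show the round trip.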
Key Points:
- Secret Key: You need to securely store and protect the key (e.g., using an encrypted vault or environment variable). Without the key, the data cannot be decrypted.
- Encryption: The data_encrypt function ensures that the data is transformed into an unreadable format (ciphertext).
- Decryption: The data_decrypt function reverses the process, recovering the original data, but only with the correct key.
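Two of these points can be checked directly. The sketch below uses a throwaway `demo_key` (a hypothetical name) to show that `data_encrypt` uses a fresh random nonce on each call by default, so the same HN gives a different ciphertext every time, and that `data_decrypt` stops with an error when given the wrong key.

```r
library(sodium)

demo_key <- sodium::keygen() # throwaway key for this illustration only
msg <- charToRaw("123456")

ct1 <- sodium::data_encrypt(msg, demo_key)
ct2 <- sodium::data_encrypt(msg, demo_key)

# Compare the ciphertext bytes only (as.vector drops the nonce attribute)
same_ciphertext <- identical(as.vector(ct1), as.vector(ct2))

# Both ciphertexts still decrypt to the original bytes
both_recover <- identical(sodium::data_decrypt(ct1, demo_key), msg) &&
  identical(sodium::data_decrypt(ct2, demo_key), msg)

# A different key makes data_decrypt signal an error
wrong_key <- sodium::keygen()
wrong_key_result <- tryCatch(
  sodium::data_decrypt(ct1, wrong_key),
  error = function(e) "decryption failed"
)
```

One consequence is that the encrypted HN cannot serve as a stable identifier for linking records across tables, because the same patient gets a different ciphertext on every call.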
Security Considerations:
- Key Management: The secret key must be handled carefully, ensuring only authorized users can access it.
- Permission Controls: Implement proper access control for who can decrypt the data, potentially using access logs or two-factor authentication.
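As one possible way to keep the key out of the project files, the sketch below reads it from an environment variable. It assumes the key is stored hex-encoded; the variable name `HN_SECRET_KEY` and the helper `load_key()` are hypothetical. The `Sys.setenv()` call is there only to make the example runnable. In practice the variable would be set outside the script, for instance in a user-level `.Renviron` file or by a secrets manager.

```r
library(sodium)

# For illustration only: create a key and place it in the environment as hex
demo_key <- sodium::keygen()
Sys.setenv(HN_SECRET_KEY = sodium::bin2hex(demo_key))

# Hypothetical helper: read and validate the key from an environment variable
load_key <- function(var = "HN_SECRET_KEY") {
  hex <- Sys.getenv(var, unset = "")
  if (!nzchar(hex)) stop("Environment variable ", var, " is not set")
  key <- sodium::hex2bin(hex)
  if (length(key) != 32) stop("Expected a 32-byte key")
  key
}

key_from_env <- load_key()
```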
This approach provides secure encryption and decryption for sensitive data such as Hospital Numbers, making it recoverable only by authorized personnel with the correct permissions.