De-identify

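To securely encrypt sensitive data such as a Hospital Number (HN) in R, you can use cryptographic libraries like openssl or sodium. Both provide well-tested encryption and decryption functions; the example below uses sodium's symmetric (secret-key) encryption, in which the same key both encrypts and decrypts.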

Using sodium

Steps

  1. Generate a Secret Key: A secret key will be used for encryption and decryption.
  2. Encrypt the HN: Convert the Hospital Number into a cipher text (encrypted form).
  3. Decrypt the HN: Retrieve the original Hospital Number with the secret key.

Example

library(sodium)
  1. Generate a secret key (must be stored securely and shared only with authorized users)

# set.seed() is omitted: keygen() draws from libsodium's random source, not R's RNG
key <- sodium::keygen()  # Generates a 32-byte random key
key
 [1] e9 2d 42 ac c4 2f 16 76 b6 8c 0a e2 82 25 54 72 b7 f5 62 e0 62 0c 12 78 e1
[26] 8e f6 a9 d6 46 7f 7b

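Save the key securely, e.g., in an environment variable or encrypted storage. Note that saveRDS() writes the key to disk unencrypted, so the file below must itself be access-restricted and kept out of version control.

saveRDS(key, here::here("data/de-iden/secret_key.rds"))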
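As an alternative to an .rds file, the key can be kept as a hex string in an environment variable. The following is a minimal sketch, not a definitive setup: the variable name HN_SECRET_KEY and the helper load_key() are hypothetical, and in practice the variable would be set outside the script (for example in .Renviron) rather than with Sys.setenv().

```r
library(sodium)

# Generate a key and expose it as a hex string.
# HN_SECRET_KEY is a hypothetical name; Sys.setenv() is used here only for illustration.
key <- sodium::keygen()
Sys.setenv(HN_SECRET_KEY = sodium::bin2hex(key))

# Hypothetical helper: read the key back from the environment and validate it
load_key <- function(var = "HN_SECRET_KEY") {
  hex <- Sys.getenv(var, unset = "")
  if (!nzchar(hex)) stop("Environment variable ", var, " is not set")
  restored <- sodium::hex2bin(hex)  # hex string -> raw bytes
  if (length(restored) != 32) stop("Expected a 32-byte key")
  restored
}

key_restored <- load_key()
identical(key, key_restored)  # the round trip should give back the same bytes
```

With this layout the script never contains the key itself, only the name of the variable that holds it.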
  2. Function to encrypt the Hospital Number
encrypt_hn <- function(hn, key) {
  
  hn_raw <- charToRaw(hn) # Convert HN to raw bytes
  encrypted_hn <- sodium::data_encrypt(hn_raw, key)
  encrypted_hn
  
}
encrypt_hn("123", key = key)
 [1] 63 e8 30 ca f8 36 15 28 74 63 98 93 3e eb ea f7 f9 ce 99
attr(,"nonce")
 [1] de 33 d4 b4 d8 41 5c b2 75 51 b0 37 fe 25 78 7b e0 79 70 a8 fe 70 d2 ad
  3. Function to decrypt the Hospital Number (with permission)
decrypt_hn <- function(encrypted_hn, key) {
  
  decrypted_raw <- sodium::data_decrypt(encrypted_hn, key)
  hn <- rawToChar(decrypted_raw) # Convert decrypted raw bytes back to character string
  hn
}

Example usage:

hn <- "123456"                       # Hospital Number
encrypted_hn <- encrypt_hn(hn, key)     # Encrypt HN
print(encrypted_hn)                     # Display encrypted value (not readable)
 [1] f4 ce eb 8b b2 2a 49 7a ad bf 0f fd 9f 8b 42 56 b5 4b 9a a0 4e 5c
attr(,"nonce")
 [1] ff d1 5d c9 c3 0e 00 e2 18 f1 37 32 77 54 b7 3f 5e b9 22 fb 01 08 8e 55
# Now decrypt (ensure permission and use of the same key)
decrypted_hn <- decrypt_hn(encrypted_hn, key)
print(decrypted_hn)                     # Display original HN
[1] "123456"
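The encrypted value above is a raw vector with the nonce attached as an attribute, which is awkward to store in a data frame column or a CSV file. One possible approach, sketched here under assumptions, is to concatenate the nonce and the ciphertext and hex-encode the result. The helpers pack_hn() and unpack_hn() and the patients data frame are hypothetical and not part of sodium.

```r
library(sodium)

# Hypothetical helper: encrypt an HN and return nonce + ciphertext as one hex string
pack_hn <- function(hn, key) {
  cipher <- sodium::data_encrypt(charToRaw(hn), key)
  nonce <- attr(cipher, "nonce")               # 24-byte random nonce
  sodium::bin2hex(c(nonce, as.vector(cipher))) # c() drops the attribute
}

# Hypothetical helper: split the hex string back into nonce and ciphertext, then decrypt
unpack_hn <- function(token, key) {
  bytes <- sodium::hex2bin(token)
  nonce <- bytes[1:24]
  cipher <- bytes[-(1:24)]
  rawToChar(sodium::data_decrypt(cipher, key, nonce = nonce))
}

key <- sodium::keygen()
patients <- data.frame(hn = c("123456", "654321"), stringsAsFactors = FALSE)

# Replace the plain HN column with encrypted tokens
patients$hn_token <- vapply(patients$hn, pack_hn, character(1),
                            key = key, USE.NAMES = FALSE)
patients$hn <- NULL

# Authorized recovery with the same key
recovered <- vapply(patients$hn_token, unpack_hn, character(1),
                    key = key, USE.NAMES = FALSE)
```

Keeping the nonce alongside the ciphertext is safe because the nonce is not secret; only the key is.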

Key Points:

  • Secret Key: You need to securely store and protect the key (e.g., using an encrypted vault or environment variable). Without the key, the data cannot be decrypted.
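  • Encryption: The data_encrypt function transforms the data into an unreadable format (ciphertext). It generates a random nonce on each call and attaches it to the result as the "nonce" attribute, so the nonce must be kept together with the ciphertext.
  • Decryption: The data_decrypt function reverses the process, recovering the original data, but only with the correct key and the nonce that was used for encryption.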
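Two consequences of these points can be checked directly. The sketch below (variable names are illustrative) encrypts the same HN twice and then tries to decrypt with a different key. Because the nonce is random by default, the two ciphertexts should differ, which also means the encrypted value cannot serve as a stable identifier for matching records.

```r
library(sodium)

key <- sodium::keygen()
hn_raw <- charToRaw("123456")

# Same plaintext, same key, but a fresh random nonce on each call
c1 <- sodium::data_encrypt(hn_raw, key)
c2 <- sodium::data_encrypt(hn_raw, key)
same_ciphertext <- identical(as.vector(c1), as.vector(c2))

# Decrypting with a different key raises an error instead of returning garbage
wrong_key <- sodium::keygen()
wrong_key_result <- tryCatch(
  rawToChar(sodium::data_decrypt(c1, wrong_key)),
  error = function(e) NA_character_
)

# The correct key still recovers the original value
right_key_result <- rawToChar(sodium::data_decrypt(c1, key))
```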

Security Considerations:

  • Key Management: The secret key must be handled carefully, ensuring only authorized users can access it.
  • Permission Controls: Implement proper access control for who can decrypt the data, potentially using access logs or two-factor authentication.
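As a rough illustration of the access-log idea, decryption can be routed through a wrapper that records every attempt. This is only a sketch under stated assumptions: decrypt_hn_logged(), the user names, and the log format are hypothetical, and an in-script allow-list is not a substitute for real access control on the key itself.

```r
library(sodium)

# Hypothetical wrapper: log each decryption attempt, then decrypt only if permitted
decrypt_hn_logged <- function(encrypted_hn, key, user, allowed_users, log_file) {
  permitted <- user %in% allowed_users
  entry <- sprintf("%s\t%s\t%s",
                   format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
                   user,
                   if (permitted) "GRANTED" else "DENIED")
  cat(entry, "\n", sep = "", file = log_file, append = TRUE)
  if (!permitted) stop("User '", user, "' is not authorized to decrypt")
  rawToChar(sodium::data_decrypt(encrypted_hn, key))
}

# Example usage with made-up users and a temporary log file
log_file <- tempfile(fileext = ".log")
key <- sodium::keygen()
encrypted_hn <- sodium::data_encrypt(charToRaw("123456"), key)
allowed <- c("dr_a")

hn <- decrypt_hn_logged(encrypted_hn, key, "dr_a", allowed, log_file)
denied <- tryCatch(
  decrypt_hn_logged(encrypted_hn, key, "intern_b", allowed, log_file),
  error = function(e) conditionMessage(e)
)
log_lines <- readLines(log_file)  # one line per attempt, granted or denied
```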

This approach provides reversible encryption for sensitive data such as Hospital Numbers: the original value can be recovered only by someone who holds the secret key, so protecting the data comes down to controlling access to that key.