Sunday, July 31, 2011

[rdwxvzht] Deleting data out of backups

Consider a web service.  Users store information on it.  The service makes backups to guard against failure.

However, backups make it difficult for users to intentionally, actually delete data.  Even after a user deletes something from their live profile, an adversary can still recover the data from the backups.

Two solutions:

1. Run without local backups.  Provide a mechanism for the user to pull their entire account information and back it up themselves.  Cloud services might be useful.  In the event of some failure, the service appeals to the user to upload a backup copy.  If the user hasn't made a backup, then the account disappears.

Optionally, the account dump is encrypted by the service.  This prevents someone who has hacked an account from quickly downloading everything about it by making a backup.  Facebook, for example, keeps track of everything you click, but does not provide a UI for accessing that information.

But such encryption does make it opaque whether the service actually deleted the data out of the live account.  Maybe not such a good idea.

2. Do make backups, but encrypt backups with the user's public key.  Each time a backup is made, the user is notified of its hash.  In the event of a failure, the user is provided with a blob to decrypt.  The user checks that it is a recent backup (by comparing its hash against the notifications), decrypts it with the private key, and uploads it back to the service.  Once again, the "inner" backup might be encrypted by the service, so there are (at least) two layers of an onion.  Optionally, the user may download these backups.

We wish to avoid an "ancient history" attack where the adversary induces the service to fake a failure and provide a very old backup blob, which the user decrypts, thereby allowing access to deleted data.  This is the reason for notifying the user of the hash every time a backup is made.
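The user-side check might look something like the sketch below.  Everything named here is hypothetical: the BackupLog class, the choice of SHA-256, and the max_age_seconds cutoff are assumptions for illustration, not part of any real service.  The user records each announced hash, and before decrypting a blob refuses it unless its hash was announced recently.

```python
import hashlib
import time


class BackupLog:
    """Hypothetical user-side record of the hashes the service announced."""

    def __init__(self):
        self.entries = []  # (timestamp, hex digest) pairs, oldest first

    def record(self, digest, when=None):
        """Store a hash from a backup notification."""
        stamp = time.time() if when is None else when
        self.entries.append((stamp, digest))

    def is_recent(self, blob, max_age_seconds, now=None):
        """True only if the blob's hash was announced within the cutoff."""
        now = time.time() if now is None else now
        digest = hashlib.sha256(blob).hexdigest()
        # Search newest first, in case the same hash was announced twice.
        for stamp, known in reversed(self.entries):
            if known == digest:
                return now - stamp <= max_age_seconds
        return False  # never announced: refuse to decrypt
```

The check runs on the still-encrypted blob, so an ancient or unknown blob is rejected before the private key ever touches it.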

There still remains a tricky problem where the service claims all recent backups were corrupt (perhaps it forgot to back up some important data), and it really does need to resurrect an ancient copy.  Yet another reason why having the service encrypt the inner backup is looking less and less like a good idea.  Backups should provide a way for a user to migrate to a different service provider, at which point the backup gets tested as to whether it really contained all the necessary information.

There is probably a way for the user to avoid downloading, decrypting, and re-uploading the entire blob, and instead decrypt only a symmetric key.  However, I don't know if we can avoid the ancient history attack.
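One possible shape for this, as a rough sketch only: the service encrypts each backup under a fresh symmetric key, encrypts that key with the user's public key, and throws the symmetric key away.  On failure the user decrypts just the small wrapped key.  The function names are made up, and both ciphers are toy stand-ins chosen so the example runs with nothing but the standard library (textbook RSA over two Mersenne primes, and a SHA-256 keystream XOR).  Neither is secure; a real system would use a vetted crypto library.

```python
import hashlib
import secrets

# Toy textbook RSA key pair built from two Mersenne primes.
# Illustration only, NOT secure.
P = 2**127 - 1
Q = 2**521 - 1
N = P * Q                          # user's public modulus
E = 65537                          # user's public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # user's private exponent (Python 3.8+)


def keystream_xor(key, data):
    """Stand-in symmetric cipher: XOR data with a SHA-256 counter keystream."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)


def service_make_backup(account_data):
    """Service side: encrypt the data, wrap the key, keep no copy of the key."""
    sym_key = secrets.token_bytes(32)
    ciphertext = keystream_xor(sym_key, account_data)
    wrapped_key = pow(int.from_bytes(sym_key, "big"), E, N)
    return ciphertext, wrapped_key


def user_unwrap(wrapped_key):
    """User side: decrypt only the small wrapped key, not the whole blob."""
    return pow(wrapped_key, D, N).to_bytes(32, "big")


def service_restore(ciphertext, sym_key):
    """Service side: recover the data once the user hands back the key."""
    return keystream_xor(sym_key, ciphertext)
```

The user handles a few dozen bytes instead of the whole account dump.  This sketch does nothing about the ancient history attack: the service could just as easily present the wrapped key of a very old backup.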

Despite the title of this post, we're not actually deleting data out of backups, just making old backups inaccessible without the user's consent.
