Goal

Last week I played with the aes-gcm-siv crate in a separate repository. I made a toy that encrypts and decrypts, with interfaces that are more comfortable for me to work with than the ones the library provides. This comfort is important to me: by making it as effortless as possible to do things in a cryptographically correct way, I'm more likely to avoid mistakes that would ruin everything.

Today's goal is to port this code into the Obnam 3 repository and use it to implement chunk encryption. I'll take the existing branch that provides the scaffolding for this and add the toy code.

Plan

  • Make sure the existing branch still builds and its tests pass.
  • Pick a good starting point.
  • Add a new sub-module (obnam::cipher::aead) for the ported code from the toy.
  • Implement the scaffolding (obnam::cipher) to use the new module.
  • Refactor as appropriate.
  • Fix test suite so it passes again. Add missing tests, if any.
  • Use the new cipher module in chunk encryption.

Notes

  • Made tea.

Pick starting point and run test suite

  • Looked at the branch from two weeks ago, where I messed things up. I don't think I want to salvage anything. Some bits are OK, but it's easier to re-implement them than to tease them out from the pile of spaghetti.
  • The branch from the week before is a better starting point. Its tests pass.

Add cipher engines

  • First, define a trait for wrapping encryption implementations. I expect Obnam to need to support several different ones in the long run, and I want them all to have the same interface, to make them more comfortable to use. The API will be inspired by AEAD.
  • I'm calling the trait CipherEngine.
  • Each engine needs to have its own type for key and encrypted data. I don't want to tie myself down by specifying something now and be stuck if it doesn't work in the future.
  • I think I'll want to have each engine instance use a specific key. If another key is needed, a new engine is needed. I want this because based on experience, I expect this will make it more difficult for me to use the wrong key value.
  • I'll also want a type for plain text, to avoid easy mistakes like treating plain text as ciphertext or vice versa.
  • Created obnam::cipher with the trait and a plain text type. The plain text type is the same for all cipher engines. Here's what I ended up with, after some modifications while implementing the AEAD cipher engine, and omitting some helpful From traits for Plaintext:
pub struct Plaintext {
    blob: Vec<u8>,
}

pub trait CipherEngine {
    type Error;
    type Ciphertext;
    fn encrypt(&self, message: &Plaintext, ad: &Plaintext)
        -> Result<Self::Ciphertext, Self::Error>;
}
  • Created obnam::cipher::aead based on the toy. It's not quite an identical copy, so that it fits into the trait better, but all the interesting bits are.
  • Arguably it'd make sense to type associated data separately from message plain text, so I'll do that.
  • I'll add a second cipher engine, as I did in the messed-up branch, so I can force any code that deals with cipher engines to have to deal with more than one.
  • This also gives me a better feeling of whether the cipher engine trait I made is nice. It is, so far.
  • Refactored the code so commits in the branch tell a reasonably clear story.
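
The trait design described in this section might look roughly like the sketch below. It is not the code in the Obnam repository. It assumes a separate AssociatedData type, adds a decrypt method that the quoted trait doesn't show, and uses two made-up toy engines (XorEngine and NullEngine) that offer no security at all. The point is only to show per-engine key and ciphertext types, an engine bound to one key, and generic code that has to cope with more than one engine.

```rust
// Sketch only: XorEngine and NullEngine are hypothetical toy engines that
// provide NO security. They only show the shape of the trait.

#[derive(Debug, PartialEq)]
pub struct Plaintext {
    blob: Vec<u8>,
}

impl From<&[u8]> for Plaintext {
    fn from(bytes: &[u8]) -> Self {
        Self { blob: bytes.to_vec() }
    }
}

// Assumption: a separate type, so associated data can't be passed as a message.
pub struct AssociatedData {
    blob: Vec<u8>,
}

impl From<&[u8]> for AssociatedData {
    fn from(bytes: &[u8]) -> Self {
        Self { blob: bytes.to_vec() }
    }
}

pub trait CipherEngine {
    type Error;
    type Ciphertext;
    fn encrypt(&self, message: &Plaintext, ad: &AssociatedData)
        -> Result<Self::Ciphertext, Self::Error>;
    // Assumption: a matching decrypt; the quoted trait shows only encrypt.
    fn decrypt(&self, ciphertext: &Self::Ciphertext, ad: &AssociatedData)
        -> Result<Plaintext, Self::Error>;
}

#[derive(Debug, PartialEq)]
pub struct AdMismatch;

// Toy engine 1. The key is fixed at creation: another key means another engine.
pub struct XorEngine {
    key: u8,
}

pub struct XorCiphertext {
    blob: Vec<u8>,
    ad: Vec<u8>, // toy stand-in for an authentication tag
}

impl XorEngine {
    pub fn new(key: u8) -> Self {
        Self { key }
    }
}

impl CipherEngine for XorEngine {
    type Error = AdMismatch;
    type Ciphertext = XorCiphertext;
    fn encrypt(&self, message: &Plaintext, ad: &AssociatedData)
        -> Result<XorCiphertext, AdMismatch> {
        let blob: Vec<u8> = message.blob.iter().map(|b| b ^ self.key).collect();
        Ok(XorCiphertext { blob, ad: ad.blob.clone() })
    }
    fn decrypt(&self, ciphertext: &XorCiphertext, ad: &AssociatedData)
        -> Result<Plaintext, AdMismatch> {
        // Refuse to decrypt if the associated data differs from encryption time.
        if ciphertext.ad != ad.blob {
            return Err(AdMismatch);
        }
        let blob: Vec<u8> = ciphertext.blob.iter().map(|b| b ^ self.key).collect();
        Ok(Plaintext { blob })
    }
}

// Toy engine 2: no encryption at all, but different associated types.
pub struct NullEngine;

impl CipherEngine for NullEngine {
    type Error = std::convert::Infallible;
    type Ciphertext = Vec<u8>;
    fn encrypt(&self, message: &Plaintext, _ad: &AssociatedData)
        -> Result<Vec<u8>, Self::Error> {
        Ok(message.blob.clone())
    }
    fn decrypt(&self, ciphertext: &Vec<u8>, _ad: &AssociatedData)
        -> Result<Plaintext, Self::Error> {
        Ok(Plaintext { blob: ciphertext.clone() })
    }
}

// Generic code sees only the trait, so it has to work with any engine.
pub fn roundtrip<E: CipherEngine>(engine: &E, message: &[u8], ad: &[u8])
    -> Result<Plaintext, E::Error> {
    let ad = AssociatedData::from(ad);
    let ciphertext = engine.encrypt(&Plaintext::from(message), &ad)?;
    engine.decrypt(&ciphertext, &ad)
}
```

Calling roundtrip(&XorEngine::new(0x5a), b"hello", b"label") and roundtrip(&NullEngine, b"hello", b"label") goes through the same generic code with two different ciphertext and error types, which is the kind of pressure a second engine puts on the trait.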

Chunk encryption

  • The scaffolding for encrypted chunks doesn't yet do any encryption. In fact, the scaffolding is a little incomplete. For example, the method to encrypt a chunk is called encode, not encrypt.
  • Renamed the methods and updated their documentation to talk about encryption instead of encoding.
  • The methods need to get the key and label to use. And chunk id. The label and id will form the associated data. This should be versioned, too, like chunks. However, I don't expect to be able to use any version of the metadata for any version of a chunk.
  • In fact, while I've been carefully versioning types, I'm starting to feel there's a combinatorial explosion going on. That's also bad for long term maintainability, even if the versioned types are meant to help with exactly that. So for now I'm going to have just one version of chunk metadata. If I ever need to change that, I'll introduce a new type then. I am, however, somewhat confident that id and label are enough for a long time. I don't want more metadata, as all plain text metadata needs to be carefully controlled so it's not leaking information.
  • But I do need to keep id and label separate as they're going to be the two ways of accessing chunks. The id for "I want this specific chunk" and label for "all the chunks that have this label", for de-duplicating file data and chunks that contain something other than file data.
  • Hm, I think I want to not have EncryptedChunk encrypt itself. That seems messy.
  • I've run over my allocated time already. I'll stop here and continue next time.
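
As a rough illustration of the metadata idea above, here is a sketch of a single, unversioned metadata type holding an id and a label, with both fields combined into associated data and both usable for lookup. ChunkId, Label, ChunkMeta, and ChunkStore are hypothetical names, the length-prefixed encoding is an assumption and not Obnam's actual format, and the ciphertext is treated as opaque bytes produced elsewhere by a cipher engine.

```rust
// Hypothetical sketch: all names and the byte encoding are assumptions
// made for illustration, not Obnam's actual types or format.

#[derive(Debug, Clone, PartialEq)]
pub struct ChunkId(String);

#[derive(Debug, Clone, PartialEq)]
pub struct Label(String);

// One unversioned metadata type: just the id and the label.
#[derive(Debug, Clone, PartialEq)]
pub struct ChunkMeta {
    id: ChunkId,
    label: Label,
}

impl ChunkMeta {
    pub fn new(id: &str, label: &str) -> Self {
        Self { id: ChunkId(id.into()), label: Label(label.into()) }
    }

    // Encode id and label as length-prefixed fields, so that
    // ("ab", "c") and ("a", "bc") give different associated data.
    pub fn associated_data(&self) -> Vec<u8> {
        let mut out = Vec::new();
        for field in [self.id.0.as_bytes(), self.label.0.as_bytes()] {
            out.extend_from_slice(&(field.len() as u64).to_be_bytes());
            out.extend_from_slice(field);
        }
        out
    }
}

// A toy in-memory store; each entry pairs plain text metadata with
// opaque ciphertext bytes.
#[derive(Default)]
pub struct ChunkStore {
    chunks: Vec<(ChunkMeta, Vec<u8>)>,
}

impl ChunkStore {
    pub fn insert(&mut self, meta: ChunkMeta, ciphertext: Vec<u8>) {
        self.chunks.push((meta, ciphertext));
    }

    // "I want this specific chunk."
    pub fn get(&self, id: &str) -> Option<&[u8]> {
        self.chunks
            .iter()
            .find(|(meta, _)| meta.id.0 == id)
            .map(|(_, ciphertext)| ciphertext.as_slice())
    }

    // "All the chunks that have this label."
    pub fn find_by_label(&self, label: &str) -> Vec<&ChunkMeta> {
        self.chunks
            .iter()
            .map(|(meta, _)| meta)
            .filter(|meta| meta.label.0 == label)
            .collect()
    }
}
```

Keeping the store and the encoding outside the chunk type is one way to avoid having an encrypted chunk encrypt itself: the chunk only carries metadata and ciphertext, and whatever holds the cipher engine feeds associated_data() to it.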

Summary

I again made some progress. Not a great leap forward, but short steps are a better way to maintain pace. I recognized that I may have complicated my types with excessive versioning and combination, and I'll think about that in the future.

Comments?

If you have feedback on this development session, please use the following fediverse thread: https://toot.liw.fi/@liw/114408706453030619