Every account has a public/private keypair that's generated during setup. The private key is encrypted (with a symmetric key derived from either your password or some other passphrase you know) and stored with the account. Thanks to strong crypto, you're the only one who can get it back out.
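Here's a rough sketch of the "symmetric key based on your password" part, in Python with only the standard library. It assumes PBKDF2 with SHA-256 as the derivation function; the function name, the iteration count, and the salt size are my own illustrative choices, not something the scheme fixes. The derived key would then be used to encrypt the private key before it's stored with the account (that step isn't shown).

```python
import hashlib
import secrets

# Assumed work factor for illustration only; tune it for your own threat model.
PBKDF2_ITERATIONS = 600_000


def derive_wrapping_key(passphrase: str, salt: bytes,
                        iterations: int = PBKDF2_ITERATIONS) -> bytes:
    """Derive a 32-byte symmetric key from a passphrase (hypothetical helper).

    The salt is not secret; it would be stored next to the encrypted
    private key so the same key can be re-derived at login.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


# A fresh random salt is generated once, when the keypair is created.
salt = secrets.token_bytes(16)
wrapping_key = derive_wrapping_key("correct horse battery staple", salt)
```

In a browser the same idea should map onto PBKDF2 via SubtleCrypto's key derivation, but treat this as a sketch rather than a finished design.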
Sending a DM then becomes a multi-step process:
1) Decrypt your key
2) Fetch the public keys of the other parties in the convo
3) Create an ephemeral public/private keypair for the conversation
Reading a message then requires:
1) Using your private key and the ephemeral public key to decrypt your copy of the symmetric key
2) Decrypting the message itself
3) Verifying the signature from the original author.
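To make the moving parts a bit more concrete, here's a sketch of what a message might carry, inferred from the reading steps above. Every name here (`KeyBlock`, `EncryptedDM`, `copy_for`) is hypothetical, and the byte strings are placeholders, not real ciphertext. The assumption is that the author's signature sits inside the ciphertext along with the plaintext (sign-then-encrypt).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class KeyBlock:
    """Per-message key material (hypothetical layout)."""
    ephemeral_public_key: bytes
    # Recipient ID -> that recipient's encrypted copy of the symmetric key.
    wrapped_keys: Dict[str, bytes]

    def copy_for(self, recipient_id: str) -> Optional[bytes]:
        """Return this recipient's wrapped symmetric key, or None if absent."""
        return self.wrapped_keys.get(recipient_id)


@dataclass
class EncryptedDM:
    """An encrypted direct message (hypothetical layout)."""
    # Assumed to cover plaintext + author signature (sign-then-encrypt).
    ciphertext: bytes
    key_block: KeyBlock


# Placeholder values only; a real message would hold actual key material.
dm = EncryptedDM(
    ciphertext=b"<encrypted message and signature>",
    key_block=KeyBlock(
        ephemeral_public_key=b"<ephemeral public key>",
        wrapped_keys={
            "alice@example.social": b"<key copy for alice>",
            "bob@example.social": b"<key copy for bob>",
        },
    ),
)
```

A reader would pull their entry with `dm.key_block.copy_for(...)`, unwrap it using their private key and the ephemeral public key, then decrypt and verify.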
It looks like a lot of steps but, really, this is fairly straightforward to abstract, even in browser-based crypto.
This scheme also has a lot of advantages:
First - public keys are public and tied to each account. Super easy for discovery.
Second - they're built into the account (no manual key management) but still well-protected. You could also change your passphrase or even generate a new keypair if you want.
Third - sign-then-encrypt means an outside observer cannot _prove_ you sent a particular message unless they're party to said message.
Fourth - an outside party can attempt to impersonate you, but message recipients won't be fooled.
By placing the block of keys in an attachment, you don't artificially inflate the size of the message. Could even put the signature on the plaintext as part of the attachment itself (still debating that).
Could further obfuscate the key block by indexing each key based on a _hash_ of the recipient's ID. Recipients will easily find their key but an attacker will have a harder time at it.
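A minimal sketch of that indexing idea, assuming SHA-256 over the recipient's ID as the index (the function names and the choice of hash are mine, for illustration). Note that a plain hash only obscures the ID: anyone who can guess a recipient's ID can compute the same index, so this makes an attacker's job harder, not impossible.

```python
import hashlib
from typing import Dict, Optional


def recipient_index(recipient_id: str) -> str:
    """Index a recipient by the SHA-256 hex digest of their ID."""
    return hashlib.sha256(recipient_id.encode("utf-8")).hexdigest()


def build_key_block(wrapped_keys: Dict[str, bytes]) -> Dict[str, bytes]:
    """Re-key a {recipient ID: wrapped key} map by hashed ID."""
    return {recipient_index(rid): key for rid, key in wrapped_keys.items()}


def find_my_key(key_block: Dict[str, bytes], my_id: str) -> Optional[bytes]:
    """Look up your own wrapped key; returns None if you aren't a recipient."""
    return key_block.get(recipient_index(my_id))


# Placeholder wrapped keys; real entries would be encrypted key material.
block = build_key_block({
    "alice@example.social": b"<key copy for alice>",
    "bob@example.social": b"<key copy for bob>",
})
```

Recipients call `find_my_key(block, their_own_id)`; the raw IDs never appear in the block itself.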
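Ephemeral keys mean each message is independent. Even if you somehow decrypted one message in the convo you won't be able to read any others. (That's close to forward secrecy but not quite the full thing: anyone who gets hold of your long-term private key could still unwrap your copy of every message key.)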
Given the way things are structured, such a scheme could fit into Mastodon today with minimal changes - maybe even as a browser extension to test it out?
It's not _quite_ OTR (but similar). It's also not a double ratchet (Signal) but the ephemeral keys provide similar utility and allow for multiple parties in the convo.
Everything could be implemented using the SubtleCrypto API as it stands today with broad platform/browser support.
The only thing to think through is the threat model.
Browser-based crypto gets a bad rap because of how easily its secrecy can be broken. Every browser could have tens of men in the middle (other browser extensions) eavesdropping on your side of the conversation.
The question is whether this is "good enough" security for this particular use case. I think it could be ...