Home · Posts
Anonymizing Bitcoin
Sept. 10, 2011

Bitcoins are not anonymous, but they can be made more so. All existing "laundries" are ineffective. These are my thoughts on why, and how an ideal anonymity-strengthening relay (ASR) might work. I'd been waiting to post this until I reached more definitive conclusions, but Kaminsky's announcement of BlitCoin made me think it was better to get a discussion started earlier. A more secure Bitcoin is a more useful Bitcoin. There are just a couple ideas here, mostly obvious, taken to their logical conclusion.

The existing so-called "laundry" services work as follows: Coins are sent in, possibly from multiple origin addresses. The laundry mixes them around with other addresses, ideally shared between multiple ongoing launderings, and then send them to the destination address (provided out-of-band) in one or more transactions.

No matter how cleverly a laundry mixes up the coins, though, X coins go in one end and X coins come out the other. This might be obfuscated by using more transactions, using more intermediate addresses, or spreading them out over time. but ultimately a sufficiently motivated adversary can calculate X and identify the corresponding input and output transactions. To the right algorithm, current laundry attempts are nothing more than an inconvenience. Making the intermediate transactions more wild may even be counterproductive, leaving an identifiable pattern in the block chain. Rather than security by obscurity, our ideal ASR would rely on a simple yet rigorous design. Here are three changes that I believe would get us to such a solution:

  1. Multiple outputs in the same amount. Consider two people sending transactions through the ASR, in amounts X and Y such that X > Y. If the ASR forwards to output addresses A, B, and C Bitcoins in the amounts of Y, Y, X-Y, then the addresses A and B and have, to an extent, been anonymized. With existing laundries, there would only be two outputs, which ultimately have balances X and Y, making the transactions traceable. This type of operation can (and should) obviously be generalized to more simultaneous inputs and more output addresses per input. Notice the two things our ASR needs: support for multiple output addresses (ideally, unlimited), and volume. Attaining volume means low or nonexistent fees, and usage for a variety of transactions, maybe even those that don't need security (more on this later).
  2. Outputs of as few amounts as possible. Anonymity is not binary, but a spectrum: the coins from the example above are "2-anonymized", meaning that there are two possible origins they could have come from. The more outputs in the same amount, the better. Therefore, the fewer size-buckets we can put our outputs into, the better. It would be nice to have some defined set of buckets that is small, but can conveniently accommodate the full range of Bitcoin amounts (1 Satoshi to 21M Bitcoin). There's a tradeoff here: we want fewer buckets (for security), but more buckets would allow us to use fewer output addresses per transaction. An exponentially distributed set of buckets (…0.1, 1, 100, 1000…) is an obvious solution. Using base-10 is a good choice because it's secure, easy to understand, and matches the protocol, but base-2 is also reasonable and would require fewer output addresses.
  3. Different levels of anonymity.Different ASR use-cases need different levels of anonymity. Many transactions don't need any at all. By using the ASR anyway, low-security transactions can provide liquidity for high-sec transactions. So ideally, every Bitcoin transaction should be sent through one or more ASRs. Our ideal ASR should let you choose, along with your destination addresses, a desired security level and possibly a time limit. For example, low-security transaction might require the ASR to make it 1-anonymized (i.e. not at all), but offer to let the coins wait for up to an hour before being output, in case they are useful in providing anonymization liquidity to another transaction. A high-security transaction might ask that as many coins as possible be 5-anonymized, with the remainder going to a designated "overflow" address (possibly for later retry) if they can't be secured within the given time.

The ASR should of course have an API allowing users to creating input/output bindings with all the above mentioned parameters, in addition to a time limit for how long the created binding should be retained in memory (for security). Integrating this API into everyday Bitcoin services would improve the security for everyone.

I've avoided the subject of transaction fees and funding the ASR, but these can be addressed pretty trivially. I've also avoided talking about how outputs would be timed, but a batch approach seems safe, while a "randomized stream" approach seems tricky to make truly secure.

There are a number of obstacles to the creation of such an ASR. Trust in the ASR is perhaps the biggest. The ASR would have to be trusted to not steal coins; this could come with time. It would have to be trusted to be secure, and securely erase input-output associations, or else there would be no gain at all in anonymity; this could be addressed by using multiple chains ASRs for a transaction.

Increased transaction times would be one inconvenience. Client support is another, and would probably be critical to get people using long lists of output addresses. (You could enter the amount to get a string combining multiple receipt addresses, and the sending side could support the same packing format.)

The ASR would have to maintain transaction volume in order to provide transaction security. This is something of a chicken-and-egg problem. One way to address might be to start by only supporting outputs of one size (e.g. 10 BTC) and expanding from there.

Finally, users have to be careful not to de-anonymize their Bitcoins. If the same coins get sent through too rapidly, it's probably just noise, and transactions nearby in time are less secure. Or, the user from the example in (1) could combine their coins from outputs B and C.

For the future, improvements to anonymity could even be incorporated into the next iteration of Bitcoin protocol. Though this would require more thought, something like: only allow addresses in some small set of sizes (e.g., powers of 2), and don't allow multiple inputs to an address, except to combine two smaller values. The obvious downside is that many destination addresses would have to be provided for common amounts. As above, client support on both ends would greatly mitigate this inconvenience.

Thanks to Andrey Petrov for feedback.