The Bitcoin transaction graph has various observable patterns, of which wallet clustering is the most significant. Some of these patterns have been studied and used to link coins from the same wallet, both in theory and in practice.
Each transaction consists of a list of inputs (where the sats are taken from) and outputs (where the input sats are distributed). Inputs refer to the outputs of earlier transactions, thereby connecting transactions. Outputs lock some amount of bitcoin with certain spending conditions (i.e., the “address,” public key, or output script). Linking coins means identifying the entity that controls the keys to a set of transaction outputs, spent or unspent.
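The structure described above can be sketched as a small data model. This is a simplified illustration, not the actual serialization format: the names `OutPoint`, `TxOut`, `Tx`, and `resolve_inputs` are hypothetical, and scripts are represented as plain strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical, simplified types for illustration only. Real transactions
# carry more fields (witness data, locktime, sequence numbers, and so on).


@dataclass(frozen=True)
class OutPoint:
    txid: str   # id of the earlier transaction that created the output
    index: int  # position of that output within the earlier transaction


@dataclass(frozen=True)
class TxOut:
    value_sats: int  # amount locked in this output
    script: str      # spending condition (address, public key, or script)


@dataclass(frozen=True)
class Tx:
    txid: str
    inputs: Tuple[OutPoint, ...]  # references to outputs of earlier transactions
    outputs: Tuple[TxOut, ...]    # where the input sats are distributed


def resolve_inputs(tx: Tx, txs: Dict[str, Tx]) -> List[TxOut]:
    """Look up the earlier outputs that this transaction spends."""
    return [txs[ref.txid].outputs[ref.index] for ref in tx.inputs]


# A funding transaction with two outputs, and a later transaction that
# spends the first of them.
funding = Tx("aa", (), (TxOut(50_000, "addr_alice"), TxOut(20_000, "addr_bob")))
spend = Tx(
    "bb",
    (OutPoint("aa", 0),),
    (TxOut(30_000, "addr_carol"), TxOut(19_000, "addr_alice_change")),
)
txs = {t.txid: t for t in (funding, spend)}
spent = resolve_inputs(spend, txs)
```

Following each input back to the output it spends is what turns a set of individual transactions into a graph, and it is this graph that clustering techniques operate on.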
Section 10 of the Bitcoin white paper, “Privacy,” briefly discusses linking:
“A new key pair should be used for each transaction to keep them from being linked to a common owner.”
When the same public key controls more than one coin, these coins are trivially linked since only one entity is supposed to know the private key.
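As a rough sketch of how trivial this kind of linking is, the snippet below groups coins by the script that locks them. The `Coin` tuple layout and the function name `group_by_script` are assumptions made for illustration, and the address strings are placeholders.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical coin representation: (txid, output index, output script).
Coin = Tuple[str, int, str]


def group_by_script(coins: Iterable[Coin]) -> Dict[str, List[Tuple[str, int]]]:
    """Group coins by the script (address or public key) that locks them."""
    groups: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for txid, index, script in coins:
        groups[script].append((txid, index))
    return dict(groups)


coins = [
    ("aa", 0, "addr_1"),
    ("bb", 1, "addr_2"),
    ("cc", 0, "addr_1"),  # addr_1 is reused, so this coin links to ("aa", 0)
]

# Only scripts that lock more than one coin reveal a link.
linked = {s: c for s, c in group_by_script(coins).items() if len(c) > 1}
```

Any script that appears in more than one group entry ties those coins to whoever holds the corresponding private key, with no further analysis needed.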
However, address reuse is not the only concern. The paper continues:
“Some linking is still unavoidable with multi-input transactions, which necessarily reveal that their inputs were owned by the same owner.”
This is often called the “common input ownership heuristic,” CIOH, or the “multi-input heuristic.” It’s only a heuristic because, contrary to the implication in the quote above, counterexamples exist. Although it isn’t always true, it usually is.
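One common way to apply this heuristic across many transactions is a union-find (disjoint-set) structure: every script that appears as an input to the same transaction is merged into one cluster. The sketch below is a minimal illustration under that assumption, not a description of any particular analysis tool. The names `UnionFind` and `cluster_by_common_inputs` are hypothetical, and each transaction is reduced to the list of scripts its inputs spend from.

```python
from typing import Dict, Iterable, List, Set


class UnionFind:
    """Minimal disjoint-set structure over script identifiers."""

    def __init__(self) -> None:
        self.parent: Dict[str, str] = {}

    def find(self, x: str) -> str:
        self.parent.setdefault(x, x)  # register unseen items as their own root
        while self.parent[x] != x:
            # Path halving: point x at its grandparent, then step up.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a: str, b: str) -> None:
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_by_common_inputs(tx_input_scripts: Iterable[List[str]]) -> List[Set[str]]:
    """Merge all scripts that are spent together in the same transaction."""
    uf = UnionFind()
    for scripts in tx_input_scripts:
        if not scripts:
            continue
        first = scripts[0]
        uf.find(first)  # make sure single-input transactions are registered
        for other in scripts[1:]:
            uf.union(first, other)

    clusters: Dict[str, Set[str]] = {}
    for script in list(uf.parent):
        clusters.setdefault(uf.find(script), set()).add(script)
    return list(clusters.values())


# Each inner list holds the scripts spent by one transaction's inputs.
transactions = [["A", "B"], ["B", "C"], ["D"], ["E", "F"]]
clusters = cluster_by_common_inputs(transactions)
normalized = sorted(sorted(c) for c in clusters)
```

Here A, B, and C end up in one cluster even though A and C never appear in the same transaction, because B connects them. The same transitivity is why counterexamples matter: a single transaction whose inputs actually belong to several users would merge all of their clusters into one.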
Over time, more sophisticated methods for clustering have been developed, for example, telling apart change outputs from payments or using larger structures in the transaction graph than just individual transactions. Some of these have been described in academic work, while others remain proprietary. Improved methods can link more coins or avoid so-called “cluster collapse,” where coins belonging to different users are incorrectly linked. Commercial offerings often benefit from additional sources of information, such as KYC data; they don’t necessarily depend on just the privacy leaks that occur in the Bitcoin protocol, but clustering is still the central theme.
This motivates an adversarial framing of privacy, where a deanonymization attack attempts to assign coins to clusters. From this perspective, defending privacy means making it more difficult for the adversary to succeed at correctly assigning coins to clusters. The most notable examples involve collaborative transaction construction, whether it is overtly difficult to guess, as in CoinJoin, covertly so, as in PayJoin, or, perhaps most prominently, just part of how the software works, as with Lightning node transactions. In all cases, the simplistic assumption of common ownership breaks down, necessitating a more nuanced analysis.
The adversarial framing also makes it explicit that different adversaries have different capabilities, with the appropriate adversarial model depending on the user’s threat model: Are you more worried about surveillance by an oppressive government or snooping by your transactions’ counterparties?
Originally published on the Spiral Substack.