joinmarket-clientserver

Compute addresses entropy

Open inaltoasinistra opened this issue 7 years ago • 3 comments

I used this definition of entropy.

When a coinjoin (cj) transaction occurs, rates are assigned to its output scripts. A rate represents the number of possible mappings of user inputs to transaction outputs. Each output rate is multiplied by the input rate, so that privacy is tracked across successive transactions.
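A minimal Python sketch of this rate bookkeeping, not the code from the pull request. It assumes the input rate is the minimum over the wallet's own inputs (as discussed further down the thread), that the number of mappings for a coinjoin output equals the number of equal-valued outputs, and that a change output counts as a single mapping. The names `propagate_rate` and `entropy_bits` are hypothetical, and expressing a rate in bits via log2 is likewise an assumption.

```python
import math


def propagate_rate(own_input_rates, n_mappings):
    """Return the rate assigned to one of our outputs.

    own_input_rates: rates of the wallet's own inputs to the transaction.
    n_mappings: assumed number of plausible input -> output mappings
                (1 for a change output, the number of equal-valued
                outputs for a coinjoin output).
    """
    if not own_input_rates:
        raise ValueError("at least one own input is required")
    if n_mappings < 1:
        raise ValueError("n_mappings must be >= 1")
    # Assumption: the starting rate is the weakest of our own inputs.
    return min(own_input_rates) * n_mappings


def entropy_bits(rate):
    """Express a rate as bits of entropy (assumed convention: log2)."""
    return math.log2(rate)


# Two own inputs with rates 1 and 5 enter a coinjoin with 5 equal outputs.
cj_rate = propagate_rate([1, 5], n_mappings=5)
change_rate = propagate_rate([1, 5], n_mappings=1)
```

Under these assumptions the coinjoin output ends up with rate 5 and the change output keeps rate 1, so a second coinjoin spending the first coinjoin output would multiply from 5 rather than from 1.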

This allows users to estimate the quality of their utxos.

Rates are saved in the wallet; rates of spent txos are deleted to avoid bloating the wallet.
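The pruning step could look roughly like the following. This is a sketch under stated assumptions, not the wallet code itself: rates are assumed to live in a plain dict keyed by a hypothetical `(txid, vout)` outpoint tuple, and `prune_spent_rates` is an illustrative name.

```python
def prune_spent_rates(rates, unspent_outpoints):
    """Keep only the rates whose outpoint is still unspent.

    rates: dict mapping (txid, vout) -> rate (hypothetical storage layout).
    unspent_outpoints: set of (txid, vout) tuples currently in the wallet.
    """
    return {op: rate for op, rate in rates.items() if op in unspent_outpoints}


stored = {("aa" * 32, 0): 5, ("bb" * 32, 1): 1, ("cc" * 32, 0): 25}
still_unspent = {("aa" * 32, 0), ("cc" * 32, 0)}
stored = prune_spent_rates(stored, still_unspent)
```

After the call only the two unspent outpoints remain, so the stored data grows with the number of utxos rather than with the wallet's whole history.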

inaltoasinistra avatar Feb 10 '19 23:02 inaltoasinistra

I've been running this code for about a week, with both the yield generator and sendpayment, and haven't noticed any issues.

kristapsk avatar Feb 17 '19 23:02 kristapsk

First, thanks for this, sorry not to have responded a bit earlier but been on holiday (and note to @undeath @chris-belcher please don't be shy to do stuff without me!).

Second, I'm broadly in favour of doing some kind of quantitative measure, but (a) I'm not hugely enthusiastic, it'll never really be accurate, so slightly lukewarm because of that and (b) more code is always more work; people always want to add features, nobody ever helps improve existing ones! (open source maintenance grumble standard).

So for the general question of entropy measurement, I got quite interested in it after reading a bit more into @LaurentMT's work (see my comment here). But that's a separate discussion, there I'm more discussing how to analyse all transactions globally. Here, you're focused on Joinmarket and the assumptions we typically make about it.

On the exact proposed algorithm here, I'd just like to check I understand what you propose. Looking at these lines: https://github.com/JoinMarket-Org/joinmarket-clientserver/pull/331/files#diff-11f05e76035469edaa236f6441f8c418R101 (lines 101-106), the algorithm seems to make a couple of assumptions:

It assumes that the "starting rate/entropy" is the minimum of the set of input scripts that belong to us on the input side. For the change output, this makes total sense, as they are (usually) quite easy to link.

But I find that a bit dubious for the coinjoin output. The coinjoin outputs are intrinsically unlinkable to the input set they correspond to, by coinjoin's nature. So perhaps an ideal measure might be the average entropy of all of the input subsets, except of course, we don't know those values. Your proposed version may end up being conservative (given the min()), but even that's not clear, we don't know how our starting entropy compares with other participants ... but also it almost feels like it's missing the point of what the entropy measure is supposed to be doing; it's supposed to be capturing the fact that there are multiple interpretations of the (input -> coinjoin output) mapping.

Perhaps it's a case of "well it's the only thing we can really calculate", but I'd like to hear more what others think.
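To make the comparison concrete, here is a small Python sketch with invented numbers. It assumes, purely for illustration, that every participant's input rates are known (in practice they are not, as noted above) and that each subset's starting value is its minimum. The function names and the example rates are hypothetical.

```python
from statistics import mean


def conservative_start(own_input_rates):
    """Starting rate as in the proposal: the minimum over our own inputs."""
    return min(own_input_rates)


def average_start(rates_by_participant):
    """Hypothetical 'ideal' start: the average over all input subsets.

    rates_by_participant: dict mapping a participant label to the list of
    rates of that participant's inputs (unknowable for other parties).
    """
    return mean(min(rates) for rates in rates_by_participant.values())


ours = [4, 16]

# Case 1: one counterparty has well-mixed coins.
case_mixed = {"us": ours, "maker_a": [1], "maker_b": [64]}
# Case 2: all counterparties bring unmixed coins.
case_fresh = {"us": ours, "maker_a": [1], "maker_b": [1]}
```

With these made-up values `conservative_start(ours)` is 4, while the average is 23 in the first case and 2 in the second, so whether min() is conservative depends on how our starting value compares with the other participants'.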

AdamISZ avatar Mar 02 '19 15:03 AdamISZ

> First, thanks for this, sorry not to have responded a bit earlier but been on holiday (and note to @undeath @chris-belcher please don't be shy to do stuff without me!).

I'm sorry for the big delay and thank you for the precious feedback.

> Second, I'm broadly in favour of doing some kind of quantitative measure, but (a) I'm not hugely enthusiastic, it'll never really be accurate, so slightly lukewarm because of that and (b) more code is always more work; people always want to add features, nobody ever helps improve existing ones! (open source maintenance grumble standard).

I understand this. I'm also worried, because the real entropy could be very different from the estimate. Even a future transaction could leak information and change the entropy of a utxo that is already spent.

On the other hand, I think an estimate could help the JoinMarket user manage the wallet. I deposit coins to different accounts because I'd like to avoid correlating different sources of coins with each other. In this situation I don't know whether an account contains mixed coins or not.

> It assumes that the "starting rate/entropy" is the minimum of the set of input scripts that belong to us on the input side. For the change output, this makes total sense, as they are (usually) quite easy to link.
>
> But I find that a bit dubious for the coinjoin output. The coinjoin outputs are intrinsically unlinkable to the input set they correspond to, by coinjoin's nature. So perhaps an ideal measure might be the average entropy of all of the input subsets, except of course, we don't know those values. Your proposed version may end up being conservative (given the min()), but even that's not clear, we don't know how our starting entropy compares with other participants ... but also it almost feels like it's missing the point of what the entropy measure is supposed to be doing;

I did not measure the entropy of the transactions, but the entropy of the utxos in relation to the wallet. I tried to answer the question "How far is this utxo from my wallet?". I used min(inputs) because I assumed that, by checking the amounts, it is possible to link the inputs of the same wallet together.

inaltoasinistra avatar Mar 24 '19 18:03 inaltoasinistra