bug: transactions not correctly associated with addresses in v0.5
System Information (if applicable)
BlockSci version: 0.5
Using AMI: no
Total memory: 32GB
auto a = getAddressFromString("3ChVP627KU5w4zu2rieFPF3wGXWQgmhvrs",chain->getAccess());
auto equivAddress = (*a).getEquivAddresses(false);
auto pointers = equivAddress.getOutputPointers() | ranges::to_vector;
std::cout << "Equiv Address: " << pointers.size() << " pointers" << std::endl;
----- Result --------
Equiv Address: 1 pointers
But according to blockchain.info, the address has 59 transactions: https://www.blockchain.com/btc/address/3ChVP627KU5w4zu2rieFPF3wGXWQgmhvrs
I suspect I am missing something regarding the type of the address, as my parser folder seems sane.
Can someone help me with that?
Thanks in advance, Clément
Second line should be the following (is that just a typo in your example?):
auto equivAddress = a.getEquivAddresses(false);
Running this on 0.6 gives me Equiv Address: 54 pointers (on a slightly outdated state of the blockchain)
Yes, sorry, bad copy/paste (I updated the first message). So it's a bug in 0.5, if I read you correctly?
> So it's a bug in 0.5, if I read you correctly?
No, I haven't been able to test this on v0.5 yet to see if it's reproducible.
Can reproduce this in v0.5. My best guess is that it's related to #217: if I retrieve the address through the output of a transaction, I get a different scriptNum than the one from getAddressFromString. However, this seems to be fixed in v0.6.
I came to the same diagnosis. I'm currently re-parsing the chain to see whether the bug is solved for me in 0.6.
Hi. I encountered this bug on v0.5. It also reproduces with PUBKEYHASH addresses: 183hmJGRuTEi2YDCWy5iozY8rZtFwVgahM, 1FeexV6bAHb8ybZjqQMjJrcCrHGW9sb6uF, etc. (from the rich list).
>>> chain.address_from_string("183hmJGRuTEi2YDCWy5iozY8rZtFwVgahM").balance()
777
>>> chain.address_from_string("1FeexV6bAHb8ybZjqQMjJrcCrHGW9sb6uF").balance()
777
This is caused by a bug in AddressState::reloadBloomFilter in the parser:
- reloadBloomFilter<blocksci::AddressType::MULTISIG_PUBKEY>() is called
- reloadBloomFilter clears addressBloomFilter for blocksci::DedupAddressType::PUBKEY
- reloadBloomFilter reloads addresses from db.db.getAddressRange<blocksci::AddressType::MULTISIG_PUBKEY>() instead of db.db.getAddressRange<blocksci::AddressType::PUBKEY>()
- findAddress returns AddressLocation::NotFound for the existing address
- resolveAddress assigns a new addressNum to the existing address
In the same way for SCRIPTHASH, reloadBloomFilter<blocksci::AddressType::WITNESS_SCRIPTHASH>() is called.
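To make the sequence above easier to follow, here is a minimal toy model of it in Python. It is only a sketch and not BlockSci's parser code: the class `ToyAddressState`, its methods, and the helper `simulate` are hypothetical names, the "bloom filter" is an exact set (so there are no false positives), and the database is a plain dict. It only illustrates how refilling the filter from the wrong address type makes an existing address look new.

```python
# Toy model of the failure mode; NOT BlockSci's actual parser code.
# ToyAddressState, resolve, reload and simulate are hypothetical names.

class ToyAddressState:
    def __init__(self):
        self.db = {}        # address -> (address type, address number)
        self.bloom = set()  # stand-in for the filter of one dedup type
        self.next_num = 1

    def resolve(self, addr_type, address):
        """Return the address number; a filter miss is treated as 'not found'."""
        if address in self.bloom:
            return self.db[address][1]
        # Filter miss: assign a fresh number, even if the address is in the db.
        num = self.next_num
        self.next_num += 1
        self.db[address] = (addr_type, num)
        self.bloom.add(address)
        return num

    def reload(self, source_type):
        """Clear the filter and refill it from addresses of a single type."""
        self.bloom = {a for a, (t, _) in self.db.items() if t == source_type}


def simulate(reload_source):
    """Resolve the same PUBKEY address before and after a filter reload."""
    state = ToyAddressState()
    first = state.resolve("PUBKEY", "addrA")
    state.resolve("MULTISIG_PUBKEY", "addrB")
    state.reload(reload_source)
    second = state.resolve("PUBKEY", "addrA")
    return first, second


print(simulate("MULTISIG_PUBKEY"))  # filter refilled from the wrong type
print(simulate("PUBKEY"))           # filter refilled from the intended type
```

With the wrong source type, the second lookup of `addrA` gets a different number than the first; with the intended source type, both lookups agree.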
It seems that the bug is fixed in v0.6 by commit 2bc7f72aa4f4a29c840411ab175feadc1d849e1e
(sizeIncreaseRatio is also important: fixing only the type results in massive reloading).
But I would appreciate it if you could backport the patch to v0.5, since I think this bug leads to seriously wrong analysis results.
Unfortunately I didn’t understand the impact of this issue when it was brought up last year. I re-visited this now as I was preparing for a new release, and it indeed seems to impact all addresses that have been re-used after certain block heights.
As correctly identified by ytoku, the bug is caused by an incorrect reloading of the parser’s bloom filter at a specific block height (for Bitcoin: at height 580383 for Pubkey/Pubkey-Hash/Witness-Pubkey-Hash addresses, at height 572072 for Scripthash/Witness-Scripthash addresses). Addresses that were reused afterwards are not correctly associated with their previously assigned address ID, but instead received a new ID. This new ID is also returned by the address index.
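As a rough first-pass filter, the heights quoted above can be turned into a small check for whether an address in a v0.5 data set might be affected. This is only a sketch: the function name `possibly_affected` and the type labels are made up for illustration, the list of use heights is assumed to come from an independent source, and treating a use exactly at the reload height as "after" is an assumption. Only a re-parse with a fixed parser confirms the actual state.

```python
# Reload heights for Bitcoin as quoted in this thread (BlockSci v0.5).
# The string labels are hypothetical; they are not BlockSci identifiers.
RELOAD_HEIGHT = {
    "pubkey": 580383,
    "pubkeyhash": 580383,
    "witness_pubkeyhash": 580383,
    "scripthash": 572072,
    "witness_scripthash": 572072,
}


def possibly_affected(addr_type, use_heights):
    """True if the address was used before the reload height and again later.

    Assumption: a use exactly at the reload height counts as 'after'.
    """
    threshold = RELOAD_HEIGHT[addr_type]
    seen_before = any(h < threshold for h in use_heights)
    seen_after = any(h >= threshold for h in use_heights)
    return seen_before and seen_after


print(possibly_affected("pubkeyhash", [100000, 600000]))
print(possibly_affected("pubkeyhash", [100000, 200000]))
```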
Due to how the parser caches addresses, subsequent address occurrences could receive the old ID again, leading to two possibilities:
- If the address had only been seen once before the incorrect reloading, subsequent uses would use the new ID. In this case, looking up the address by string would return an address that misses the first use.
- If the address had been seen more than once before, subsequent occurrences would receive the old ID again. In this case, looking up the address by string would return an address that shows only a single use.
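Both cases leave the same address string attached to more than one ID, which can be checked for once the (address, ID) pairs have been collected. The sketch below assumes such pairs are already available as plain tuples, for example gathered by walking transaction outputs; the function name `split_ids` and the input format are hypothetical and not part of BlockSci.

```python
from collections import defaultdict


def split_ids(observations):
    """Map each address string to its IDs, keeping only addresses with several.

    observations: iterable of (address_string, address_id) tuples
    (a hypothetical input format, collected by the caller).
    """
    ids = defaultdict(set)
    for address, address_id in observations:
        ids[address].add(address_id)
    return {addr: sorted(seen) for addr, seen in ids.items() if len(seen) > 1}


# Made-up sample data: "addrA" shows up under two different IDs.
sample = [("addrA", 1), ("addrB", 2), ("addrA", 7), ("addrB", 2)]
print(split_ids(sample))
```

Any address reported by such a check is a candidate for the split described above.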
I’ve compiled some more details here. Users of the v0.6 branch (after Oct 2018) should not have been affected by this.