Nethereum.BlockchainStorage icon indicating copy to clipboard operation
Nethereum.BlockchainStorage copied to clipboard

SQL Server LastBlockProcessed is set to NVARCHAR

Open jpmyburghbc opened this issue 3 years ago • 2 comments

The following method does a MaxAsync on the BlockProgress table but the LastBlockProcessed column is set to NVARCHAR in the SQL scripts supplied, as a result, the query will return a value to the nearest 9, 99, 999 etc and not the actual last block processed, this can cause the processor to re-process a large number of blocks when restarted.

https://github.com/Nethereum/Nethereum.BlockchainStorage/blob/60e62b7e47a881519b4c3a5cfc003648c68fd4d3/src/Nethereum.BlockchainStore.EF/Repositories/BlockProgressRepository.cs#L18

public async Task<BigInteger?> GetLastBlockNumberProcessedAsync() { using (var context = _contextFactory.CreateContext()) { var max = await context.BlockProgress.MaxAsync(b => b.LastBlockProcessed).ConfigureAwait(false); return string.IsNullOrEmpty(max) ? (BigInteger?)null : BigInteger.Parse(max); } }

Steps to recreate

  1.    Add 19999 blocks in the `BlockProgress` table
    
  2.    Now run `SELECT Max([LastBlockProcessed]) FROM [BlockProgress]`
    
  3.    Result returned 9999 as LastBlockProcessed resulting in 10k blocks needing to be reprocessed
    
  4.    Expected query to return 19999 as LastBlockProcessed
    

jpmyburghbc avatar Nov 12 '22 23:11 jpmyburghbc

Thanks for this.. this was changed to NVARCHAR as blockNumbers are bigIntegers, but this progress repository (even if simple) should not be storing more than one row at a time..

Maybe adding something like this will be better ->

 var oldEntries = await context.BlockProgress.Where(x => x.LastBlockProcessed != blockEntity.LastBlockProcessed).ToListAsync().ConfigureAwait(false);
                    context.BlockProgress.RemoveRange(oldEntries);
                    await context.SaveChangesAsync().ConfigureAwait(false);

https://github.com/Nethereum/Nethereum.BlockchainStorage/blob/60e62b7e47a881519b4c3a5cfc003648c68fd4d3/src/Nethereum.BlockchainStore.EF/Repositories/BlockProgressRepository.cs#L27

@jpmyburghbc

juanfranblanco avatar Nov 15 '22 12:11 juanfranblanco

Thank you for taking the time to respond. I understand the usage better now.

Your recommended fix will help keep the table clean, and that way you can also utilise a SingleOrDefault()/FirstOfDefault() instead of MaxAsync() on GetLastBlockNumberProcessedAsync() which will be better on an NVARCHAR column.

I guess another alternative is to set that SQL column to bigint on in the SQL script, then the MaxAsync() will work as expected. I am only looking at the EF implementation so it's probably not an option on the other processors.

Either way, great work on this, Nethereum is doing great work for the crypto community, we are using it on Quorum and it's working great! Feel free to close the issue if you do not see an action item on this.

jpmyburghbc avatar Nov 18 '22 04:11 jpmyburghbc