Michael Chaly comments

Results: 23 comments of Michael Chaly

SPRT bounds V2

Also, I want to start a discussion about separate SPRT bounds for multicore tests - the current ones, imho, are simply too strict, especially for LTC, and discourage people from even...

SPRT bounds V2

Well, we can do something like "dynamic SPRT bounds", I guess: recalculate them on the fly for every run to get an estimated peak number of converging games of 100k and 50% pass...

SPRT bounds V2

Or, I guess, logically 2 params will be enough - then we will have the number of games to converge and the regression % being constant, using the draw rate from the test...

SPRT bounds V2

So, do we have anything on this topic? We may discuss further improvements, and for now maybe do what we should do - adjust the bounds, because LTCs are getting like impossible...

SPRT bounds V2

So... If anything, the draw rate has increased again - the current LTC draw rate sits at around 92.6% - and this will make my proposed bounds converge in fewer games than...

SPRT bounds V2

I still want to come back to this. Recently we have really lacked progress, and with the new type of elo calculation the bounds will be different anyway. So we can use this...

SPRT bounds V2

Well, sure, it would be nice to implement normalized elo in fishtest, but it's under construction (I guess?). For now we can just temporarily adjust our current logistic bounds.

SPRT bounds V2

In terms of normalised elo, the current STC is {-0.6; 3}. We can change it to nElo {-0.5; 2.4}, which will correspond to {-0.2; 1.0}. LTC should be something like {0,75;...

https://tests.stockfishchess.org/tests/view/5f2df92061e3b6af64882012 Update on the issue - with NNUE it seems that some workers are "cut off" because of higher draw rates, even if they are not bad ones....

Residual calculation

@vondele there seems to be something badly broken: https://tests.stockfishchess.org/tests/view/5f2e45e761e3b6af6488204d This test can't converge because it gets to some +5 / +3.5 LLRs and then workers that don't even look bad...