Tilps

Results 9 issues of Tilps

Using the optimizer's own gradient aggregator seems a better choice here - the default one handles corner cases the previous documentation does not, and if the user wants to conditionally...

stat:awaiting response from contributor
cla: yes

Currently every training run re-selects the test and training sets at random from a combined window. This makes it very difficult to track persistent divergence, as any prior divergence gets nullified...

Investigation into possible reasons for the cascading failure of value head quality after v0.8 was released suggested PUCT and fpu-reduction changes as likely causes. If we believe that these...

Currently it's quite difficult to work out how much the graph moves because the data has changed, versus how much the current lr has moved it in the first 2k...

Implementing the idea in #581. A key aspect here is that we shift the start time later, but we don't decrement the playouts by how many there are at that...

This is never intended for submission, but I wanted it somewhere I could get opinions.

Looking at the new ELF system for Go, it seems they use a very subtly different fpu definition. It's kind of like get_raw_eval(), but rather than being the weighted average...

The very first playout in a move is far more expensive than any other (at least it is on my GPU). This means it takes a very long time for...

It was possible under go infinite to overflow the visit counter if terminal nodes get a lot of exploration, as reported in #305.
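
To illustrate the kind of overflow described above, here is a minimal sketch in Python that mimics a fixed-width unsigned counter. The 32-bit width and both function names are assumptions made for illustration; the summary does not say how wide the real counter is or how the issue was actually fixed. Saturating at the maximum is shown as one common guard, not as the change made in the engine.

```python
# Hypothetical width: the real visit counter's type is not stated in the summary.
UINT32_MAX = 2**32 - 1


def wrapping_increment(visits: int, amount: int = 1) -> int:
    """Mimic an unchecked 32-bit unsigned add: the result wraps past the maximum."""
    return (visits + amount) & UINT32_MAX


def saturating_increment(visits: int, amount: int = 1) -> int:
    """Clamp at the maximum instead of wrapping around to a small value."""
    return min(visits + amount, UINT32_MAX)


if __name__ == "__main__":
    # A heavily explored terminal node sitting at the counter's limit.
    visits = UINT32_MAX
    print("wrapping:  ", wrapping_increment(visits))
    print("saturating:", saturating_increment(visits))
```

With wrapping arithmetic, a node at the limit appears almost unvisited after one more playout, which would distort any statistic derived from the count. The saturating version keeps the count pinned at the maximum.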