Daniel Melcer

4 comments by Daniel Melcer

Not 100% sure, but I believe this was with Starcoder2-15b; the temperature was somewhere between 0.7 and 1.

Taking a fresh look at this, it seems that a workaround may be to do something like:

```python
sample, info = rb.sample(minibatch_size, return_info=True)
sample["next", "end_of_slice"] = (
    info["next", "truncated"]...
```

Thanks for responding so quickly! In my particular case, I am collecting a few episodes (of wildly varying length), training on a few large-ish batches on short-ish slices, and then...

This seems like a major improvement to the documentation! Thanks for updating that.