Daniel Melcer
Not 100% sure, but I believe this was with Starcoder2-15b; the temperature was somewhere between 0.7 and 1.
Taking a fresh look at this again, it seems that a workaround may be to do something like:

```python
sample, info = rb.sample(minibatch_size, return_info=True)
sample["next", "end_of_slice"] = (
    info["next", "truncated"]...
```
Thanks for responding so quickly! In my particular case, I am collecting a few episodes (of wildly varying length), training on a few large-ish batches on short-ish slices, and then...
This seems like a major improvement to the documentation! Thanks for updating that.