Jesse Moore
This seems a little odd. The README explicitly states that the max_seq_length used in the data prep step and the one used in the training step don't need to be identical.

> Longer sequences are disproportionately...