Benjamin Schiller
Results: 2 issues of Benjamin Schiller
I noticed something unexpected when comparing two scenarios for a model converted via ONNX and TensorRT (distilroberta with a classification head). Scenario 1: I use a dataset with varying sentence lengths...
Hello, for pre-training the model, what kind of training curriculum did you use? Did the model see the training records control code by control code sequentially or did you feed...