refactor: Simplify disableLookahead and improve numDecodingEngineTokens handling
- Move `numDecodingEngineTokens` from `DecoderState->mJointDecodingInput` to `DecoderState` itself.
  - It's not needed in the inputs, but in the outer decoding loop.
- Simplify `disableLookahead`
  - Don't take the batch size as a parameter, but use the current `mMaxBatchSize` of the `DecoderState`.
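
For readers unfamiliar with the code, the shape of the change can be sketched roughly as follows. This is a simplified, hypothetical stand-in and not the actual `DecoderState` class: the member `mNumDecodingEngineTokens`, the getter, the constructor arguments, and the assumption that disabling lookahead resets each slot to one token per step are all illustrative.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the real DecoderState.
class DecoderState
{
public:
    DecoderState(std::int32_t maxBatchSize, std::int32_t lookaheadTokens)
        : mMaxBatchSize{maxBatchSize}
        , mNumDecodingEngineTokens(static_cast<std::size_t>(maxBatchSize), lookaheadTokens)
    {
    }

    // No batch-size parameter: the state uses its own mMaxBatchSize.
    void disableLookahead()
    {
        for (std::int32_t i = 0; i < mMaxBatchSize; ++i)
        {
            // Assumption for illustration: one engine token per step once lookahead is off.
            mNumDecodingEngineTokens[static_cast<std::size_t>(i)] = 1;
        }
    }

    // Read by the outer decoding loop directly from the state.
    std::int32_t getNumDecodingEngineTokens(std::int32_t batchIdx) const
    {
        return mNumDecodingEngineTokens.at(static_cast<std::size_t>(batchIdx));
    }

private:
    std::int32_t mMaxBatchSize;
    // Held by the state itself rather than by the joint decoding input.
    std::vector<std::int32_t> mNumDecodingEngineTokens;
};
```

With this layout a caller would write `state.disableLookahead();` without passing a batch size, so the call site cannot disagree with the size the state was set up with.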
/bot run
PR_Github #577 [ run ] triggered by Bot
PR_Github #577 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #492 completed with status: 'SUCCESS'
/bot run
PR_Github #675 [ run ] triggered by Bot
PR_Github #675 [ run ] completed with state SUCCESS
/LLM/main/L0_MergeRequest_PR pipeline #566 completed with status: 'SUCCESS'
/bot reuse-pipeline
PR_Github #894 [ reuse-pipeline ] triggered by Bot
PR_Github #894 [ reuse-pipeline ] completed with state SUCCESS
Reusing PR_Github #675 for commit cfb248d