TensorRT-LLM

Test build image

Open ZhanruiSunCh opened this issue 2 months ago • 14 comments

Summary by CodeRabbit

  • New Features

    • Docker image build and tagging workflows can now be triggered as independent stages in the CI pipeline.
  • Chores

    • Enhanced CI/CD infrastructure to support streamlined Docker image building and updates with CI-specific configurations.

✏️ Tip: You can customize this high-level summary in your review settings.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • [ ] Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user-friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline, or from the last pipeline if no pipeline-id is given. If the Git commit ID has changed, this option is always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only supports [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purposes. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

For guidance on mapping tests to stage names, see docs/source/reference/ci-overview.md and the scripts/test_to_stage_mapping.py helper.
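
As an illustration of how the `run` options above combine, here is a minimal Python sketch that composes a `/bot run` comment. The function name `build_bot_run_command` and its keyword arguments are hypothetical and not part of the bot; only the flag names and the list of supported backends come from the help text, and only a subset of the flags is covered.

```python
# Hypothetical helper: composes a `/bot run` comment from a subset of the
# documented flags. It is not part of the bot itself.
SUPPORTED_BACKENDS = {"pytorch", "cpp", "tensorrt", "triton"}


def _quoted_list(flag, values):
    """Render a list-valued flag as `--flag "a, b"`."""
    return '{} "{}"'.format(flag, ", ".join(values))


def build_bot_run_command(*, stage_list=None, gpu_types=None, test_backends=None,
                          skip_test=False, disable_multi_gpu_test=False,
                          disable_fail_fast=False, post_merge=False):
    """Return the comment text for a `/bot run` invocation."""
    unknown = set(test_backends or []) - SUPPORTED_BACKENDS
    if unknown:
        raise ValueError("unsupported test backend(s): {}".format(sorted(unknown)))

    parts = ["/bot run"]
    if disable_fail_fast:
        parts.append("--disable-fail-fast")
    if disable_multi_gpu_test:
        parts.append("--disable-multi-gpu-test")
    if skip_test:
        parts.append("--skip-test")
    if stage_list:
        parts.append(_quoted_list("--stage-list", stage_list))
    if gpu_types:
        parts.append(_quoted_list("--gpu-type", gpu_types))
    if test_backends:
        parts.append(_quoted_list("--test-backend", test_backends))
    if post_merge:
        parts.append("--post-merge")
    return " ".join(parts)


print(build_bot_run_command(disable_multi_gpu_test=True, skip_test=True))
# /bot run --disable-multi-gpu-test --skip-test
```

Validating the backend names before composing the comment mirrors the restriction stated for `--test-backend`; how the bot itself reacts to an unsupported value is not described here.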

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
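
The four subcommands can be modelled with a small `argparse` parser. This is only a sketch of the command grammar shown in the help text, not the bot's actual implementation; `make_bot_parser` and `parse_bot_comment` are hypothetical names, and only a few `run` flags are included.

```python
import argparse
import shlex


def make_bot_parser():
    """Build a parser for the subcommands listed in the help text."""
    parser = argparse.ArgumentParser(prog="/bot")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run")
    run.add_argument("--skip-test", action="store_true")
    run.add_argument("--disable-multi-gpu-test", action="store_true")
    run.add_argument("--stage-list", default="")

    sub.add_parser("kill")

    skip = sub.add_parser("skip")
    # The help text states that a reason is required when skipping.
    skip.add_argument("--comment", required=True)

    sub.add_parser("reuse-pipeline")
    return parser


def parse_bot_comment(text):
    """Parse a PR comment such as '/bot run --skip-test'."""
    tokens = shlex.split(text)
    if not tokens or tokens[0] != "/bot":
        raise ValueError("not a /bot command")
    return make_bot_parser().parse_args(tokens[1:])


args = parse_bot_comment('/bot skip --comment "Reason for skipping build/test"')
print(args.command, "|", args.comment)
# skip | Reason for skipping build/test
```

In this sketch, a `skip` comment without `--comment` makes `argparse` exit with a usage error, which matches the requirement that a reason always be given.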

ZhanruiSunCh avatar Nov 25 '25 06:11 ZhanruiSunCh

/bot run --disable-multi-gpu-test --skip-test

ZhanruiSunCh avatar Nov 25 '25 06:11 ZhanruiSunCh

📝 Walkthrough

Walkthrough

These changes introduce CI-specific build modes to Jenkins pipeline orchestration. BuildDockerImage.groovy adds a MODE variable set to "build_for_ci" and implements a new updateCIImageTag() function to manage CI image tags via git commits. L0_MergeRequest.groovy extends job launching capability with an onlyBuildImage parameter and new helper function to support focused Docker image builds with tag updates.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| CI Build Mode Configuration | jenkins/BuildDockerImage.groovy | Introduces MODE variable hard-coded to "build_for_ci" and GITHUB_CREDENTIALS_ID initialization. Adds updateCIImageTag() function that computes CI image tags from imageKeyToTag, updates jenkins/current_image_tags.properties, and performs signed commits with pushes to user fork. |
| CI Mode Pipeline Stage Management | jenkins/BuildDockerImage.groovy | Implements CI mode conditional logic filtering build configs to CI-related entries, adds "Update CI Image Tag" stage execution, conditionally skips artifact upload stage, and gates stages (Wait for Build Jobs Complete, Sanity Check, Register NGC Images) to skip in CI mode. |
| CI Image Tagging Logic | jenkins/BuildDockerImage.groovy | Extends Docker build steps to use config.stageName for CI image tagging in CI mode, enabling dependent images to be tagged into imageKeyToTag using CI stage identifiers. |
| Job Launching Extensions | jenkins/L0_MergeRequest.groovy | Extends launchJob() signature with optional onlyBuildImage parameter. Adds launchStages conditional logic to prune stage graph when onlyBuildImage is true, executing only Build-Docker-Images and Release Check stages. |
| CI Build Stage Helper | jenkins/L0_MergeRequest.groovy | Introduces launchBuildDockerImagesAndUpdateImageTags() helper function that delegates to launchStages with onlyBuildImage semantics. Adds new "Build Docker Images and update Image Tags" pipeline stage wired to run early in Preparation when applicable. |
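
The stage pruning described for `launchStages` can be pictured with a short Python sketch. The real logic lives in Groovy and is not shown in this thread, so `select_stages` and the placeholder stage names "Build" and "Test" are assumptions; only "Build-Docker-Images" and "Release Check" come from the summary.

```python
# Stages kept when only the image build is requested (names from the summary).
IMAGE_ONLY_STAGES = ("Build-Docker-Images", "Release Check")


def select_stages(stages, only_build_image=False):
    """Return the stages to launch, preserving their original order."""
    if not only_build_image:
        return dict(stages)
    return {name: job for name, job in stages.items() if name in IMAGE_ONLY_STAGES}


# "Build" and "Test" are hypothetical placeholders for the remaining stages.
all_stages = {
    "Release Check": "job-a",
    "Build-Docker-Images": "job-b",
    "Build": "job-c",
    "Test": "job-d",
}
print(list(select_stages(all_stages, only_build_image=True)))
# ['Release Check', 'Build-Docker-Images']
```

Defaulting `only_build_image` to `False` keeps existing callers on the full stage graph, which is the concern the review raises about threading the new parameter through.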

Sequence Diagram

sequenceDiagram
    participant Jenkins as Jenkins Pipeline
    participant BuildDocker as BuildDockerImage
    participant GitRepo as Git Repository
    participant Props as current_image_tags.properties

    Jenkins->>BuildDocker: MODE = "build_for_ci"
    Jenkins->>BuildDocker: Build Docker image (CI mode)
    BuildDocker->>BuildDocker: Tag dependent images to imageKeyToTag
    Jenkins->>BuildDocker: launchJob with onlyBuildImage=true
    BuildDocker->>BuildDocker: updateCIImageTag()
    BuildDocker->>BuildDocker: Compute four CI image tags from imageKeyToTag
    BuildDocker->>Props: Read current lines for four specific keys
    BuildDocker->>Props: Replace with new CI tags
    BuildDocker->>GitRepo: Commit changes (signed, with Signed-off-by)
    BuildDocker->>GitRepo: Push to user fork on current branch
    GitRepo-->>Jenkins: Update complete
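
The "read current lines, replace with new CI tags" step in the diagram amounts to rewriting selected `KEY=value` lines in a properties file. Below is a minimal Python sketch of that idea; the thread does not name the four keys, so `EXAMPLE_IMAGE_A`, `EXAMPLE_IMAGE_B`, the tag values, and the function `replace_property_values` are all hypothetical.

```python
def replace_property_values(lines, new_values):
    """Return a copy of `lines` with `KEY=value` entries updated for keys in `new_values`.

    Lines whose key is not listed in `new_values` are kept unchanged.
    """
    updated = []
    seen = set()
    for line in lines:
        key, sep, _ = line.partition("=")
        key = key.strip()
        if sep and key in new_values:
            updated.append("{}={}".format(key, new_values[key]))
            seen.add(key)
        else:
            updated.append(line)
    missing = set(new_values) - seen
    if missing:
        # Fail loudly rather than silently skipping a tag that should exist.
        raise KeyError("keys not found: {}".format(sorted(missing)))
    return updated


# Hypothetical keys and tags, for illustration only.
current = [
    "# image tags",
    "EXAMPLE_IMAGE_A=repo/image:old-a",
    "EXAMPLE_IMAGE_B=repo/image:old-b",
]
print(replace_property_values(current, {"EXAMPLE_IMAGE_A": "repo/image:new-a"}))
# ['# image tags', 'EXAMPLE_IMAGE_A=repo/image:new-a', 'EXAMPLE_IMAGE_B=repo/image:old-b']
```

Raising on a missing key is a design choice of this sketch; whether `updateCIImageTag()` treats a missing key as an error is not stated in the thread.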

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Git operations logic: The updateCIImageTag() function performs signed commits and git operations with Signed-off-by metadata extraction from the latest commit. Verify correct commit message handling and authentication.
  • CI mode conditional logic: Multiple conditional branches filtering configs, skipping stages, and using CI stage names require careful tracing of all code paths affected by MODE variable.
  • Function signature extension: Verify onlyBuildImage parameter threading through launchJob and launchStages doesn't introduce unintended side effects on existing callers.
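
To make the first review point concrete, here is a hedged Python sketch of pulling the `Signed-off-by` trailer out of the latest commit message and reusing it for a new commit. The function names and the sample messages are hypothetical; the Groovy code under review may handle this differently.

```python
import re

# Matches trailer lines such as "Signed-off-by: Jane Doe <jane@example.com>".
SIGNED_OFF_BY = re.compile(r"^Signed-off-by:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def extract_signed_off_by(commit_message):
    """Return all Signed-off-by trailer values found in a commit message."""
    return SIGNED_OFF_BY.findall(commit_message)


def build_commit_message(subject, latest_commit_message):
    """Compose a new commit message that carries over the latest sign-offs."""
    signers = extract_signed_off_by(latest_commit_message)
    if not signers:
        raise ValueError("latest commit has no Signed-off-by trailer")
    trailers = "\n".join("Signed-off-by: {}".format(s) for s in signers)
    return "{}\n\n{}".format(subject, trailers)


latest = "Some change\n\nSigned-off-by: Jane Doe <jane@example.com>\n"
print(extract_signed_off_by(latest))
# ['Jane Doe <jane@example.com>']
```

The explicit error for a commit without a trailer is one way to address the "verify correct commit message handling" concern: a missing sign-off surfaces before any push is attempted.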

Pre-merge checks and finishing touches

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description is entirely composed of the template with no concrete implementation details, change summary, test cases, or completed checklist items provided. | Replace template sections with actual content: add a concrete Description of the changes, list Test Coverage details, and complete the PR Checklist items. Reference the raw_summary for details about CI-specific image building, tagging updates, and mode handling. |
| Title check | ❓ Inconclusive | The title 'Test build image' is vague and generic, lacking specificity about the actual changes made to the codebase. | Follow the repository template: [TICKET][type] Summary. For example: [TRTLLM-1234][feat] Add CI-specific image build and tagging pipeline, or use @coderabbitai title for auto-generation. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
✨ Finishing touches
🧪 Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment

[!TIP]

📝 Customizable high-level summaries are now available in beta!

You can now customize how CodeRabbit generates the high-level summary in your pull requests, including its content, structure, tone, and formatting.

  • Provide your own instructions using the high_level_summary_instructions setting.
  • Format the summary however you like (bullet lists, tables, multi-section layouts, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example instruction:

"Divide the high-level summary into five sections:

  1. 📝 Description: Summarize the main change in 50–60 words, explaining what was done.
  2. 📓 References: List relevant issues, discussions, documentation, or related PRs.
  3. 📦 Dependencies & Requirements: Mention any new/updated dependencies, environment variable changes, or configuration updates.
  4. 📊 Contributor Summary: Include a Markdown table showing contributions: | Contributor | Lines Added | Lines Removed | Files Changed |
  5. ✔️ Additional Notes: Add any extra reviewer context. Keep each section concise (under 200 words) and use bullet or numbered lists for clarity."

Note: This feature is currently in beta for Pro-tier users, and pricing will be announced later.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Nov 25 '25 06:11 coderabbitai[bot]

PR_Github #7 [ run ] triggered by Bot. Commit: 318c32c

tensorrt-cicd avatar Nov 25 '25 06:11 tensorrt-cicd

PR_Github #25668 [ run ] triggered by Bot. Commit: 318c32c

tensorrt-cicd avatar Nov 25 '25 06:11 tensorrt-cicd

PR_Github #7 [ run ] completed with state ABORTED. Commit: 318c32c

tensorrt-cicd avatar Nov 25 '25 06:11 tensorrt-cicd

PR_Github #25668 [ run ] completed with state FAILURE. Commit: 318c32c /LLM/main/L0_MergeRequest_PR pipeline #19453 (Partly Tested) completed with status: 'FAILURE'

tensorrt-cicd avatar Nov 25 '25 06:11 tensorrt-cicd

PR_Github #25679 [ run ] triggered by Bot. Commit: 5409dd8

tensorrt-cicd avatar Nov 25 '25 07:11 tensorrt-cicd

PR_Github #25679 [ run ] completed with state ABORTED. Commit: 5409dd8 /LLM/main/L0_MergeRequest_PR pipeline #19462 (Partly Tested) completed with status: 'ABORTED'

tensorrt-cicd avatar Nov 25 '25 07:11 tensorrt-cicd

PR_Github #25680 [ run ] triggered by Bot. Commit: 8dbb468

tensorrt-cicd avatar Nov 25 '25 07:11 tensorrt-cicd

PR_Github #25680 [ run ] completed with state FAILURE. Commit: 8dbb468 /LLM/main/L0_MergeRequest_PR pipeline #19463 (Partly Tested) completed with status: 'ABORTED'

tensorrt-cicd avatar Nov 25 '25 07:11 tensorrt-cicd

PR_Github #25682 [ run ] triggered by Bot. Commit: bb4b1f1

tensorrt-cicd avatar Nov 25 '25 07:11 tensorrt-cicd

PR_Github #25682 [ run ] completed with state ABORTED. Commit: bb4b1f1 LLM/main/L0_MergeRequest_PR #19465 (Blue Ocean) completed with status: ABORTED

tensorrt-cicd avatar Nov 25 '25 07:11 tensorrt-cicd

PR_Github #25686 [ run ] triggered by Bot. Commit: 3c44e9f

tensorrt-cicd avatar Nov 25 '25 07:11 tensorrt-cicd

PR_Github #25686 [ run ] completed with state FAILURE. Commit: 3c44e9f /LLM/main/L0_MergeRequest_PR pipeline #19469 (Partly Tested) completed with status: 'FAILURE'

tensorrt-cicd avatar Nov 25 '25 09:11 tensorrt-cicd

PR_Github #9 [ run ] triggered by Bot. Commit: 3c44e9f

tensorrt-cicd avatar Nov 27 '25 07:11 tensorrt-cicd

PR_Github #9 [ run ] completed with state FAILURE. Commit: 3c44e9f Build Docker Images Pipeline #19709 failed

tensorrt-cicd avatar Nov 27 '25 08:11 tensorrt-cicd

PR_Github #11 [ run ] triggered by Bot. Commit: 3c44e9f

tensorrt-cicd avatar Nov 27 '25 10:11 tensorrt-cicd

PR_Github #11 [ run ] completed with state ABORTED. Commit: 3c44e9f LLM/main/L0_MergeRequest_PR #19743 (Blue Ocean) completed with status: ABORTED

tensorrt-cicd avatar Nov 27 '25 10:11 tensorrt-cicd

PR_Github #12 [ run ] triggered by Bot. Commit: 3c44e9f

tensorrt-cicd avatar Nov 27 '25 10:11 tensorrt-cicd

PR_Github #12 [ run ] completed with state ABORTED. Commit: 3c44e9f

tensorrt-cicd avatar Nov 27 '25 10:11 tensorrt-cicd

PR_Github #13 [ run ] triggered by Bot. Commit: 3c44e9f

tensorrt-cicd avatar Nov 27 '25 10:11 tensorrt-cicd

PR_Github #13 [ run ] completed with state FAILURE. Commit: 3c44e9f Build Docker Images Pipeline #2 failed

tensorrt-cicd avatar Nov 27 '25 12:11 tensorrt-cicd

PR_Github #16 [ run ] triggered by Bot. Commit: 5b70f6f

tensorrt-cicd avatar Nov 28 '25 06:11 tensorrt-cicd

PR_Github #16 [ run ] completed with state ABORTED. Commit: 5b70f6f LLM/PipelineMonitor/L0_MergeRequest_PR #3 (Blue Ocean) completed with status: ABORTED

tensorrt-cicd avatar Nov 28 '25 06:11 tensorrt-cicd