
Issue in Databricks Asset Bundle custom template

kshrikant7 opened this issue 1 year ago • 7 comments

Issue

I am running into an issue when I try to create workflow scripts using a custom template. Below is the error I am getting:

template: :89: function "tasks" not defined (or: function "job" not defined)

I get this issue when I assign either of the following values:

"{{ tasks.A_task_key.values.B_task_key }}"

or

"{{job.parameters.ABC_parameter}}"

I have tried all the Go template escaping methods for the above, but none of them work. I also get this error when I put these values in databricks_template_schema.json.

Configuration

Create a custom template with the .tmpl file extension (following https://docs.databricks.com/en/dev-tools/bundles/custom-template.html).

Steps to reproduce the behavior

  1. Run databricks bundle init dab-container-template

Expected Behavior

When I run the command, it should read the values from the .json file and assign them to the respective variables.

Actual Behavior

Whenever I run the above-mentioned command, I get the following error:

template: :89: function "tasks" not defined (or: function "job" not defined)

As I mentioned above, I get this error whether I add the value through databricks_template_schema.json or use it in any abc_job.yml.tmpl.

OS and CLI version

OS: Windows. Currently using Databricks CLI v0.235.0.

Is this a regression?

I am getting this error in all versions, older and newer.

Debug Logs

: template: :89: function "tasks" not defined 21:43:25 ERROR failed execution pid=17564 exit_code=1 error="failed to compute file content for resources/workflows/Silver_Scoring_Job.yml.tmpl. error in resources:\n jobs:\n Silver_Scoring_Job_{{.company_code}}:\n name: "Silver Scoring Job {{.company_code}}${var.workflow_env}"\n permissions:\n - level: ${var.can_view_level_permission}\n group_name: ${var.can_view_level_permission_group_name}\n - level: ${var.can_manage_run_level_permission}\n group_name: ${var.can_run_level_permission_group_name}\n - level: ${var.can_manage_run_level_permission}\n user_name: ${var.can_manage_level_permission_user_name}\n - level: ${var.can_manage_run_level_permission}\n service_principal_name: ${var.can_manage_level_permission_for_service_principal_name_1}\n tasks:\n - task_key: Final_Model_Selection\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/1 - Final Model Selection"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Data_Processing_Future_Weeks\n depends_on:\n - task_key: Final_Model_Selection\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/2 - Data Preparation for Future Weeks"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Missing_Value_Treatment\n depends_on:\n - task_key: Data_Processing_Future_Weeks\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/3 - Missing value treatment"\n
source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Croston\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.1.3. Scoring Croston"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: ElasticNet\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.2.1. Scoring ElasticNet"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Holt\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.1.2 Scoring Holt"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: SES\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.1.1 Scoring Simple Exponential Smoothing"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: SMA\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.1.4. Scoring Simple Moving Average"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: XGB\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.2.2. Scoring XGB"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n libraries:\n - pypi:\n package: numpy==1.24.0\n - task_key: RUN_ENSEMBLING\n depends_on:\n - task_key: SES\n - task_key: ElasticNet\n - task_key: Holt\n - task_key: Croston\n - task_key: SMA\n - task_key: XGB\n condition_task:\n op: EQUAL_TO\n left: "{{ tasks.Final_Model_Selection.values.Run_Ensembling }}"\n right: "true"\n - task_key: Ensembling\n depends_on:\n - task_key: RUN_ENSEMBLING\n outcome: "true"\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.3. Ensembling - future forecasts"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Model_Results_Consolidation\n depends_on:\n - task_key: Ensembling\n - task_key: RUN_ENSEMBLING\n outcome: "false"\n run_if: AT_LEAST_ONE_SUCCESS\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/6 - Model Results Consolidation"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n job_clusters:\n - job_cluster_key: ${var.silver_scoring_job_cluster_key}\n new_cluster: ${var.silver_scoring_job_cluster}\n git_source:\n git_url: ${var.git_url}\n git_provider: ${var.git_provider}\n git_branch: "${var.git_branch}"\n tags:\n env: ${var.tag_env}\n retailer: ${var.tag_retailer}\n queue:\n enabled: true\n parameters:\n - name: series_name\n default: ${var.silver_scoring_job_series_name}\n run_as:\n service_principal_name: ${var.run_as_service_principal_name}\n: template: :89: function "tasks" not defined"

kshrikant7 avatar Nov 25 '24 16:11 kshrikant7

Hi @kshrikant7, thanks for reaching out! The supported syntax in the DABs templates is the same as that for Go text templates: https://pkg.go.dev/text/template

In these templates, you refer to variable values with a . prefix. For any field that you define in your databricks_template_schema.json file, the reference would look something like {{ .project_name }}. In the example you shared, project_name is the only key in the databricks_template_schema.json file, so only that can be interpolated.
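To make the distinction concrete, here is a minimal, self-contained Go sketch (my own illustration, not code from the CLI) showing that a dot-prefixed name resolves against the supplied values, while a bare identifier like tasks is parsed as a template function:

package main

import (
    "fmt"
    "os"
    "text/template"
)

func main() {
    // A dot-prefixed name resolves against the values supplied at
    // execution time, i.e. the fields from databricks_template_schema.json.
    ok := template.Must(template.New("ok").Parse("name: {{ .project_name }}\n"))
    _ = ok.Execute(os.Stdout, map[string]string{"project_name": "XYZ"})
    // Prints: name: XYZ

    // Without the dot, "tasks" is parsed as a template function name,
    // which produces exactly the error reported in this issue.
    _, err := template.New("bad").Parse("{{ tasks.A_task_key.values.B_task_key }}")
    fmt.Println(err)
    // Prints: template: bad:1: function "tasks" not defined
}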

You can refer to templates here as a reference for how the syntax works: https://github.com/databricks/cli/tree/main/libs/template/templates

Could you please share why you are trying to interpolate {{ tasks.A_task_key.values.B_task_key }}? I'm not sure the databricks bundle init command is the right one for your use case.

shreyas-goenka avatar Nov 25 '24 21:11 shreyas-goenka

@shreyas-goenka

I have defined variables other than project_name in databricks_template_schema.json, something like below:

{
  "properties": {
    "bundle_name": {
      "type": "string",
      "default": "ABC",
      "description": "Bundle name",
      "order": 1
    },
    "project_name": {
      "type": "string",
      "default": "XYZ",
      "description": "Project name",
      "order": 2
    },
    "company_code": {
      "type": "string",
      "default": "ABC",
      "description": "Company Name Code",
      "order": 10
    },
    "workflow_dev_env": {
      "type": "string",
      "default": "DEV",
      "description": "Workflow environment DEV/PRD",
      "order": 10
    }
  }
}

And I am passing the values to databricks.yml like below

variables:
  company_code:
    default: {{.company_code}}

  workflow_dev_env:
    default: {{.workflow_dev_env}}

And these values are accessed in the actual workflows the way variables are normally accessed:

${var.variable_name}

I am able to access all the other variables through the methods above, but the issue arises only when the value contains "{{tasks.A_task_key.values.B_task_key}}" or "{{job.parameters.ABC_parameter}}".

As for why I am using it, here are the full workflow scripts.

Workflow using {{job.parameters.ABC_parameter}}

resources:
  jobs:
    Alert_FTP_Ingestion_{{.company_code}}:
      name: "Alert FTP Ingestion {{.company_code}}${var.workflow_env}"
      permissions:
        - level: ${var.can_view_level_permission}
          group_name: ${var.can_view_level_permission_group_name}
        - level: ${var.can_manage_run_level_permission}
          group_name: ${var.can_run_level_permission_group_name}
        - level: ${var.can_manage_run_level_permission}
          user_name: ${var.can_manage_level_permission_user_name}
        - level: ${var.can_manage_run_level_permission}
          service_principal_name: ${var.can_manage_level_permission_for_service_principal_name_1}
      tasks:
        - task_key: skip_alert_ingestion_temp
          condition_task:
            op: EQUAL_TO
            left: "{{job.parameters.skip_alert_ingestion_temp}}"
            right: "true"
        - task_key: Export_FTP
          depends_on:
            - task_key: skip_alert_ingestion_temp
              outcome: "false"
          notebook_task:
            notebook_path: ""
            source: ${var.code_source}
          job_cluster_key: ${var.alert_FTP_ingestion_job_cluster_key}
      job_clusters:
        - job_cluster_key: ${var.alert_FTP_ingestion_job_cluster_key}
          new_cluster: ${var.alert_ftp_generation_job_cluster}
      git_source:
        git_url: ${var.git_url}
        git_provider: ${var.git_provider}
        git_branch: ${var.git_branch}
      tags:
        env: ${var.tag_env}
        retailer: ${var.tag_retailer}
      parameters:
        - name: skip_alert_ingestion_temp
          default: "${var.alert_FTP_ingestion_job_skip_alert_ingestion_temp}"
      run_as:
        service_principal_name: ${var.run_as_service_principal_name}

Workflow using {{tasks.A_task_key.values.B_task_key}}

resources:
  jobs:
    {{.project_name}}_{{.company_code}}:
      name: "{{.project_name}} {{.company_code}}${var.workflow_env}"
      email_notifications:
        on_failure:
          - ${var.on_failure_email_notification}
      schedule:
        quartz_cron_expression: ${var.schedule_quartz_cron_expression}
        timezone_id: ${var.schedule_timezone_id}
        pause_status: ${var.schedule_pause_status}    
      tasks:
        - task_key: Data_Ingestion_Job
          run_job_task:
            job_id: ${resources.jobs.Data_Ingestion_{{.company_code}}.id} 
        - task_key: Trigger_Pipelines
          depends_on:
            - task_key: Data_Ingestion_Job
          notebook_task:
            notebook_path: "notebooks/1.Data_Ingestion/NAUSWALGREEN/Trigger"
            source: ${var.code_source}
          job_cluster_key: ${var.trigger_pipeline_job_cluster_key}
          max_retries: 3
          min_retry_interval_millis: 600000
        - task_key: run_alert_ftp_ingestion
          depends_on:
            - task_key: Trigger_Pipelines
          condition_task:
            op: EQUAL_TO
            left: "{{ tasks.Trigger_Pipelines.values.Run_Export_FTP }}"
            right: "true"
        - task_key: Export_FTP_without_Alert
          depends_on:
            - task_key: run_alert_ftp_ingestion
              outcome: "true"
          run_job_task:
            job_id: ${resources.jobs.Alert_FTP_Ingestion_{{.company_code}}.id}
        - task_key: run_alerts
          depends_on:
            - task_key: Trigger_Pipelines
          condition_task:
            op: EQUAL_TO
            left: "{{ tasks.Trigger_Pipelines.values.Run_Broker_Pipeline }}"
            right: "true"
        - task_key: Alert_Generation_Job
          depends_on:
            - task_key: run_alerts
              outcome: "true" 
          run_job_task:
            job_id: ${resources.jobs.Alert_Generation_{{.company_code}}.id}
        - task_key: Alert_FTP_Ingestion
          depends_on:
            - task_key: Alert_Generation_Job
          run_job_task:
            job_id: ${resources.jobs.Alert_FTP_Ingestion_{{.company_code}}.id}
        - task_key: Refit_Run
          depends_on:
            - task_key: Alert_FTP_Ingestion
          condition_task:
            op: EQUAL_TO
            left: "{{ tasks.Trigger_Pipelines.values.Refit }}"
            right: "true"
        - task_key: Refit_Job
          depends_on:
            - task_key: Refit_Run
              outcome: "true"
          run_job_task:
            job_id: ${resources.jobs.Refit_{{.company_code}}.id}
        - task_key: Retrain_Run
          depends_on:
            - task_key: Alert_FTP_Ingestion
          condition_task:
            op: EQUAL_TO
            left: "{{ tasks.Trigger_Pipelines.values.Retrain }}"
            right: "true"
        - task_key: Retrain_Job
          depends_on:
            - task_key: Retrain_Run
              outcome: "true"
          run_job_task:
            job_id: ${resources.jobs.Retrain_{{.company_code}}.id}
      job_clusters:
        - job_cluster_key: ${var.trigger_pipeline_job_cluster_key}
          new_cluster: ${var.trigger_pipeline_job_cluster}
      git_source:
        git_url: ${var.git_url}
        git_provider: ${var.git_provider}
        git_branch: "${var.git_branch}"
      tags:
        env: ${var.tag_env}
        retailer: ${var.tag_retailer}
      queue:
        enabled: true
      run_as:
        service_principal_name: ${var.run_as_service_principal_name}

kshrikant7 avatar Nov 26 '24 04:11 kshrikant7

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

github-actions[bot] avatar Jan 02 '25 13:01 github-actions[bot]

I have the same issue currently where I am trying to use a custom template to define some basic parameters and tags that must be included in all of our bundles.

The way to access the job ID, for example, is to use {{ job.id }}.

However, when I add that to my template, it fails with the error: template: :61: function "job" not defined

It seems the template engine interprets that as Go template syntax at bundle initialization time, instead of emitting the literal string that the job expects.

Have there been any other updates on this?

DMcGhee94 avatar Mar 20 '25 13:03 DMcGhee94

Hi @DMcGhee94, you can use the following format, which I'm using currently:

"{{"{{"}}job.parameters.ABC{{"}}"}}"

which will result in {{job.parameters.ABC}}
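For anyone wondering why this works: {{"{{"}} and {{"}}"}} are ordinary template actions whose bodies are just string literals, so they emit the delimiter characters verbatim instead of opening a nested action. A minimal, self-contained Go sketch of the same mechanism (my own illustration, not code from the CLI):

package main

import (
    "os"
    "text/template"
)

func main() {
    // The actions {{"{{"}} and {{"}}"}} contain only string literals,
    // so they print the brace pairs verbatim; the text between them
    // passes through as ordinary template text.
    const src = `left: "{{"{{"}}job.parameters.ABC{{"}}"}}"` + "\n"
    t := template.Must(template.New("escape").Parse(src))

    // Prints: left: "{{job.parameters.ABC}}"
    _ = t.Execute(os.Stdout, nil)
}

A single-action form such as {{ "{{job.parameters.ABC}}" }} should render the same literal, since the braces sit inside a string literal, but the split form above is the one confirmed working in this thread.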

Hope this helps.

kshrikant7 avatar Mar 20 '25 17:03 kshrikant7

Hi @kshrikant7 ,

Your solution worked great and allowed me to bake these parameters into a custom bundle.

The syntax is a bit crazy to look at, so a cleaner solution would obviously be ideal. How did you happen to come across this, if you don't mind me asking?

DMcGhee94 avatar Mar 27 '25 19:03 DMcGhee94

@DMcGhee94 I don't remember exactly how I got this solution, but I think someone on this issue suggested using that format.

kshrikant7 avatar Mar 27 '25 19:03 kshrikant7

Closing this issue since the thread contains methods to successfully escape the job parameter syntax. Please reopen if there are any concerns that were not addressed.

shreyas-goenka avatar Jun 02 '25 05:06 shreyas-goenka