google-cloud-python icon indicating copy to clipboard operation
google-cloud-python copied to clipboard

Configuration YAML for changing table description text not working in my case

Open steve-ahlswede opened this issue 2 years ago • 2 comments

As the title suggests, I am not able to get a changed table description text when I perform a translation of a HIVEQL ddl file. The original description text persists in the translated version. I do not have the same problem when I try to use other transformation config.yaml files, such as one where I set external: false to avoid CREATE TABLE from becoming CREATE EXTERNAL TABLE. Details on how to reproduce below.

Environment details

  • OS type and version: MacOS Monterey 12.6
  • Python version: 3.9
  • pip version: 22.0.4 (I use poetry v1.4.2 if that makes a difference)
  • google-cloud-bigquery-migration version: 0.11.0

Steps to reproduce

  1. Create a "source" .ddl file with the following contents
CREATE TABLE `example.table`(
`col1` bigint, 
`col2` bigint, 
`col3` boolean, 
`col4` timestamp, 
`col5` bigint, 
`col6` bigint)
COMMENT 'email: [email protected]'
ROW FORMAT SERDE 
'org.apache.hadoop.hive.ql.io.orc.OrcSerde' 
STORED AS INPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' 
OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
'hdfs://path/to/my/files'
TBLPROPERTIES (
'_artifact_placeholder'='{{ artifact:holder }}', 
'bucketing_version'='2', 
'transient_lastDdlTime'='1682664243')
  1. Add a file named change-description.config.yaml with the following contents
type: object_rewriter
relation:
  - match:
      db: my-gcp-project
      schema: example
    description:
      text: "example"
  1. Move these files to a bucket on GCP inside a 'folder' named folder_containing_ddl_and_config_file

  2. Execute a migration workflow

MY_GCP_PROJECT = 'my-gcp-project'
BASE_GCS_BUCKET = 'my-bucket-name'
gcs_input_full_path = f"gs://{BASE_GCS_BUCKET}/folder_containing_ddl_and_config_file"
gcs_output_full_path = f"gs://{BASE_GCS_BUCKET}/folder_containing_ddl_and_config_file_translated"

parent = f"projects/{MY_GCP_PROJECT}/locations/eu"
client = bigquery_migration_v2.MigrationServiceClient()

source_dialect = bigquery_migration_v2.Dialect()
source_dialect.hiveql_dialect = bigquery_migration_v2.HiveQLDialect()
target_dialect = bigquery_migration_v2.Dialect()
target_dialect.bigquery_dialect = bigquery_migration_v2.BigQueryDialect()
translation_env = bigquery_migration_v2.SourceEnv(
      default_database=MY_GCP_PROJECT, schema_search_path='example'
)

translation_config = bigquery_migration_v2.TranslationConfigDetails(
      gcs_source_path=gcs_input_full_path,
      gcs_target_path=gcs_output_full_path,
      source_dialect=source_dialect,
      target_dialect=target_dialect,
      source_env=translation_env,
)
migration_task = bigquery_migration_v2.MigrationTask(
      type_="Translation_HiveQL2BQ", translation_config_details=translation_config
)
workflow = bigquery_migration_v2.MigrationWorkflow(display_name=job_name)
workflow.tasks["translation"] = migration_task

request = bigquery_migration_v2.CreateMigrationWorkflowRequest(
      parent=parent, migration_workflow=workflow
)
response = client.create_migration_workflow(request=request)
  1. Check the resulting file. In my case it looks like below, where the description text is unchanged from what it was in the original file.
-- Translation time: 2023-05-10T09:28:40.261922Z
-- Translation job ID: 28288a12-c0ba-1bd2-8c2e-d640b91c3b5b
-- Source: my-gcp-bucket/folder_containing_ddl_and_config_file/step_1_ddl_file.ddl
-- Translated from: Hive
-- Translated to: BigQuery

CREATE EXTERNAL TABLE `my-gcp-project`.example.table
(
  col1 INT64,
  col2 INT64,
  col3 BOOL,
  col4 DATETIME,
  col5 INT64,
  col6 INT64
)
OPTIONS(
  description='email: [email protected]',
  format='ORC',
  uris=[
    'hdfs://path/to/my/files'
  ]
);

Thanks!

steve-ahlswede avatar May 10 '23 09:05 steve-ahlswede

I'm going to transfer this issue to the google-cloud-python repository as we are preparing to move the code in this repository to google-cloud-python in the next 1-2 weeks.

parthea avatar Aug 15 '23 00:08 parthea

Hi, @steve-ahlswede . Were you able to resolve this issue?

vchudnov-g avatar Apr 30 '24 18:04 vchudnov-g

I'm closing this issue since we haven't gotten a response. If this is still happening, go through the steps in our SUPPORT page to ensure you report the issue in the place that will give you the promptest response.

vchudnov-g avatar Jun 24 '24 19:06 vchudnov-g