
Need an example of creating DDL for a Hive Parquet table with EEL

Open hannesmiller opened this issue 8 years ago • 0 comments

  • CsvSource to HiveSink:
val schema = AvroSchemaFns.fromAvroSchema(new Schema.Parser().parse(new File("user.avsc")))
CsvSource(path)
  .withSchema(schema)
  .to(HiveSink("mydatabase", "myTable"))
  • Table fields: fname, lname, age, salary
  • Two partition keys: country and city
object EelCreateTableExample extends App {
  val createTableCommand = HiveDDL.showDDL(
    tableName = "mydatabase.mytable",
    partitions = Seq(
      PartitionColumn("country", StringType),
      PartitionColumn("city", StringType)
    ),
    fields = Seq(
      Field("fname", StringType),
      Field("lname", StringType),
      Field("age", IntType.Signed),
      Field("salary", DecimalType(38, 5))
    ),
    tableType = TableType.EXTERNAL_TABLE,
    location = Some("hdfs://nameservice1/blah/mytable_location"),
    serde = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe",
    inputFormat = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat",
    outputFormat = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat",
    props = Map.empty,
    tableComment = Some("my lovely table"),
    ifNotExists = true
  )
  println(createTableCommand)
}
  • Output:
CREATE EXTERNAL TABLE IF NOT EXISTS `mydatabase.mytable` (
   `fname` string,
   `lname` string,
   `age` int,
   `salary` decimal(38,5))
PARTITIONED BY (
   `country` string,
   `city` string)
ROW FORMAT SERDE
   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION 'hdfs://nameservice1/blah/mytable_location'
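To make the mapping from fields and partition keys to the DDL text easier to follow, here is a minimal plain-Scala sketch that assembles a statement of the same shape. It is not how EEL builds its DDL: `DdlSketch`, `Column`, `columns`, and `createExternalParquetTable` are hypothetical names used only for illustration, and the sketch covers only the external, partitioned Parquet case shown above.

```scala
// Illustration only: hypothetical helper, not part of EEL.
object DdlSketch {

  // A column name paired with its Hive type name, e.g. Column("age", "int").
  final case class Column(name: String, hiveType: String)

  // Renders columns as "   `name` type", separated by a comma and a newline.
  def columns(cols: Seq[Column]): String =
    cols.map(c => s"   `${c.name}` ${c.hiveType}").mkString(",\n")

  // Builds a CREATE EXTERNAL TABLE statement for a partitioned Parquet table.
  def createExternalParquetTable(
      table: String,
      fields: Seq[Column],
      partitions: Seq[Column],
      location: String
  ): String =
    Seq(
      s"CREATE EXTERNAL TABLE IF NOT EXISTS `$table` (",
      columns(fields) + ")",
      "PARTITIONED BY (",
      columns(partitions) + ")",
      "ROW FORMAT SERDE",
      "   'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'",
      "STORED AS INPUTFORMAT",
      "   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'",
      "OUTPUTFORMAT",
      "   'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'",
      s"LOCATION '$location'"
    ).mkString("\n")

  def main(args: Array[String]): Unit =
    println(
      createExternalParquetTable(
        table = "mydatabase.mytable",
        fields = Seq(
          Column("fname", "string"),
          Column("lname", "string"),
          Column("age", "int"),
          Column("salary", "decimal(38,5)")
        ),
        partitions = Seq(Column("country", "string"), Column("city", "string")),
        location = "hdfs://nameservice1/blah/mytable_location"
      )
    )
}
```

The partition keys go in their own PARTITIONED BY clause rather than in the main column list, which is why the example above passes them separately from the fields.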

hannesmiller avatar Mar 01 '17 12:03 hannesmiller