cobrix
.option("input_split_size_mb", 100)
Background [Optional]
This parameter (input_split_size_mb) works only for VariableLengthParameters, i.e. variable-length files.
Question
Will I be able to use input_split_size_mb with fixed-length files?
Fixed-length files are split using Spark's binaryRecords() API (https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.SparkContext.binaryRecords.html),
which does not require tuning and is highly performant and scalable, so input_split_size_mb is not used.
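To illustrate why fixed-length files need no split tuning, here is a minimal pure-Python sketch. It is not Spark's or Cobrix's actual code; the function name `record_aligned_splits` and its parameters are hypothetical. The idea is that when every record has the same length, any target split size can be rounded down to a whole number of records, so splits always start and end on record boundaries.

```python
def record_aligned_splits(file_size, record_length, target_split_size):
    """Return (start, end) byte ranges that each hold whole records.

    Hypothetical helper for illustration only; assumes the file consists
    solely of records that are exactly record_length bytes long.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if file_size % record_length != 0:
        raise ValueError("file size is not a multiple of the record length")

    # Round the target split size down to a whole number of records,
    # keeping at least one record per split.
    records_per_split = max(1, target_split_size // record_length)
    split_size = records_per_split * record_length

    # Every range starts on a record boundary; the last one may be shorter.
    return [
        (start, min(start + split_size, file_size))
        for start in range(0, file_size, split_size)
    ]


# 10 records of 100 bytes each, with a target split of 350 bytes.
splits = record_aligned_splits(file_size=1000, record_length=100, target_split_size=350)
```

Variable-length files have no such arithmetic shortcut, because record boundaries are only known after reading the data, which is where a setting like input_split_size_mb becomes relevant.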
There are certain exceptions, but I'm not sure if your use case is related to this.
Which code snippet do you use to load fixed-length files?
Sorry for the late response. Question no: #521