benchmarks: CIFAR10 data loading does not work with Python 3

I'm running tf_cnn_benchmarks.py under Python 3 with the resnet56 model and CIFAR10 data, and I get the following error during data loading:

Traceback (most recent call last):
  File "tf_cnn_benchmarks.py", line 60, in <module>
    app.run(main)  # Raises error on invalid flags, unlike tf.app.run()
  File "/anaconda3/envs/tf_19_py3/lib/python3.6/site-packages/absl/app.py", line 300, in run
    _run_main(main, args)
  File "/anaconda3/envs/tf_19_py3/lib/python3.6/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "tf_cnn_benchmarks.py", line 56, in main
    bench.run()
  File "/Users/dmsuehir/kube/dmsuehir_forks/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 1516, in run
    return self._benchmark_train()
  File "/Users/dmsuehir/kube/dmsuehir_forks/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 1633, in _benchmark_train
    build_result = self._build_graph()
  File "/Users/dmsuehir/kube/dmsuehir_forks/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 1672, in _build_graph
    (input_producer_op, enqueue_ops, fetches) = self._build_model()
  File "/Users/dmsuehir/kube/dmsuehir_forks/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 2138, in _build_model
    input_producer_stages) = self._build_input_processing(shift_ratio=0)
  File "/Users/dmsuehir/kube/dmsuehir_forks/benchmarks/scripts/tf_cnn_benchmarks/benchmark_cnn.py", line 2079, in _build_input_processing
    shift_ratio=shift_ratio)
  File "/Users/dmsuehir/kube/dmsuehir_forks/benchmarks/scripts/tf_cnn_benchmarks/preprocessing.py", line 626, in minibatch
    all_images, all_labels = dataset.read_data_files(subset)
  File "/Users/dmsuehir/kube/dmsuehir_forks/benchmarks/scripts/tf_cnn_benchmarks/datasets.py", line 122, in read_data_files
    inputs.append(cPickle.load(f))
  File "/anaconda3/envs/tf_19_py3/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 132, in read
    pywrap_tensorflow.ReadFromStream(self._read_buf, length, status))
  File "/anaconda3/envs/tf_19_py3/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 100, in _prepare_value
    return compat.as_str_any(val)
  File "/anaconda3/envs/tf_19_py3/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 107, in as_str_any
    return as_str(value)
  File "/anaconda3/envs/tf_19_py3/lib/python3.6/site-packages/tensorflow/python/util/compat.py", line 80, in as_text
    return bytes_or_text.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

This looks similar to the following TensorFlow issue, which is also caused by differences between Python 2 and Python 3: https://github.com/tensorflow/tensorflow/issues/11312

I'll post a PR with a fix that works with both Python 2 and Python 3.
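
For reference, here is a minimal sketch of the kind of fix that should work under both versions. It assumes the failure comes from the pickle file being read in text mode: a pickle written with protocol 2 begins with byte 0x80, which matches the error above. The helper name load_pickled_batch and the stand-in data are hypothetical, the sketch uses plain open() instead of the file API used in datasets.py, and it is not necessarily what the PR will do.

```python
import os
import pickle
import sys
import tempfile


def load_pickled_batch(path):
    """Load a pickle file under either Python 2 or Python 3 (hypothetical helper)."""
    # Python 2's pickle.load() has no `encoding` parameter, so pass it only on
    # Python 3. With encoding="bytes", Python 2 `str` objects come back as bytes.
    kwargs = {} if sys.version_info[0] == 2 else {"encoding": "bytes"}
    # Open in binary mode so nothing tries to decode the stream as UTF-8.
    with open(path, "rb") as f:
        return pickle.load(f, **kwargs)


# Stand-in for a CIFAR10 batch file (assumption: the real files are
# dictionaries pickled with protocol 2).
sample = {b"data": [0, 128, 255], b"labels": [3]}
fd, path = tempfile.mkstemp(suffix=".pkl")
with os.fdopen(fd, "wb") as f:
    pickle.dump(sample, f, protocol=2)

# The first byte of a protocol 2 pickle is 0x80, which is not valid UTF-8.
with open(path, "rb") as f:
    first_byte = f.read(1)

batch = load_pickled_batch(path)
os.remove(path)

print(first_byte == b"\x80")
print(batch[b"labels"])
```

Note that with encoding="bytes" the dictionary keys are bytes under Python 3, so any later lookups would need to use b"data" and b"labels" rather than plain strings.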

Aug 28 '18 18:08 dmsuehir

Thanks @dmsuehir. Due to some internal "magic", we do not see Python 3 issues, and we greatly appreciate it when they are found and fixed. The team is guilty (maybe) of mostly using Python 2. Thank you again; looking forward to the PR.

Aug 28 '18 18:08 tfboyd

@tfboyd I posted the PR. 😄

Aug 28 '18 19:08 dmsuehir