spark-perf

Input Data File Location

Open himanshurajput2 opened this issue 9 years ago • 1 comment

Hello,

I am running the k-means algorithm with Spark on a YARN setup. Where is the input data file generated by spark-perf stored, or is the data kept in memory only?

Thanks

himanshurajput2 · Aug 01 '16 23:08

Hi, I have the same question. It seems the data is read from and written to the HDFS location specified in config.py, but I didn't see any files created in HDFS during the test. Is the input dataset created on the fly, or do we need to populate the datasets in HDFS before running the test? If it is the latter, does anyone know where the test datasets are? Thanks!

dcvan24 · Feb 20 '17 17:02