Seems to require adding the streaming jar to the hadoop classpath

Open dgleich opened this issue 15 years ago • 1 comments

In order to use this jar with hadoop streaming on a standalone installation of Cloudera CDH3 on Ubuntu Linux, I had to do two things:

  1. change the ant file to add:

         +++ b/build.xml
         @@ -21,6 +21,8 @@
         +    <fileset dir="${hadoop.home}"
         +             includes="contrib/streaming/hadoop-streaming-*.jar" />

  2. add the selected hadoop streaming jar to the HADOOP_CLASSPATH.
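For anyone repeating step 2, here is a minimal sketch of locating the streaming jar and building the classpath value. It reuses the `contrib/streaming/hadoop-streaming-*.jar` pattern from the fileset above. The helper names `find_streaming_jars` and `extend_classpath` are hypothetical, and the `/usr/lib/hadoop` fallback is only a placeholder assumption, not a path taken from this issue.

```python
import glob
import os


def find_streaming_jars(hadoop_home):
    """Return jars matching the same pattern as the build.xml fileset."""
    pattern = os.path.join(
        hadoop_home, "contrib", "streaming", "hadoop-streaming-*.jar"
    )
    return sorted(glob.glob(pattern))


def extend_classpath(current, jar):
    """Append jar to a classpath string, skipping empty and duplicate entries."""
    entries = [entry for entry in current.split(os.pathsep) if entry]
    if jar not in entries:
        entries.append(jar)
    return os.pathsep.join(entries)


if __name__ == "__main__":
    # Assumption: HADOOP_HOME is set; the fallback below is a placeholder.
    hadoop_home = os.environ.get("HADOOP_HOME", "/usr/lib/hadoop")
    jars = find_streaming_jars(hadoop_home)
    if jars:
        classpath = extend_classpath(
            os.environ.get("HADOOP_CLASSPATH", ""), jars[0]
        )
        # Print a line that can be pasted into a shell or hadoop-env.sh.
        print("export HADOOP_CLASSPATH=" + classpath)
    else:
        print("no streaming jar found under " + hadoop_home)
```

If more than one jar matches, the sketch simply takes the first in sorted order; pick the one that matches your Hadoop version instead.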

In dumbo/backends/streaming.py, I added:

    if addedopts['libjarstreaming'] and addedopts['libjarstreaming'][0] != 'no':
        addedopts['libjar'].append(streamingjar)

which seemed to be required to get it to work.
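To make the effect of that condition easier to see, here is a standalone sketch that mimics it outside of dumbo. The option names `libjarstreaming` and `libjar` come from the snippet above; the function `maybe_add_streaming_jar` and the assumption that `addedopts` is a plain dict mapping option names to lists of values are mine, so treat this as an illustration rather than dumbo's actual code.

```python
def maybe_add_streaming_jar(addedopts, streamingjar):
    """Append streamingjar to the libjar list unless libjarstreaming is 'no'.

    Assumption: addedopts maps option names to lists of string values.
    """
    flag = addedopts.get('libjarstreaming')
    # An empty or missing option list leaves libjar untouched,
    # as does an explicit 'no' as the first value.
    if flag and flag[0] != 'no':
        # setdefault is a convenience of this sketch; the original
        # snippet indexes addedopts['libjar'] directly.
        addedopts.setdefault('libjar', []).append(streamingjar)
    return addedopts


if __name__ == "__main__":
    opts = {'libjarstreaming': ['yes'], 'libjar': []}
    print(maybe_add_streaming_jar(opts, 'hadoop-streaming.jar'))
```

Note that the jar is only added when the option is present with a value other than `no`, so the option has to be passed explicitly for the patch to take effect.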

Without this, I always got an error saying it couldn't find org.apache.hadoop.typedbytes.TypedBytesWritable for the Partition function:

Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/typedbytes/TypedBytesWritable at fm.last.feathers.partition.Prefix.(Unknown Source)

After that, I was able to use the partition/Prefix class successfully.

dgleich • Sep 28 '10 19:09

Thanks for the tip!

treystout • May 27 '11 18:05