Creating an Operand tensor from a large NdArray fails with "exceeded maximum protobuf size of 2GB"
Hi, I generate an org.tensorflow.ndarray.DoubleNdArray from a Spark DataFrame. When I then try to create an Operand[TFloat64] tensor from it, I get this error:
scala> val featureVector = SparkConverter.sparkDataframeFeatureVectorConvertTfTensor(finalInputDf,"final_features" )
featureVector: org.tensorflow.ndarray.DoubleNdArray = org.tensorflow.ndarray.impl.dense.DoubleDenseNdArray@e3f6a6a0
scala> val ft = tf.constant(featureVector)
[libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/message_lite.cc:451] tensorflow.AttrValue exceeded maximum protobuf size of 2GB: 6279090916
org.tensorflow.exceptions.TFInvalidArgumentException: AttrValue missing value with expected type 'tensor'
for attr 'value'
; NodeDef: {{node Const}}; Op<name=Const; signature= -> output:dtype; attr=value:tensor; attr=dtype:type>
at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:87)
at org.tensorflow.EagerOperationBuilder.execute(EagerOperationBuilder.java:314)
at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:77)
at org.tensorflow.EagerOperationBuilder.build(EagerOperationBuilder.java:64)
at org.tensorflow.op.core.Constant.create(Constant.java:1350)
at org.tensorflow.op.core.Constant.tensorOf(Constant.java:521)
at org.tensorflow.op.Ops.constant(Ops.java:1669)
... 59 elided
But if I filter the DataFrame down to a smaller part, it works:
scala> val featureVector = SparkConverter.sparkDataframeFeatureVectorConvertTfTensor(finalInputDf.filter(col("pay_status").equalTo(1)),"final_features" )
featureVector: org.tensorflow.ndarray.DoubleNdArray = org.tensorflow.ndarray.impl.dense.DoubleDenseNdArray@627077a
scala> val ft_small = tf.constant(featureVector)
ft_small: org.tensorflow.op.core.Constant[org.tensorflow.types.TFloat64] = <Const 'Const_2'>
scala> ft_small.asTensor().numBytes()
res43: Long = 1058424696
Do I have to split the DoubleNdArray into several parts, or is there another way to convert it to an Operand[T]?
I found that java.util.Spliterator is available:
scala> featureVector.shape
res46: org.tensorflow.ndarray.Shape = [900021, 147]
scala> featureVector.scalars()
res47: org.tensorflow.ndarray.NdArraySequence[org.tensorflow.ndarray.DoubleNdArray] = org.tensorflow.ndarray.impl.sequence.FastElementSequence@f9698af
scala> featureVector.scalars().spliterator
res48: java.util.Spliterator[org.tensorflow.ndarray.DoubleNdArray] = java.util.Spliterators$IteratorSpliterator@bdc74838
scala> featureVector.scalars().spliterator.trySplit
res49: java.util.Spliterator[org.tensorflow.ndarray.DoubleNdArray] = java.util.Spliterators$ArraySpliterator@f4f92e6e
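For the splitting idea, here is a rough sketch of how the row ranges could be planned so that each part stays under a byte budget. The sizes are consistent with the output above: 900021 × 147 × 8 bytes = 1058424696, which is exactly what numBytes() reports for the filtered array. ChunkPlanner and rowRanges are hypothetical names, not part of TensorFlow Java, and the slicing call mentioned in the comment is my assumption about how the ranges would be applied, not something I have verified on the full DataFrame.

```scala
// Hypothetical helper (not a TensorFlow Java API): plans row ranges so that
// each chunk of a [rows, cols] float64 matrix stays within a byte budget.
object ChunkPlanner {
  val BytesPerDouble: Long = 8L

  /** Returns half-open row ranges [start, end) whose payload fits in maxBytes. */
  def rowRanges(rows: Long, cols: Long, maxBytes: Long): Seq[(Long, Long)] = {
    require(rows >= 0 && cols > 0 && maxBytes > 0, "invalid arguments")
    val bytesPerRow  = cols * BytesPerDouble
    val rowsPerChunk = maxBytes / bytesPerRow
    require(rowsPerChunk > 0, "a single row exceeds the byte budget")
    // Step through the rows in blocks; the last block is clipped to `rows`.
    (0L until rows by rowsPerChunk).map { start =>
      (start, math.min(start + rowsPerChunk, rows))
    }
  }
}

// Assumed application to the array from this post (untested):
//   import org.tensorflow.ndarray.index.Indices
//   ChunkPlanner.rowRanges(rows, cols, budget).map { case (s, e) =>
//     tf.constant(featureVector.slice(Indices.range(s, e), Indices.all()))
//   }
```

With the shape [900021, 147] and a 512 MiB budget, rowRanges(900021L, 147L, 512L * 1024 * 1024) gives two ranges, (0, 456522) and (456522, 900021). Whether the resulting constants can then be joined into one Operand without hitting the same limit is still an open question for me.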