InvalidArgumentError (see above for traceback): indices[0,7] = -1 is not in [0, 157)

Open abadcd opened this issue 7 years ago • 11 comments

After downloading ShapeNet.tar, I extracted it as the documentation describes. Because my machine is underpowered, I did not train on the full dataset; I used only part of the extracted content. When I ran python train.py, I got the following error, which I have not been able to resolve. Any help would be appreciated.

Model restored from file: utils/checkpoint/gcn.ckpt
Traceback (most recent call last):
  File "train.py", line 101, in <module>
    _, dists,out1,out2,out3 = sess.run([model.opt_op,model.loss,model.output1,model.output2,model.output3], feed_dict=feed_dict)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1100, in _run
    feed_dict_tensor, options, run_metadata)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run
    run_metadata)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0,7] = -1 is not in [0, 157)
  [[Node: GatherV2_12 = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat_2, strided_slice_15, gradients/graphconvolution_1/SparseTensorDenseMatMul/SparseTensorDenseMatMul_grad/GatherV2/axis)]]

Caused by op u'GatherV2_12', defined at:
  File "train.py", line 47, in <module>
    model = GCN(placeholders, logging=True)
  File "build/bdist.linux-x86_64/egg/pixel2mesh/models.py", line 108, in __init__
    self.build()
  File "build/bdist.linux-x86_64/egg/pixel2mesh/models.py", line 74, in build
    self._loss()
  File "build/bdist.linux-x86_64/egg/pixel2mesh/models.py", line 118, in _loss
    self.loss += .2*laplace_loss(self.inputs, self.output1, self.placeholders, 1)
  File "build/bdist.linux-x86_64/egg/pixel2mesh/losses.py", line 16, in laplace_loss
    lap1 = laplace_coord(pred1, placeholders, block_id)
  File "build/bdist.linux-x86_64/egg/pixel2mesh/losses.py", line 10, in laplace_coord
    laplace = tf.reduce_sum(tf.gather(vertex, indices), 1)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 2659, in gather
    return gen_array_ops.gather_v2(params, indices, axis, name=name)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3142, in gather_v2
    "GatherV2", params=params, indices=indices, axis=axis, name=name)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/root/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): indices[0,7] = -1 is not in [0, 157)
  [[Node: GatherV2_12 = GatherV2[Taxis=DT_INT32, Tindices=DT_INT32, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](concat_2, strided_slice_15, gradients/graphconvolution_1/SparseTensorDenseMatMul/SparseTensorDenseMatMul_grad/GatherV2/axis)]]

abadcd avatar Jan 04 '19 09:01 abadcd

I have the same issue, have you fixed it?

hguomin avatar Feb 19 '19 13:02 hguomin

@hguomin https://github.com/nywang16/Pixel2Mesh/issues/23

abadcd avatar Feb 20 '19 03:02 abadcd

@abadcd Thank you!!!

hguomin avatar Feb 20 '19 06:02 hguomin

I ran into the same problem and installed the GPU version, but it still does not work. Can anyone help me?

Weipeilang avatar May 18 '19 16:05 Weipeilang

Try restarting your machine after installing the GPU drivers. It might help.

gopikrishnachaganti avatar May 18 '19 17:05 gopikrishnachaganti

Thank you for replying. I did what you suggested, but it did not help. Do you have any other ideas? Or could you send me your code?

Weipeilang avatar May 19 '19 15:05 Weipeilang

@Weipeilang I just finished installing the GPU version of TensorFlow, but I still have this problem. Have you solved it yet?

zjxyz avatar Jun 27 '19 02:06 zjxyz

As the TensorFlow documentation for tf.gather explains: "Note that on CPU, if an out of bound index is found, an error is returned. On GPU, if an out of bound index is found, a 0 is stored in the corresponding output value."

It is also discussed in this issue.

One solution is to change the project function in layers.py as follows:
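To make the CPU/GPU difference concrete, here is a minimal pure-Python sketch of the two behaviours. It is not TensorFlow code: gather_cpu, gather_gpu and remap_padding are hypothetical helpers written for illustration. Treating -1 as a "no neighbour" padding marker, and appending an all-zero row for it to point at, is an assumption based on the error message rather than something confirmed from the repository.

```python
def gather_cpu(rows, indices):
    """Mimic the CPU behaviour: any out-of-range index raises an error."""
    n = len(rows)
    out = []
    for i in indices:
        if not 0 <= i < n:
            raise IndexError("index %d is not in [0, %d)" % (i, n))
        out.append(rows[i])
    return out


def gather_gpu(rows, indices):
    """Mimic the GPU behaviour: out-of-range indices yield zeros."""
    n = len(rows)
    zero = [0.0] * len(rows[0])
    return [rows[i] if 0 <= i < n else zero for i in indices]


def remap_padding(indices, zero_row):
    """Point every negative (padding) index at an explicit all-zero row."""
    return [zero_row if i < 0 else i for i in indices]


# Three vertices plus one appended all-zero row (index 3).
vertices = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [0.0, 0.0, 0.0]]
neighbours = [0, 2, -1]  # -1 is assumed to mean "no neighbour"

# After remapping, the strict CPU-style gather matches the lenient GPU-style one.
safe = remap_padding(neighbours, zero_row=len(vertices) - 1)
assert gather_cpu(vertices, safe) == gather_gpu(vertices, neighbours)
```

If the -1 entries really are padding, remapping them in the input data before feeding the graph might let the model run on CPU as well, though that is untested here.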

import tensorflow as tf  # TensorFlow 1.x API (tf.ceil, tf.floor)

def project(img_feat, x, y, dim):
    # Bilinearly sample img_feat (H x W x dim) at the continuous points (x, y).
    # The four integer corners surrounding each point:
    x1 = tf.floor(x)
    x2 = tf.ceil(x)
    y1 = tf.floor(y)
    y2 = tf.ceil(y)

    # add a padding to img_feat: one zero row and one zero column at the end,
    # so a ceil() index equal to H or W reads zeros instead of going out of
    # bounds. Negative indices are not covered by this padding.
    paddings = tf.constant([[0, 1], [0, 1], [0, 0]])
    img_feat = tf.pad(img_feat, paddings, 'CONSTANT')

    # Gather the feature vectors at the four corners.
    Q11 = tf.gather_nd(img_feat, tf.stack([tf.cast(x1, tf.int32), tf.cast(y1, tf.int32)], 1))
    Q12 = tf.gather_nd(img_feat, tf.stack([tf.cast(x1, tf.int32), tf.cast(y2, tf.int32)], 1))
    Q21 = tf.gather_nd(img_feat, tf.stack([tf.cast(x2, tf.int32), tf.cast(y1, tf.int32)], 1))
    Q22 = tf.gather_nd(img_feat, tf.stack([tf.cast(x2, tf.int32), tf.cast(y2, tf.int32)], 1))

    # Weight each corner by the area of the opposite sub-rectangle.
    weights = tf.multiply(tf.subtract(x2, x), tf.subtract(y2, y))
    Q11 = tf.multiply(tf.tile(tf.reshape(weights, [-1, 1]), [1, dim]), Q11)

    weights = tf.multiply(tf.subtract(x, x1), tf.subtract(y2, y))
    Q21 = tf.multiply(tf.tile(tf.reshape(weights, [-1, 1]), [1, dim]), Q21)

    weights = tf.multiply(tf.subtract(x2, x), tf.subtract(y, y1))
    Q12 = tf.multiply(tf.tile(tf.reshape(weights, [-1, 1]), [1, dim]), Q12)

    weights = tf.multiply(tf.subtract(x, x1), tf.subtract(y, y1))
    Q22 = tf.multiply(tf.tile(tf.reshape(weights, [-1, 1]), [1, dim]), Q22)

    # Sum the four weighted corner features.
    outputs = tf.add_n([Q11, Q21, Q12, Q22])
    return outputs

I think the projection of the mesh vertices onto the feature maps may fall outside the index range of the feature maps. The GPU fills in zeros automatically, while the CPU throws an error.
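An alternative way to keep a projected point inside the feature map is to clamp its coordinates before looking up the corners. The pure-Python sketch below illustrates that idea on a small 2-D grid. It is a different technique from the padding approach, not the patch itself, and sample_bilinear_clamped is a hypothetical name chosen for this example.

```python
import math


def sample_bilinear_clamped(grid, x, y):
    """Bilinearly sample a 2-D grid at (x, y), clamping to the valid range."""
    h, w = len(grid), len(grid[0])
    # Clamp the continuous coordinates so no corner index can leave the grid.
    x = min(max(x, 0.0), h - 1.0)
    y = min(max(y, 0.0), w - 1.0)

    x1, y1 = int(math.floor(x)), int(math.floor(y))
    x2, y2 = min(x1 + 1, h - 1), min(y1 + 1, w - 1)
    fx, fy = x - x1, y - y1  # fractional offsets in [0, 1)

    return ((1 - fx) * (1 - fy) * grid[x1][y1]
            + fx * (1 - fy) * grid[x2][y1]
            + (1 - fx) * fy * grid[x1][y2]
            + fx * fy * grid[x2][y2])


grid = [[0.0, 1.0],
        [2.0, 3.0]]
inside = sample_bilinear_clamped(grid, 0.5, 0.5)    # centre of the four cells
outside = sample_bilinear_clamped(grid, 5.0, -3.0)  # clamped to grid[1][0]
```

Clamping repeats the border value for out-of-range points, whereas zero padding makes them fade to zero; which of the two suits training better is not settled in this thread.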

zhongjinluo avatar Aug 28 '19 07:08 zhongjinluo

Solved it thanks to tensorflow gpu. Here is what I did: conda create --name pixmesh tensorflow-gpu=1 python=2.7 then pip install tflearn

Sayan-m90 avatar Jun 25 '20 06:06 Sayan-m90

Hi, I am trying to run this on Colab. With python2 -m pip install tensorflow==1.3.0 I get the same issue, and with python2 -m pip install tensorflow-gpu==1.3.0 I get the new error below. Has anyone succeeded in training this model and knows how to solve this problem, or which version of TensorFlow I should use? Thanks.

Traceback (most recent call last):
  File "train.py", line 17, in <module>
    import tensorflow as tf
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 52, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems
for some common reasons and solutions. Include the entire stack trace above this error message when asking for help.

lzw365-code avatar Apr 13 '21 05:04 lzw365-code

@lzw365-code I ran into the same issue and was able to resolve it by installing cuDNN 6 (below is the only direct download link I found):

!wget https://developer.download.nvidia.com/compute/redist/cudnn/v6.0/libcudnn6_6.0.20-1+cuda7.5_amd64.deb
!sudo apt install ./libcudnn6_6.0.20-1+cuda7.5_amd64.deb

After installing cuDNN I ran into the same problem again. The solution is to remove line 22 of train.py, which selects the CUDA device:

os.environ['CUDA_VISIBLE_DEVICES'] = '1'

That line left CUDA without a valid device, so TensorFlow fell back to the CPU code path.
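For anyone who would rather keep a device selection in the script than delete the line, here is a minimal sketch that only sets the variable when it is not already set. The value '1' refers to the second GPU, so on a machine with a single GPU no device is visible. select_gpu is a hypothetical helper, and the default of "0" (the first GPU) is an assumption that fits a single-GPU machine.

```python
import os


def select_gpu(default="0", environ=None):
    """Set CUDA_VISIBLE_DEVICES only when the caller has not set it already."""
    env = os.environ if environ is None else environ
    # setdefault leaves an existing value untouched, so the device can still
    # be overridden from the shell.
    return env.setdefault("CUDA_VISIBLE_DEVICES", default)


# Call this before TensorFlow initialises CUDA; placing it before the
# TensorFlow import is the safest spot.
chosen = select_gpu("0")
```

Passing a plain dict as environ makes the helper easy to try out without touching the real process environment.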

Michael-H1302 avatar Nov 26 '22 22:11 Michael-H1302