Error in CvxNet eval.py: You must feed a value for placeholder tensor 'Placeholder_3' with dtype float

Open zohaibmohammad opened this issue 5 years ago • 7 comments

Hi, I have trained the model for RGB-to-3D and Depth-to-3D. Now I am running eval.py to evaluate the trained models. However, both cases fail with the error shown below:

tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'Placeholder_3' with dtype float [[node Placeholder_3 (defined at /code/cvxnet/graphics/tensorflow_graphics/projects/cvxnet/eval.py:67) ]]

I did not change any parameters. Can anyone please tell me what the error is here? I am new to TensorFlow :) Thanks in advance.

-mz

zohaibmohammad avatar Jul 29 '20 11:07 zohaibmohammad

Can you please tell us what command you used to run the evaluation job?

Awcrr avatar Jul 30 '20 20:07 Awcrr

Thanks for your reply. I was using save_summaries_steps=3 and log_step_count_steps=3. After setting these parameters to None, the placeholder error was resolved. Now there is another error in evaluation. Here are the details:

2020-08-01 22:07:07.690117: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-08-01 22:08:09.111455: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7

Traceback (most recent call last):
  File "/home/mz/anaconda3/envs/cvxNet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/home/mz/anaconda3/envs/cvxNet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/home/mz/anaconda3/envs/cvxNet/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)


tensorflow.python.framework.errors_impl.InvalidArgumentError: 2 root error(s) found.

  (0) Invalid argument: Determined shape must either match input shape along split_dim exactly if fully specified, or be less than the size of the input along split_dim if not fully specified.  Got: 3950
	 [[{{node split}}]]
	 [[add_5/_675]]
  (1) Invalid argument: Determined shape must either match input shape along split_dim exactly if fully specified, or be less than the size of the input along split_dim if not fully specified.  Got: 3950
	 [[{{node split}}]]

0 successful operations.
0 derived errors ignored.


The batch size is 16. The command I am using to evaluate is python eval.py --train_dir=/tmp/cvxnet/models/depth --data_dir=/tmp/cvxnet_data --n_half_palnes=50 --extract_mesh

Here, train_dir contains the trained model (checkpoint, 3 model.* files) and data_dir has the mini dataset (as provided with the code).

zohaibmohammad avatar Aug 01 '20 20:08 zohaibmohammad

Glad you resolved the previous issue.

For the new error, just change --n_half_palnes to --n_half_planes. I guess you copy-pasted the command from the previous README, so this is actually our fault. We are really sorry about this typo. Making this change should fix the issue. We also fixed the typo in the README last week.

Awcrr avatar Aug 05 '20 18:08 Awcrr

Hi Boyang Deng (@Awcrr ),

Thanks for your reply. I have updated the parameter as you suggested. However, there is still an error, which may be caused by low GPU memory. I am running the code on a 4GB GPU. Could you please look at the error below and suggest how I could resolve it?

2020-08-06 13:18:24.804314: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-08-06 13:19:38.265769: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-08-06 13:19:49.306563: W tensorflow/core/common_runtime/bfc_allocator.cc:434] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.79GiB (rounded to 3001500160)
Current allocation summary follows.  

.
.
.
2020-08-06 13:19:49.308971: I tensorflow/core/common_runtime/bfc_allocator.cc:1010] Stats: 
Limit:                  3262513152
InUse:                   203106304
MaxInUse:               1315410176
NumAllocs:                     366
MaxAllocSize:           1162108928

2020-08-06 13:19:49.308995: W tensorflow/core/common_runtime/bfc_allocator.cc:439] *****_***___________________________________________________________________________________________
2020-08-06 13:19:49.309048: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at tile_ops.cc:220 : Resource exhausted: OOM when allocating tensor with **shape[1,50,50,100050,3]** and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):

Please note the shape[1,50,50,100050,3] part of the error. The error is the same if I reduce the batch size. I tried batch sizes 32, 16, 8, 2 and 1.

zohaibmohammad avatar Aug 06 '20 11:08 zohaibmohammad

Long story short, you can change the 1e5 number here to a small enough number that can fit into your GPU memory.

We use a GPU with 8 GB of memory for our experiments, which is why we selected 1e5. The batch_size value won't affect this, as we only use a batch size of 1 for mesh extraction.
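
As a rough sanity check on the numbers in the log above, the size of the tensor named in the OOM message can be worked out by hand. The sketch below is plain Python arithmetic, not code from eval.py. It assumes float32 (4 bytes per element) and assumes that the fourth dimension (100050) is the one that shrinks when the 1e5 number is lowered. `max_points` is a hypothetical helper introduced only for illustration.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor; 4 bytes per element assumes float32."""
    n = 1
    for dim in shape:
        n *= dim
    return n * bytes_per_element


# Shape copied from the OOM message in the log above.
OOM_SHAPE = (1, 50, 50, 100050, 3)


def max_points(budget_bytes, shape=OOM_SHAPE, points_axis=3):
    """Hypothetical helper: largest size of the points axis for which this
    single tensor still fits in budget_bytes. Assumes (not confirmed by the
    source) that points_axis is the dimension controlled by the 1e5 number."""
    per_point = tensor_bytes(shape) // shape[points_axis]
    return budget_bytes // per_point


if __name__ == "__main__":
    # Matches the "rounded to 3001500160" allocation reported by the allocator.
    print(tensor_bytes(OOM_SHAPE))  # 3001500000
    # Example budget of 1 GiB for this one tensor.
    print(max_points(2**30))
```

This only accounts for one tensor. Other allocations share the same GPU, so the usable value is likely lower than what this estimate suggests, and some trial and error may still be needed.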

Awcrr avatar Aug 06 '20 16:08 Awcrr

Hi, thanks for your help. I am working on it. Can you please share the complete transformed dataset in tfrecord format? Or, if possible, can you share how to convert the dataset from ShapeNet to tfrecord?

zohaibmohammad avatar Aug 06 '20 16:08 zohaibmohammad

Hi @Awcrr, the code (training and testing) works properly on a 12 GB GPU for the given sample_dataset, which consists of only one category: telephone. Thanks for your help.

Can you please share the complete dataset or give me a clue about how to create it?

zohaibmohammad avatar Aug 11 '20 08:08 zohaibmohammad