Siamese-RPN-tensorflow

loss=nan

Open wml666 opened this issue 6 years ago • 3 comments

Following the procedure you described, I trained on the VOT data and got loss=nan:
step = 0, loss = nan, cls_loss = 0.694078922272, reg_loss = nan, lr = 0.0010000000475, time = 2.76396989822

wml666, Mar 19 '19 08:03
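In the log above, reg_loss is already NaN at step 0 while cls_loss is finite, so no weight update has happened yet and the input data is a plausible place to look. One possible cause (an assumption, not confirmed anywhere in this thread) is a ground-truth box with zero, negative, or non-finite width or height, since RPN-style regression targets usually take the log of a size ratio. A minimal sketch for scanning boxes, assuming they have already been converted to (x, y, w, h); the helper names `is_valid_box` and `find_bad_boxes` are hypothetical and not part of this repository:

```python
import math


def is_valid_box(box):
    """Return True if (x, y, w, h) is finite and has positive width and height."""
    x, y, w, h = box
    return all(math.isfinite(v) for v in (x, y, w, h)) and w > 0 and h > 0


def find_bad_boxes(boxes):
    """Return the indices of boxes that would break a log-based size encoding."""
    return [i for i, box in enumerate(boxes) if not is_valid_box(box)]


# Made-up example data: one good box, one with zero width, one with a NaN width.
boxes = [
    (10.0, 20.0, 50.0, 40.0),
    (5.0, 5.0, 0.0, 12.0),
    (1.0, 2.0, float("nan"), 3.0),
]
print(find_bad_boxes(boxes))  # [1, 2]
```

If this reports any indices for the real annotations, skipping or repairing those frames before training is one thing to try.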

@wml666 Hello, are you using VOT2016? Whenever I train with VOT2016, the run fails with the following error:

tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
  [[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
  [[Node: batch/_497 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_181_batch", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 106, in <module>
    t.train()
  File "train.py", line 74, in train
    sess.run([train_op,loss,cls_loss,reg_loss,lr,debug_pre_cls,debug_pre_reg,debug_pre_score,debug_pre_box,label,target_box,detection,gt_box])
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 877, in run
    run_metadata_ptr)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1100, in _run
    feed_dict_tensor, options, run_metadata)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run
    run_metadata)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
  [[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
  [[Node: batch/_497 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_181_batch", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Caused by op 'batch', defined at:
  File "train.py", line 106, in <module>
    t.train()
  File "train.py", line 31, in train
    template,_,detection,gt_box,_,_=self.reader.get_batch()
  File "/home/h/user/Siamese-RPN-tensorflow-master/utils/image_reader_cuda.py", line 182, in get_batch
    batch_size,num_threads=32,capacity=2048,shapes=[(127,127,3),(4),(255,255,3),(4),(2),()])
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 988, in batch
    name=name)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 762, in _batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 476, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 3480, in queue_dequeue_many_v2
    component_types=component_types, timeout_ms=timeout_ms, name=name)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
    return func(*args, **kwargs)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
    op_def=op_def)
  File "/home/h/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
    self._traceback = tf_stack.extract_stack()

OutOfRangeError (see above for traceback): FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 1, current size 0)
  [[Node: batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
  [[Node: batch/_497 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_181_batch", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

Thanks!

sing-hui, May 06 '19 12:05
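An OutOfRangeError saying the FIFO queue "is closed and has insufficient elements (requested 1, current size 0)" generally means the threads feeding the queue stopped before producing a single example. A common reason is that the reader found no usable files or failed on the first one, though this thread does not confirm that as the cause here. A rough stdlib-only sketch for checking the dataset folder before training follows. It assumes each sequence is a folder of image files plus a `groundtruth.txt` with one annotation line per frame; that layout, the default arguments, and the names `check_sequence` and `check_dataset` are assumptions for illustration, not taken from this repository.

```python
import os


def check_sequence(seq_dir, gt_name="groundtruth.txt", exts=(".jpg", ".jpeg", ".png")):
    """Return a list of human-readable problems found in one sequence folder."""
    if not os.path.isdir(seq_dir):
        return ["missing directory: " + seq_dir]

    problems = []
    # Collect image files by extension (case-insensitive).
    images = sorted(f for f in os.listdir(seq_dir) if f.lower().endswith(exts))
    if not images:
        problems.append("no image files in " + seq_dir)

    gt_path = os.path.join(seq_dir, gt_name)
    if not os.path.isfile(gt_path):
        problems.append("missing " + gt_path)
    else:
        # Count non-empty annotation lines and compare with the frame count.
        with open(gt_path) as f:
            n_lines = sum(1 for line in f if line.strip())
        if n_lines != len(images):
            problems.append(
                "%d annotations but %d images in %s" % (n_lines, len(images), seq_dir)
            )
    return problems


def check_dataset(root):
    """Run check_sequence on every sub-folder of root and collect the problems."""
    problems = []
    for name in sorted(os.listdir(root)):
        seq_dir = os.path.join(root, name)
        if os.path.isdir(seq_dir):
            problems.extend(check_sequence(seq_dir))
    return problems
```

Calling `check_dataset` on the dataset root and printing the returned list shows whether any sequence is empty or mismatched. An empty list only rules out this one class of problem. The reader can still fail for other reasons, such as a path in the training config that points somewhere else.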

I'm running into the same problem. How did you solve it?

zhaoym55, Jul 15 '19 07:07

@sing-hui How did you solve this problem?

liangliu123, Nov 21 '19 11:11