onnxruntime-gpu, CUDA options: results differ between runs
Describe the issue
When I use onnxruntime for inference with the CUDA execution provider enabled, the results differ from run to run, starting at about the 4th decimal place. Is this a precision problem caused by copying data between CPU and GPU? I used a for loop to test this conjecture: the results differ when the for loop is placed before the session creation (so the session is re-created in every iteration, as in the snippet below), but NOT when the loop is placed after the session has been created. If I turn off CUDA and run inference on the CPU, the problem does not occur in any case.
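For reference, a minimal sketch (an assumption on my part, not verified) of pinning the cuDNN convolution algorithm search via the OrtCUDAProviderOptions struct: the exhaustive search can select different convolution algorithms for each newly created session, which changes the floating-point accumulation order and could explain small per-session differences.

// Sketch only (assumption): pin the cuDNN conv algorithm search so that every
// session created inside the loop selects the same algorithms.
OrtCUDAProviderOptions cuda_options{};
cuda_options.device_id = DEVICE_ID;
cuda_options.cudnn_conv_algo_search = OrtCudnnConvAlgoSearchDefault;  // instead of Exhaustive
session_options.AppendExecutionProvider_CUDA(cuda_options);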
To reproduce
#include <onnxruntime_cxx_api.h>
#include <opencv2/opencv.hpp>

Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "meter_recon");
Ort::SessionOptions session_options;
session_options.SetIntraOpNumThreads(NUM_THREADS);
session_options.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_EXTENDED);

#define USE_CUDA
#ifdef USE_CUDA
AF_INFO("USE CUDA, DEVICE_ID={:d}", DEVICE_ID);
// OrtCUDAProviderOptions cuda_options;
// cuda_options.device_id = DEVICE_ID;
// cuda_options.cudnn_conv_algo_search = OrtCudnnConvAlgoSearchExhaustive;
// cuda_options.gpu_mem_limit = std::numeric_limits<size_t>::max();
// cuda_options.arena_extend_strategy = 0;
// cuda_options.do_copy_in_default_stream = true;
// session_options.AppendExecutionProvider_CUDA(cuda_options);
OrtSessionOptionsAppendExecutionProvider_CUDA(session_options, DEVICE_ID);
#endif

// The session is re-created in every iteration; with CUDA enabled the
// run-to-run differences described above appear in this arrangement.
for (int jjjj = 0; jjjj < 10; ++jjjj) {
    Ort::Session session(env, model_path.c_str(), session_options);
    auto height = session.GetInputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[2];
    auto width  = session.GetInputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[3];
    // Load image
    cv::Mat in_mat = cv::imread(input_image.string());
    ...
}
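For comparison, a minimal sketch of the other arrangement described above, where the differences do not appear: the session is created once and only the inference runs inside the loop (same model_path, preprocessing, and Run call as in the elided part of the snippet).

// Sketch only: create the session once, outside the loop.
Ort::Session session(env, model_path.c_str(), session_options);
auto height = session.GetInputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[2];
auto width  = session.GetInputTypeInfo(0).GetTensorTypeAndShapeInfo().GetShape()[3];

for (int jjjj = 0; jjjj < 10; ++jjjj) {
    cv::Mat in_mat = cv::imread(input_image.string());
    // ... same preprocessing and session.Run(...) as in the elided code above ...
}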
Urgency
No response
Platform
Linux
OS Version
Docker: Ubuntu 20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.11.1-gpu-linux
ONNX Runtime API
C++
Architecture
X64
Execution Provider
Default CPU, CUDA
Execution Provider Library Version
CUDA 11.5, cuDNN 8.3