stan_csv_reader is sensitive to (crashes on) very small floats in sampling output
Summary:
stansummary crashes when run on a csv file that contains 1.5316e-322 in the accept_stat__ column of the csv file.
Description:
[Copied from comment below] The problem seems to be that the Boost lexical_cast function can only deal with normal double values; 1.5316e-322 is too small to be represented as a normal number (the smallest positive normal double is about 2.2e-308)---here are the double-precision numerical limits.
The Boost doc only suggests using std::stringstream for more control.
Reproducible Steps:
My cmdstan code runs without warnings or errors, but when I run stansummary on the resulting csv file I get the error:
prange% /usr/local/cmdstan-2.17.1/bin/stansummary hyper_samples_1.csv
libc++abi.dylib: terminating with uncaught exception of type boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_lexical_cast> >: bad lexical cast: source type value could not be interpreted as target
Abort
In order to trace the source of the error, I proceeded to delete sample-output lines from the csv file until I found the following offending line:
-27.7811,1.5316e-322,0.0547959,1,1,0,37.0581,0.718519,-1.14652,0.133214,2.21245,0.938323,-1.35562,-0.222546,0.970578,-0.274166,-1.11156,0.119766,-0.132146,-0.022578,0.229995,-0.0303289,-0.23872,0.0410686,0.0588429,-4.76369e-05,-0.18922,-18.4997
When I change the number 1.5316e-322 to 1.5316e-3, stansummary works again.
Run stansummary on the following csv file. I edited it down to the offending sample-output line:
# stan_version_major = 2
# stan_version_minor = 17
# stan_version_patch = 1
# model = hyper_model
# method = sample (Default)
# sample
# num_samples = 1000 (Default)
# num_warmup = 1000 (Default)
# save_warmup = 0 (Default)
# thin = 1 (Default)
# adapt
# engaged = 1 (Default)
# gamma = 0.050000000000000003 (Default)
# delta = 0.80000000000000004 (Default)
# kappa = 0.75 (Default)
# t0 = 10 (Default)
# init_buffer = 75 (Default)
# term_buffer = 50 (Default)
# window = 25 (Default)
# algorithm = hmc (Default)
# hmc
# engine = nuts (Default)
# nuts
# max_depth = 10 (Default)
# metric = diag_e (Default)
# metric_file = (Default)
# stepsize = 1 (Default)
# stepsize_jitter = 1
# id = 1
# data
# file = hyper_1.data.R
# init = hyper_1.init.R
# random
# seed = 3959268712
# output
# file = hyper_samples_1.csv
# diagnostic_file = (Default)
# refresh = 100 (Default)
lp__,accept_stat__,stepsize__,treedepth__,n_leapfrog__,divergent__,energy__,z.1
# Adaptation terminated
# Step size = 0.0547959
# Diagonal elements of inverse mass matrix:
# 0.713329, 1.08275, 0.596216, 0.607337, 0.289338, 0.159566, 0.477, 0.425896, 0.603512, 2.59539
-27.7811,1.5316e-322,0.0547959,1,1,0,37.0581,0.718519
-25.3671,0.86028,0.0547959,4,22,1,30.5591,0.690161
#
# Elapsed Time: 6.95684 seconds (Warm-up)
# 2.52359 seconds (Sampling)
# 9.48043 seconds (Total)
#
I added a second sampling-output line below the offending line so that stansummary will work once 1.5316e-322 is changed to 1.5316e-3.
Current Output:
prange% /usr/local/cmdstan-2.17.1/bin/stansummary hyper_samples_1.csv
libc++abi.dylib: terminating with uncaught exception of type boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::bad_lexical_cast> >: bad lexical cast: source type value could not be interpreted as target
Abort
Expected Output:
When the offending number in the sampling output is changed from 1.5316e-322 to 1.5316e-3, the output is
prange% /usr/local/cmdstan-2.17.1/bin/stansummary tmp
Inference for Stan model: hyper_model
1 chains: each with iter=(2); warmup=(0); thin=(1); 2 iterations saved.
Warmup took (7.0) seconds, 7.0 seconds total
Sampling took (2.5) seconds, 2.5 seconds total
                    Mean    MCSE  StdDev        5%     50%     95%  N_Eff  N_Eff/s  R_hat
lp__                 -27     1.2     1.7  -2.8e+01     -25     -25    2.0     0.79    nan
accept_stat__       0.43    0.43    0.61   1.5e-03    0.86    0.86    2.0     0.79    nan
stepsize__         0.055    0.00    0.00   5.5e-02   0.055   0.055    2.0     0.79    nan
treedepth__          2.5     1.5     2.1   1.0e+00     4.0     4.0    2.0     0.79    nan
n_leapfrog__          12      10      15   1.0e+00      22      22    2.0     0.79    nan
divergent__         0.50    0.50    0.71   0.0e+00     1.0     1.0    2.0     0.79    nan
energy__              34     3.2     4.6   3.1e+01      37      37    2.0     0.79    nan
z[1]                0.70   0.014   0.020   6.9e-01    0.72    0.72    2.0     0.79    nan
Samples were drawn using hmc with nuts.
For each parameter, N_Eff is a crude measure of effective sample size,
and R_hat is the potential scale reduction factor on split chains (at
convergence, R_hat=1).
Current Version:
v2.17.1
That's puzzling. What platform are you using? That value is OK for I/O with my 64-bit Clang, and I can't imagine we're reading these in any other way.
#include <iostream>
#include <sstream>

int main() {
  std::stringstream ss("1.5316e-322");
  double x;
  ss >> x;
  std::cout << "x = " << x << std::endl;
}
Running it gives
$ clang++ foo.cpp
$ ./a.out
x = 1.5316e-322
P.S. This may wind up being a CmdStan issue if the problem's in the outside layer, or a Stan issue if it's in one of the underlying services.
Bob,
I am using an iMac running the latest OS. I will be away from my computer until Monday, but is there a software test you can suggest for me to run at that time? Perhaps I can change some of the stansummary code?
Michael
Sent from my phone
The thing to do is always to take the case that is failing and create a small, reproducible example as a failing unit test.
Then fix the code so that it passes the test.
Stan's a big project with a lot of dev process at this point. If you want to dive in, here's the place to start:
https://github.com/stan-dev/stan/wiki/Developer-process-overview
- Bob
I think I've localized the problem. On line 316 of cmdstan-2.17.1/stan/src/stan/io/stan_csv_reader.hpp is the line
samples(row, col) = boost::lexical_cast<double>(...)
When I made a test program using this line, I get the same error.
Test program:
// test.cpp
#include <boost/lexical_cast.hpp>
#include <iostream>

int main() {
  // Throws boost::bad_lexical_cast for the subnormal value.
  double x = boost::lexical_cast<double>("1.5316e-322");
  std::cout << "x = " << x << std::endl;
  return 0;
}
I ran this on an iMac running OS 10.13.3 using the following g++ version:
prange% g++ --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Thanks for creating that minimal example and tracking down the source.
The problem seems to be that the Boost lexical_cast function (http://www.boost.org/doc/libs/1_66_0/doc/html/boost_lexical_cast/synopsis.html) can only deal with normal double values; 1e-322 is too small to be represented as a normal number---here are the double-precision numerical limits (https://en.wikipedia.org/wiki/Double-precision_floating-point_format#Double-precision_examples).
The Boost doc only suggests using std::stringstream for more control---as I showed above, that can handle the subnormal value (at least the clang++ on my Mac). I seem to remember using lexical_cast to get more control for some reason!
Here's a useful StackOverflow discussion---looks like a lot of this is going to be implementation dependent in the subnormal range, so I'm not sure there's going to be a general fix. But we might be able to go back to stringstream-based parsing.
Bob,
Since this issue is holding up a project I’m working on, I’ll switch my local installation to use stringstream instead of lexical_cast as a quick fix. I’ll report back if this produces any subsequent issues. Thanks.
Michael
Yes, if you can, that should be fine. If it works OK, would you mind submitting a patch for Stan itself? Or sharing your mods so that I can submit the patch?
Bob,
I've run thousands of cases today and the patch has worked flawlessly. I don't know how to submit a patch, so I appreciate your offer to submit it for me. Here it is:
On line 316 of cmdstan-2.17.1/stan/src/stan/io/stan_csv_reader.hpp, replace
samples(row, col) = boost::lexical_cast<double>(...)
with the equivalent std::stringstream-based conversion.
I'm sure there are lots of other places where lexical_cast should be replaced to gain more stability, but I've only replaced this one instance in order to fix my problem. I'll leave the larger design issue to the stan-dev team. Thanks for all your help. You really dug me out of a hole.
Michael