[Feature Request] USER-DPD fix_rx segfaults when input file is invalid
USER-DPD fix_rx segfaults when its input file (such as kinetics.dpdrx) is invalid (for example, if there is a stray line). Ideally it should error out gracefully and notify the user the file is invalid without seg-faulting.
==129793== Invalid read of size 1 ==129793== at 0x4088098: strlen (vg_replace_strmem.c:454) ==129793== by 0x1017C00B: LAMMPS_NS::FixRX::post_constructor() (fix_rx.cpp:319) ==129793== by 0x10449587: LAMMPS_NS::Modify::add_fix(int, char**, int) (modify.cpp:900) ==129793== by 0x10153083: fix (input.cpp:1559) ==129793== by 0x10153083: LAMMPS_NS::Input::execute_command() (input.cpp:797) ==129793== by 0x1015373B: LAMMPS_NS::Input::file() (input.cpp:244) ==129793== by 0x100233B3: main (main.cpp:47) ==129793== Address 0x0 is not stack'd, malloc'd or (recently) free'd
@timattox
@stanmoore1 similar things can happen with many, if not most files that LAMMPS is reading, e.g. potential files. LAMMPS also doesn't check in many places, if fread() or fscanf() fails. What is often worse is, that in some cases LAMMPS will read incorrect/incomplete data and continue, if the file format is not properly matched.
In this case I believe a one-line check for a NULL pointer would solve the problem. I just hate segfaults :)
Perhaps we could add file checking/similar software quality improvements to our list of long-term improvement goals for LAMMPS.
i hate segfaults, too. but i hate file formats, that are not self-descriptive even more. i think this is an item for an in-depth developer discussion at the next LAMMPS workshop. this not only affects file reading, but also argument parsing, and there is a significant potential to make LAMMPS more resilient and maintainable as well as simplifying a lot of code and making it more consistent. ahhhh... one can always dream...
@timattox and @stanmoore1 is this still a valid issue?
AFAIK, this is still an open issue, but we can't work on it until after April 15th.
@stanmoore1 similar things can happen with many, if not most files that LAMMPS is reading, e.g. potential files. LAMMPS also doesn't check in many places, if fread() or fscanf() fails.
I've been looking at fixing those sections up to get cleaner build logs, since they generate a lot of compiler warnings about unused return values. Right now I'm just assigning the return value to an used variable. I would be happy to do things The Right Way and handle cases where the retval isn't correct based on what ferror()/feof() say. Maybe wrap up the calls to a utility function.
Can't promise a time frame, based on the number of fread/fscanf calls I'm seeing, but happy to plug away as time permits.
@stanmoore1 similar things can happen with many, if not most files that LAMMPS is reading, e.g. potential files. LAMMPS also doesn't check in many places, if fread() or fscanf() fails.
I've been looking at fixing those sections up to get cleaner build logs, since they generate a lot of compiler warnings about unused return values. Right now I'm just assigning the return value to an used variable. I would be happy to do things The Right Way and handle cases where the retval isn't correct based on what ferror()/feof() say. Maybe wrap up the calls to a utility function.
Can't promise a time frame, based on the number of fread/fscanf calls I'm seeing, but happy to plug away as time permits.
@joeweaver thanks for your interest to help out. Please note that a lot of things have changed in the core LAMMPS code since this issue was posted and particularly since we switched to C++11 over a year ago and did some significant refactoring (which is still ongoing as time permits). We have a bunch of helpful utility functions and special purpose classes for redundant tasks like reading (potential) files or parsing text or values with validity checks (unlike atoi() or atof() and similar). This has been applied to a significant chunk of the core LAMMPS code and also parts of the packages.
Thanks for updating me on this, especially the pointer on the utility functions. I believe I'm still seeing some bare calls to fread and such on an up-to-date source tree. (Example, variable.cpp line 715). I'd be happy to fix these up using the util functions and submit PRs as I come across them.
Thanks for updating me on this, especially the pointer on the utility functions. I believe I'm still seeing some bare calls to fread and such on an up-to-date source tree. (Example, variable.cpp line 715). I'd be happy to fix these up using the util functions and submit PRs as I come across them.
That specific use in the Variable class does not need updating, since it is reading a file that was just written and great care is taken to make certain the file is always present when it needs to be. Before that step has a chance to fail, other parts of LAMMPS will have failed already.
The utils namespace functions should be mostly needed when reading (binary) restart files. For reading parameter or potential files, it is usually not worth the effort of updating those, since those should be refactored to use the (Potential)FileReader classes and then do the parsing with the ValueTokenizer class. If that is done well, then the amount of code required to read/parse a file will shrink substantially and the checks only need to be done in those classes and failures will be converted to exceptions. That also requires in some cases some code restructuring so that the reading is only done on MPI rank 0 and the parsed data then communicated (rather than broadcasting each line and parsing it on every MPI rank).
That make sense to me. It leaves a few compiler warnings during builds, but not the end of the world.