Addresses several issues with the opencl tests:

Model object does not correctly identify whether model was compiled with stan_opencl. Therefore, models were not recompiled within the tests when recompilation is needed. Resolved this by supplying a different exe_file for each tests so recompilation will always happen.
Tests referenced a metadata attribute that was not defined. I've added checks for opencl related attributes that are set when opencl is functioning properly.
Version test used a data input that was never created.

Submission Checklist

[X] Run unit tests
[X] Declare copyright holder and agree to license (see below)

Summary

Please describe the purpose of the pull request.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Max Planck Institute of Animal Behavior

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

Aug 08 '24 11:08 katrinabrock

It sounds like the actual issue here is that cmdstanr does not correctly identify whether model was compiled with stan_opencl, should we be changing that inside of the model class instead of changing this test?

Aug 08 '24 20:08 SteveBronder

@SteveBronder did you see this draft PR that's related to this issue? https://github.com/stan-dev/cmdstanr/pull/1023 Perhaps these should be combined?

Aug 14 '24 17:08 jgabry

Yes I think it would be good to merge these two PRs together into one

Aug 15 '24 14:08 SteveBronder

Sounds good, I'll polish that one up and combine them.

Any idea if the opencl tests are actually running in CI (as in is CMDSTANR_OPENCL_TESTS set? If not, why not?

Also trying to get hold of a windows machine so I can repo the windows unit test failures. If there's any way to run these on ubuntu with docker or similar, would love to hear.

Aug 20 '24 11:08 katrinabrock

guessing the test failures are due to this: https://github.com/r-lib/actions/issues/217

Aug 20 '24 12:08 katrinabrock

Sorry for the multiple pings in a day....but I found another related deepish problem.

This test seems to be erroneously passing in the master branch: https://github.com/stan-dev/cmdstanr/blob/master/tests/testthat/test-threads.R#L162-L166 As in, both STAN_THREADS is not successfully getting set to false (due to this issue: https://github.com/stan-dev/cmdstan/issues/1293), and mod$sample is not successfully checking whether or not the binary has that flag set....so the test is passing even though it shouldn't be.

MRE (from master branch):

> mod <- cmdstan_model(
+   file.path(cmdstan_path(), "examples", "bernoulli", "bernoulli.stan"),
+   cpp_options = list(stan_threads = FALSE),
+   force_recompile = TRUE
+ )
Compiling Stan program...
> mod$cpp_options()$STAN_THREADS
[1] TRUE

The fix is that somewhere we need to convert falsy values at the R level to completely unset the ENV variable at the bash/cpp level. I'm not sure exactly where in the code it makes the most sense to do this, but I'll take a stab at it. Alternatively, we could inform users that they need to set cpp_options = list(stan_threads = '') or cpp_options = list(stan_threads = NULL).

Aug 20 '24 16:08 katrinabrock

The fix is that somewhere we need to convert falsy values at the R level to completely unset the ENV variable at the bash/cpp level

No we shouldn't do anything here, since this is behaviour in Cmdstan - not cmdstanr. The options a user specifies should match those that are then set in make/local, otherwise a user would get different behaviour from the same options depending on the interface

Aug 20 '24 16:08 andrjohns

ok, I'll update the test and the docs then. That's much easier. :-)

Aug 21 '24 05:08 katrinabrock

Sorry for the wall of text, but I need some input before moving forward.

~I believe this PR working.~ [EDIT: Nope, there are still test failures, but that doesn't change the rest of this comment.]

It accomplishes:

opencl test fixes described in original description
fixes #765 (makes cmdrstan check compile args accurately)
makes behavior, warnings, and docs more consistent with cmdstan's interpretation of flags

Does not solve

No automatic recompile when provided cpp_options don't match binary

Looking for input on two topics:

1. What should the behavior be when a model object is from an already compiled model that was compiled with different args than provided in the `cmdstan_model` function?

Some Options:

Silently do nothing at creation, complain or fail at sampling (current behavior)
Warn at creation, keep current behavior at sampling
Automatically recompile at creation

More context: To me, auto-recompile makes the most sense from a user experience perspective. However, this is not the current behavior of cmdstan itself. Currently, cmdstan considers a model up to date even if build args have changed. Not sure if this is a bug or expected behavior that is not explicitly documented. Most relevant doc I could find.

2. How should we standardize storage and checking of cpp_options/build args/compile info? Relatedly, what should be the output of `mod$cpp_options()`?

There are two aspects at issue:

should the names be uppercase (as passed to cmdstan) or lowercase (consistent with R style)?
when the flag is not set, should the value be NULL/empty/absent (as passed to cmdstan) or FALSE/false (as retrieved from cmdstan model info)?

Some Options:

The status-quo is definitely bad. It is inconsistent on whether the names (keys) are uppercase or lowercase, leading to inaccurate check results.
The version I have in this PR currently uses lowercase (per this comment: https://github.com/stan-dev/cmdstanr/issues/765#issuecomment-2274393175) and FALSE (consistent with the current output of model_compile_info. Note: this changes the user-facing output of user-facing mod$cpp_options().
Standardizing on uppercase would change the user-facing output of mod$cpp_options() less, (but still some in certain cases).
We could set both uppercase and lowercase for maximum backward compatibility.
An imo cleaner, but bigger change would be to consider cpp_options and model_compile_info as more separate, but related concepts. cpp_options being the R/pre-compile version, model_compile_info being the post-compile version. We could compare whether they are consistent with one another, but not explicitly assign one to the other as done currently here: https://github.com/stan-dev/cmdstanr/blob/8b72eacff40e630547484fef438ee2699f62d456/R/model.R#L275. This proposal would mean adding a mod$compile_info() method based on the binary associated with the model, and leave in place mod$cpp_options() based on the arguments the user provides. Then we could complain, error out, recompile, or what have you at various points if they're inconsistent with one another.

Aug 21 '24 14:08 katrinabrock

One more wrinkle...prior to cmdstan v2.27.0, this "model info" does not exist. In that case, if the binary is already compiled, we have know way of knowing what flags were used.

For that matter...I think we are assuming throughout that the version of cmdstan the user currently has configured is the same as the version the binaries were compiled with. Currently, from what I can tell, we are not checking this assumption at all. We could check using model info when available, but before 2.27, all bets are off.

I'll have to think a bit about a clean way to resolve this. I'd love to hear if others have ideas.

Aug 22 '24 14:08 katrinabrock

@andrjohns I'm a bit swamped with work at the moment and haven't had a chance to look into this yet. Do you by chance have time to comment on this?

Aug 22 '24 19:08 jgabry

@jgabry @andrjohns I have a version I'm pretty happy with now. It stores what the user requested as cpp options and what the binary has configured separately, and (unless compile=FALSE) re compiles if there is a mismatch. I ran the CI on my fork and only the windows tests are failing. If someone on the team is able to give some feedback, especially on the change in functionality, that would be great! I'd love to get this in mergeable state. In any case, I will just use this version for my own purposes.

Sep 12 '24 07:09 katrinabrock

Hi, @andrjohns, @SteveBronder, and @jgabry: @katrinabrock showed up at the Stan meeting this week. She's working on this PR. She says she can fix the Windows issue, but before doing that, needs feedback on whether her changes are targeting what you think the behavior should be.

@katrinabrock: If you could remind everyone which specific things you need answered, that'd help.

@mitzimorris is going to take a first pass at working on requirements with @katrinabrock, but then there will probably be some questions on the final R integration.

Sep 26 '24 15:09 bob-carpenter

@mitzimorris @SteveBronder Here's my explanation of this change as promised:

Initial Symptom

In the status quo, when a model object is creating with an existing exe_file, the build args for this binary are essentially ignored (even though they are collected with the info command). In this case, when sample (or similar) is run, it does arg checking with inaccurate information about whether STAN_OPENCL and STAN_THREADS are set. This results in both false-positive and false-negative erroring and warning. (Error when everything is fine and no error when something is wrong.) Simple example here: https://github.com/katrinabrock/cmdstanr/commit/76973789c5f85031ffca4fce028d69db9609f9b8#diff-f9bf409eca909f7147a92f3deaf3fdeb15bc4590ef2e6ad46caf9bfedf863160R127 this testcase fails in the status quo.

My Changes

User-facing Changes in `initialize`and `compile` logic

If an exe_file exists when initialize is called and a recompile does not occur, cmdstanr uses the info command to get as much info as possible about the version and cpp options used to create this binary. When sample and other methods are subsequently run, it uses this info to determine if the args provided to sample (or sister method) are compatible with cpp options that were provided at compile time. (This is the main thing that was not working before.)
If an exe_file exists when initialize is called and the user provided cpp_options that do not match the output of info, a recompile is triggered. (Unless compile=FALSE, in that case, there is just a warning).
If the user provides cpp_options to initialize or compile, they are "remembered" for subsequent compilation/recompilations. For example, in my version cmdstan_model(..., cpp_options=SOMETHING) and cmdstan_model(..., cpp_options=SOMETHING, compile=FALSE); mod$compile() have the same outcome. In master they will have different outcomes: SOMETHING would be used in the first snippet, and ignored in the second.

Other User-facing Changes

For error and warning logic that involves checking cmdstan version. cmdstanr no longer assumes that the version it has currently configured is the same as the version the model was compiled with. Instead, it uses the output of info if available. If info is unavailable, it uses the version currently configured, but in that case only generates warnings, not errors.
- This change is only applied to cpp_option related functions.
mod$cpp_options() give a deprecation warning as it was a mixture between user provided cpp options and the output of info. I attempted to leave the behavior of this function unchanged. Did not thoroughly test the unchanged-ness.
mod$exe_info() gives the data from the info command as an R object.
- With update=TRUE the command is re-run before returning. The user should only need to explicitly add update=TRUE if they re-compiled outside of that model object (e.g. by running cmdstan directly).
mod$precompiled_options() gives the options that will be used if compile() method is run without explicitly passing cpp_options. If the user has provided cpp_options in past runs of initialize or compile, it will be those. names are standardized to lowercase. Values are unchanged.
If the user provides FALSE on an option we know is a flag, warn them that this will result in the flag being set. (e.g. list(opencl=FALSE) turns opencl ON).

Developer-facing changes to CmdStanModel

Previously private$precompile_cpp_options_ only held the user-specified options until a R-triggered compile takes place, then it would be set to null. Now, we keep this information around whether or these options have been applied to the binary.
Previously private$exe_file_ was only set if the exe_file exists (either passed by the user or in an R-triggered compile). Now, if the exe_file does not exist, this variable will still be set to the path that we think it will be written to.
private$exe_info_ new attribute separate from cpp_options_ and precompile_cpp_options_. Holds info gleaned from cmdstan's post-compile info command.

Test related changes

added with_mock_cli, expect_mock_compile, and expect_no_mock_compile that allow developers to test the cmdstanr recompile logic that with developer-defined outputs of cmdstan. This both enables a cleaner separation between cmdstan logic and cmdstanr logic and also allows tests of recompile logic to run much faster.
since logic around stan features now uses model-specific stan version where possible, fake_cmdstan_version can now update this model-specific version as well as the package-level version
For a bunch of tests that are not intended to test recompile logic, I added recompile=TRUE. These tests are intended to test something that happens after compilatio. Sometimes (even with my changes) a binary is leftover from a previous test and a recompile was not properly triggered. This resulted in a mismatch between the binary and the model object.

Sep 27 '24 10:09 katrinabrock

Thank you! These changes totally make sense now and handle a lot of the issues we have from dealing with a make build system. Monday I will give this a harder read through and have some Qs

Sep 27 '24 21:09 SteveBronder

Apologies for the extended delay @katrinabrock! I'm catching back up on Stan dev this week, and have better access to an OpenCL system now. I'll put aside some time to review on Wednesday

Sep 30 '24 15:09 andrjohns

@katrinabrock hi my apologies for the delay. Looking at the tests it looks like the new ones for the compile options are failing on windows?

Oct 16 '24 21:10 SteveBronder

Hi @SteveBronder,

Thanks for taking a look. That's correct. As mentioned in @bob-carpenter's comment above, right now, I'm not looking for approval of the code. I'd like to make sure that the functionality I'm trying to implement is the functionality desired by the core dev team.

Does everything I've listed in the comment above from three weeks ago seem like the right set of changes?

If so, I'll go ahead and diagnosis and resolve the windows issue. If you would like me to make a different set of logic changes, I'd like to settle on what the logic should be and implement that before worrying about the cross platform aspect.

Oct 17 '24 10:10 katrinabrock

Thanks, @katrinabrock, that sounds totally reasonable.

Oct 21 '24 14:10 bob-carpenter

@SteveBronder when you have a chance do you mind taking another look, based on @katrinabrock's most recent comment. I think you're much more familiar with all these CmdStan C++ options than I am. It sounded like @katrinabrock was looking for confirmation that the outlined approach in https://github.com/stan-dev/cmdstanr/pull/1022#issuecomment-2379006621 is reasonable. It seems reasonable to me but it would be good to get either you (Steve) or @andrjohns to confirm too.

Nov 08 '24 18:11 jgabry

I think @mitzimorris could also look over the logic and see if there are any concerns since we would likely want to have cmdstanpy and cmdstanr in sync.

Nov 10 '24 10:11 katrinabrock

I think the logic is sound, w/r/t a compiled model.

Re CmdStanPy, OpenCL support hasn't been implemented - see discussion here: https://discourse.mc-stan.org/t/cmdstanpy-with-gpu-opencl/28064/1

CmdStanPy's compile function recognizes cpp_options, but there are no corresponding arguments in the sample method for the opencl command line arguments - cf https://mc-stan.org/docs/cmdstan-guide/parallelization.html#running-2

Nov 14 '24 17:11 mitzimorris

Thanks @mitzimorris for checking the logic. And thank you @katrinabrock for working on this!! It's great to have new contributors.

Nov 14 '24 23:11 jgabry

@mitzimorris Thanks for the input!

@jgabry Between you and mitzi, do you think this is enough approval to go forward with this logic? Or do we still need @SteveBronder or @andrjohns to look at it?

Nov 18 '24 12:11 katrinabrock

I trust @mitzimorris' judgement on this, so I think it's ok to proceed.

It would be great if @SteveBronder can help review the PR when it's ready since he knows a lot more about these particular aspects of CmdStan than I do. @andrjohns would also be great if he has the time (I know he's been busy lately).

Thanks again @katrinabrock for working on this!

Nov 18 '24 17:11 jgabry

Apologies for my slow review! I have a few design questions below. But overall this is a very nice fix for the issue. I'll continue with a bit of review tomorrow. I still need to read over a lot of the tests

Thanks @SteveBronder, much appreciated! And thanks in advance for continuing tomorrow. I responded to a few of the comments that were either directed at me or that I knew the answer to, I'll let @katrinabrock address the others.

But overall this is a very nice fix for the issue.

Great, this is definitely something I trust your judgment on more than mine!

Nov 18 '24 21:11 jgabry

@SteveBronder Thanks for the review. I responded to the comments where I had something to add from memory. Most of them either I will update the code based on your suggestion or I need to look a little closer and rerun the code to properly answer you question. It will probably be a few weeks before I can devote a chunk of time to updating everything.

With the mocks, I think it will make sense if you play around with them. They are doing what I want, but certainly can be designed better.

Nov 19 '24 10:11 katrinabrock

Most of them either I will update the code based on your suggestion or I need to look a little closer and rerun the code to properly answer you question.

Sounds good. Thanks again for working on this @katrinabrock!

Nov 19 '24 20:11 jgabry

All good and thank you for the PR! I should have time Friday to mess with the test issues

Nov 20 '24 19:11 SteveBronder

@SteveBronder @jgabry I think I've addressed everything from @SteveBronder's last review. Based on his last comment, I think that review was partial and he wanted to look at the tests a bit more.

Based on the comments thus far, seems like the overall approach is sound so it will be worthwhile to fix the windows issue. I haven't done that yet, but I plan to.

Dec 18 '24 15:12 katrinabrock

Improve Handling of `cpp_options`

Submission Checklist

Summary

Copyright and Licensing

1. What should the behavior be when a model object is from an already compiled model that was compiled with different args than provided in the cmdstan_model function?

2. How should we standardize storage and checking of cpp_options/build args/compile info? Relatedly, what should be the output of mod$cpp_options()?

Initial Symptom

My Changes

User-facing Changes in initializeand compile logic

Other User-facing Changes

Developer-facing changes to CmdStanModel

Test related changes

1. What should the behavior be when a model object is from an already compiled model that was compiled with different args than provided in the `cmdstan_model` function?

2. How should we standardize storage and checking of cpp_options/build args/compile info? Relatedly, what should be the output of `mod$cpp_options()`?

User-facing Changes in `initialize`and `compile` logic