
Pragmatic solution to parameter changes in standalone mode

Open mstimberg opened this issue 3 years ago • 6 comments

We have been discussing for a long time how we can make things like parameter explorations more convenient with standalone mode (see discussions about "encapsulation mode" https://github.com/brian-team/brian2/issues/251). I recently thought about this a bit more, and came up with an intermediate solution that I'm really happy about. It has the advantage that 1) it already works with this PR and 2) it requires minimal changes to code and devices like brian2cuda. The basic idea is:

  1. To use command line arguments in the main binary to set variable values. A value can either be a scalar or the name of a file containing the values (which makes it possible to specify whole arrays, something that would otherwise not be very convenient via the command line).
  2. To make the results directory configurable via command line arguments as well.

The difference between 1) and what we discussed earlier (#251) is that I enabled this for arbitrary variables of the model – this removes the need for any discussion of syntax to define which variables should be changeable by a user (a Parameter class, etc.). The addition of 2) makes it immediately possible to use this for parallel simulations. The syntax in device.run (see the examples below) still needs to be improved – transforming the initialization into command line arguments is of course not something the user should have to do by hand. Oh, and don't look too closely at the C++ code for now :)

The main thing that is missing functionality-wise for now is the ability to set arbitrary constants defined outside of equations. Your towards_encapsulation branch already had a change where, instead of "freezing" the constants, it would put them as variables into a file. Applying that change here would make implementing this functionality almost trivial, but I'd like to leave it out of the very first version of this feature, since it requires bigger changes in devices like brian2cuda. You can work around this limitation by including a variable like tau : second (shared, constant) in your model and setting this instead of an external constant.

Examples

Here are a few things that are possible with these changes:

Run a simulation several times to get new random number instantiations, without recompiling the code (this was already possible before, just mentioning it here for completeness):

# "simple case" example which runs a model several times to
# explore the effect of randomness (initialization here)
from brian2 import *

set_device('cpp_standalone')
tau = 10*ms
eqs = 'dv/dt = (1-v)/tau : 1'
G = NeuronGroup(10, eqs, method='exact')
G.v = 'rand()'
mon = StateMonitor(G, 'v', record=True)

# Normal run
run(100*ms)
results = [mon.v[:]]

for _ in range(8):
    # Do 8 more runs
    device.run()
    results.append(mon.v[:])

To run this in parallel, you would previously have had to change the project directory for each process, which meant that each process would compile everything again. With this PR, you can compile the code once and then run it in different result directories (#1240):

set_device('cpp_standalone', build_on_run=False)
# ... same model as before

run(100*ms)
device.build(run=False)  # compile once

def do_run(run_number):
    device.run(results_directory=f'results_{run_number}')
    return mon.t[:], mon.v[:]

import multiprocessing
import numpy as np

p = multiprocessing.Pool()
results = p.map(do_run, np.arange(9))

And now the main change: setting an initial value via the command line. This will set the tau value (now included in the model as discussed above):

eqs = '''dv/dt = (1-v)/tau : 1
tau : second (constant, shared)
'''
# ...
# Normal run
run(100*ms)

results = [mon.v[:]]

for tau in np.linspace(5, 50, 8)*ms:
    # Do 8 more runs
    device.run(run_args=[f"neurongroup.tau={float(tau)}"])
    results.append(mon.v[:])

And by going via a file, you can set the initial values of v to whatever you like:

import os

for index in range(8):
    # Save a file with the initial values for this run
    fname = f"v_values_{index}.dat"
    (np.random.randn(10)*0.05 + index*0.1).tofile(os.path.join(device.project_dir, fname))
    device.run(run_args=[f"neurongroup.v={fname}"])
    results.append(mon.v[:])
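
As a side note on the file format: tofile writes the raw float64 bytes of the array without any header, which is presumably what lets the standalone binary read the values back directly. A minimal sketch of that round trip in plain NumPy (no Brian needed), with an illustrative file name:

```python
import numpy as np

# Values for 10 neurons, written the same way as in the loop above:
# raw float64 bytes, no header or shape information.
values = np.random.randn(10) * 0.05 + 0.1
values.tofile("v_values_0.dat")

# Reading them back as a flat float64 array recovers the data exactly;
# this mirrors what the binary would do with the file name it receives.
loaded = np.fromfile("v_values_0.dat", dtype=np.float64)
assert np.array_equal(values, loaded)
```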

The latter examples could also run things in parallel by combining this with the results_directory argument.

Again, I am not claiming that we should keep the run_args syntax (or make the user store the values to a file), but the great thing about this is that it already works, and that it will work with minimal changes for brian2cuda, brian2genn, etc., since they leave the loading/saving of values from/to disk to Brian. It should also be noted that this already provides a C++ API to change parameters for a run (admittedly an ugly one): you could link main.cpp and call something like main(2, {"mycall", "neurongroup.tau=0.05"}); from your own C++ code...
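
To make the argument format concrete: each run_args entry is a group.variable=value string, where the value is either a scalar or a file name. A rough Python illustration of how such an argument decomposes (just a sketch of the convention, not Brian's actual C++ parsing code):

```python
def parse_run_arg(arg):
    # Split "group.variable=value" into its parts; the value is taken
    # as a scalar if it parses as a float, otherwise it is assumed to
    # be the name of a file containing the array values.
    name, _, value = arg.partition("=")
    group, _, variable = name.partition(".")
    try:
        return group, variable, float(value)
    except ValueError:
        return group, variable, value

assert parse_run_arg("neurongroup.tau=0.005") == ("neurongroup", "tau", 0.005)
assert parse_run_arg("neurongroup.v=v_values_0.dat") == ("neurongroup", "v", "v_values_0.dat")
```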

I feel like this feature (with a syntax that replaces setting run_args manually) would make a lot of people happy, and is actually feasible with minimal effort. What do you think, @thesamovar ?

mstimberg avatar Jul 27 '22 16:07 mstimberg

If this works, it is already a lot better than what I did in the towards_encapsulation branch, which is hopelessly out of date and presumably no longer rescuable.

We probably want to think about the syntax a bit, but that's fine. There's also the question of when these command line arguments get run, which might introduce some issues?

There's still no solution, I think, to the problem of making the number of neurons a variable that you can change, right? I think that needs a more fundamental change.

But I think this is worth doing. We want to have this functionality and if we wait for the perfect solution we'll never have it.

thesamovar avatar Jul 27 '22 16:07 thesamovar

I'm away for August so if we want to discuss syntax, can we do that tomorrow/Friday?

thesamovar avatar Jul 27 '22 16:07 thesamovar

Crazy, the real-world human use this gets – I mean, human test subjects!

Bitloader22 avatar Jul 27 '22 19:07 Bitloader22

There's also the question of when these command line arguments get run, which might introduce some issues?

Currently, they set the variables just before the run starts, which I think is the best place, since it means they override anything that was set before.

There's still no solution, I think, to the problem of making the number of neurons a variable that you can change, right? I think that needs a more fundamental change.

You are right, this would be a more fundamental change and is not supported with the solution I present here. We hardcode constants like this as unchangeable all over the code, so I don't see any quick solution.

I'm away for August so if we want to discuss syntax, can we do that tomorrow/Friday?

Happy to discuss here (or over video or something) any time this afternoon or tomorrow. But I will be away for August as well, so it won't change much :blush:

mstimberg avatar Jul 28 '22 07:07 mstimberg

We could have a bit of syntax that allows you to choose where the values get set. This allows more flexibility, because the code can then compute derived values from them, e.g.

set_values_here()
G.tau_inh = 10*G.tau_exc
...
run(tau_exc=5*ms)

Let's discuss more in September if you're also away in August.

thesamovar avatar Jul 28 '22 13:07 thesamovar

We could have a bit of syntax that allows you to choose where the values get set. This allows more flexibility, because the code can then compute derived values from them.

Yes, worth discussing! I wonder whether it is worth the complication, given that in most cases you could do the same by simply setting dependencies explicitly, e.g. in your example

run(tau_exc=5*ms, tau_inh=50*ms)

A lightweight solution could also be to include this information in the run call and reuse the existing code slots (https://brian2.readthedocs.io/en/stable/user/computation.html#custom-code-injection). The default would be before_network_run, but you could also use after_start to set things before the usual initialization. Maybe in that case it would be good to have additional code slots, actually, since most of the existing ones are not very useful here (e.g. before_start would set values before they are initialized with zeros, and everything from after_network_run on wouldn't affect the simulation).

Let's discuss more in September if you're also away in August.

Sounds good, let's find a time over email.

mstimberg avatar Jul 28 '22 14:07 mstimberg

@thesamovar I think this is more or less ready to go – not quite mergeable yet, since the C++ code is still a bit convoluted, but let's make sure that we agree on syntax before I finalize things.

This now supports all the use cases I mentioned above: repeatedly running a simulation to get different random values, setting variables to scalars, and setting variables with arrays. Together with the new option to choose the results directory, this can also be used with several simulations running in parallel.

The current syntax does not use direct keyword arguments to device.run (e.g. device.run(tau_exc=5*ms)), but instead reuses the existing run_args. I don't think we can go with simple keyword arguments, since we have to specify the group as well. The syntax is not quite as ugly as above anymore, though: instead of writing device.run(run_args=['neurongroup.tau_exc=0.005']) (which would actually still work), you'd write device.run(run_args={G.tau_exc: 5*ms}).
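
Presumably the dictionary form is just sugar for the string form: each {variable: value} pair becomes the group.variable=value argument shown earlier, with the value converted to a plain number in SI base units. A stand-in sketch of that conversion (FakeVariable is hypothetical, not Brian's actual VariableView class):

```python
class FakeVariable:
    # Hypothetical stand-in for a Brian variable reference like G.tau_exc
    def __init__(self, group_name, var_name):
        self.group_name = group_name
        self.var_name = var_name

def to_run_args(assignments):
    # Build "group.variable=value" strings from {variable: value} pairs,
    # assuming values are already plain SI numbers (e.g. 5*ms -> 0.005).
    return [f"{var.group_name}.{var.var_name}={value}"
            for var, value in assignments.items()]

tau_exc = FakeVariable("neurongroup", "tau_exc")
assert to_run_args({tau_exc: 0.005}) == ["neurongroup.tau_exc=0.005"]
```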

As I argued earlier, I like such a general mechanism that allows changing parameters without needing to change the existing source code (e.g. for brian2wasm). But I certainly also see your point about Brian's "be explicit" principle, i.e. making it clear in the model description which parameters are supposed to be changed later. Maybe in the long run there would be room for both approaches? My feeling is that it would be good to release some mechanism soon, and then see what people use it for...

I triggered a documentation build for this branch for easier reading of the new docs. The new syntax is explained here: https://brian2.readthedocs.io/en/standalone_parameter_change/user/computation.html#multiple-full-simulation-runs Some notes on the implementation are here: https://brian2.readthedocs.io/en/standalone_parameter_change/developer/standalone.html#command-line-arguments

mstimberg avatar Oct 07 '22 15:10 mstimberg

It'll be a little while before I have time to do this unfortunately. But it's going on my priority to-do list.

thesamovar avatar Oct 07 '22 15:10 thesamovar

No worries. I've also used the opportunity to ask for feedback on the discussion forum.

mstimberg avatar Oct 10 '22 10:10 mstimberg

It's been a while (almost a year!). This is now ready to merge from my side – I'm quite happy with it and have already started using it myself (in particular for brian2wasm, but that's another topic). Not sure whether anyone will have time to look at the code in the near future, but at least some general feedback on the user-interface side would be great! Following the discussion on the discourse forum, I actually added two more features: you can now also set synaptic variables (which can effectively replace store/restore in some train/test scenarios), and the values in a TimedArray. I think it is reasonably well documented, and I also added/adapted a few examples to use this feature (potentially, tutorial 3 could also introduce it?). A few links:

mstimberg avatar Jul 25 '23 14:07 mstimberg

I am going to ignore our usual procedure here and will merge this now as-is, without waiting for review (which would probably take a while to come in, with summer holidays and all that). This is a major addition, but I am quite happy with it, and I'd say it is fairly well tested and documented. One important reason for merging it now is that I'd like to get users to use/test this feature, and that is easier if they can install the package from testpypi. But it also simply becomes hard to keep such long-running PRs updated, which adds significant work. Of course, the fact that I am merging it now doesn't mean that you shouldn't have a closer look at it – please do! Consider this me making the case for post-publication review (or EAFP) :wink:

mstimberg avatar Aug 01 '23 13:08 mstimberg