JumpProcesses.jl Spatial SSAs progress and TODOs

This summer I am working on implementing spatial SSAs. There are currently two issues open about it: https://github.com/SciML/DiffEqJump.jl/issues/107 and https://github.com/SciML/DiffEqJump.jl/issues/130. The first PR https://github.com/SciML/DiffEqJump.jl/pull/183 implements the NSM, adds two tests and some utility functions, essentially setting up the interface for spatial SSAs. LightGraphs are supported as well. There are a few TODOs that should not be forgotten:

[x] The citation must be added to the just-implemented NSM.
[x] Store numspecies in aggregation?
[x] More comprehensive tests for spatial SSAs are needed. The two current tests only test spatial systems on 1D grids.
[x] The CartesianGrid struct needs to be optimized -- probably by storing the number of neighbors for each site so num_neighbors is O(1) and making neighbors return an iterator instead of allocating an array.
[x] Make CartesianGrid use CartesianIndices instead of doing all the calculations redundantly.
[x] An interface for sampling a neighbor to hop to can be a nice but not necessary perk. For example, using https://docs.julialang.org/en/v1/stdlib/Random/#Random.Sampler.
[x] Various modes of hopping constants need to be supported. In particular, we should support hopping rates of forms D_{s,i}, D_s * L_{i,j}, L_{s,i,j}, D_{s,i}*L_{i,j}, where D_s is the diffusivity constant of species s and L is the discrete Laplacian matrix. Only the first of these is implemented in the PR. This change will require tweaking the code in nsm.jl.
[ ] The escaping boundary condition for CartesianGrid needs to be supported. This can be done with a special sink site: it neighbors all boundary sites and does not get updated if it is the target site of a hop. This change will require tweaking the code in nsm.jl.
[x] Constant-rate jumps should be supported. Currently, only mass-action jumps are allowed. This change will require tweaking the code in nsm.jl.
[x] Implement the flattening SSA: turn all spatial hops into reactions and simulate using a non-spatial SSA. Can modify https://github.com/SciML/DiffEqJump.jl/pull/131 to do that.
[x] Accept variable rx rates in flattened SSA.
[x] Optimize allocations in HopRatesGeneral: switch to using cumulative sums to sample neighbor.
[x] implement spatially varying massaction jump rates, including flattening.
[x] add docstrings on all structs and functions explaining what things are.
[x] make a tutorial for setting up a spatial problem in different ways (cartesian grid, arbitrary graph) and solving with different solvers (NSM, DirectCRonDirect, flattening). For example, use A+B <--> C. Make a PR to SciMLTutorials.jl.
[x] setup a comprehensive benchmarking suite for spatial solvers and make a PR to SciMLBenchmarks.jl
[x] ~add jump counter to SSAStepper to count jumps.~ This can be easily done with a callback as in the benchmark.
[ ] Choose a default SSA (probably DirectCRRSSA). Can also do it for non-spatial solvers.
[ ] compare with other software like urdme to see how the spatial solvers in DiffEqJump stack up.
[ ] Optimize CartesianGrid. It should be faster than a Graph.
[ ] Compute the hopping rates out of the site for each site at the aggregation step of the simulation (instead of essentially recomputing it every time we evaluate the hop rate, e.g. here https://github.com/SciML/DiffEqJump.jl/blob/master/src/spatial/hop_rates.jl#L115). Take, for example, hopping rates of form Dsi. The quantity hop_rates.hopping_constants[species,site]*outdegree(spatial_system, site) is always the same for any given site. If we precompute this quantity we won't need to store the number of neighbors for each site in the CartesianGrid struct resulting in less memory used. And we would also avoid one multiplication and one pointer access for each update of hopping rates. [EDIT]: maybe store the outdegrees as a dict, where all the interior sites do not appear. This would make the nums_neighbors array O(n^2) instead of O(n^3) where n is linear dimension of the grid.
[x] Add more tests to thoroughly test the spatial functionality.
[x] Add more spatial solvers. DirectCR-RSSA.
[ ] (Longer-term) Update the tutorial https://tutorials.sciml.ai/html/jumps/spatial.html. Add more known simple examples with pretty pictures. More visual examples, comparing stochastic solution to deterministic solution. Maybe add a spatial SIR example. Maybe use Catalyst for ease of understanding.
[ ] (Longer-term) Updating the DifferentialEquations.jl https://diffeq.sciml.ai/stable/tutorials/discrete_stochastic_example/ with some spatial examples.
[ ] (Longer-term) Add benchmarks of models from other papers like https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005865.

@isaacsas and @ChrisRackauckas might have more TODOs to add to this list.

Jul 08 '21 00:07 Vilin97

Additional features to plan for:

[x] Spatially varying reaction rates (modify MassActionJumps?)
[ ] Scaling reaction rates by site-dependent factor (e.g. area/volume).
[x] How to handle spatial ConstantRateJumps? Assume they take a site index too as input?

Optimizations to try after benchmarking:

@inbounds throughout.

Jul 08 '21 14:07 isaacsas

Some more stuff:

[ ] Spatially distributed reactions and product placement (nearest neighbors, but also neighbors within some stencil)
[ ] Spatial SSA that follows individual particles or just non-empty voxels (often faster for systems with less particles than spatial sites).
[ ] Tooling for generating hopping rates by discretization for diffusion, advection-diffusion, and drift-diffusion on structured and unstructured grids.

Aug 06 '21 16:08 isaacsas

I am starting to think about SpatialConstantRateJump, analogous to SpatialMassActionJump. We need rate and affect! from each user. I have some questions:

Should affect! be allowed to affect other cells? Or only the cell where the reaction happened. I guess there is no real way to stop the user from passing a function that modifies other cells, because affect! takes an integrator as input, and I don't think we should change that interface.
We expect the user to supply the dependency graph if using constant rate jumps, right?
Should rate take the full state vector as input or only the state at the current cell? I am thinking the latter. In that case the function signature is the same rate(u,p,t), where u is the vector of species in the current cell. If we go with the former, than the signature would be rate(u,p,t,i), where u is now the species-site matrix, and i is the site index.

@isaacsas , what do you think?

Sep 02 '23 01:09 Vilin97

We need to provide the location or there would be no way to have a rate that is spatially varying. For example, zero at some sites and non zero at others.

Sep 02 '23 01:09 isaacsas