Spatial SSAs progress and TODOs
This summer I am working on implementing spatial SSAs. There are currently two issues open about it: https://github.com/SciML/DiffEqJump.jl/issues/107 and https://github.com/SciML/DiffEqJump.jl/issues/130. The first PR https://github.com/SciML/DiffEqJump.jl/pull/183 implements the NSM, adds two tests and some utility functions, essentially setting up the interface for spatial SSAs. LightGraphs are supported as well. There are a few TODOs that should not be forgotten:
- [x] The citation must be added to the just-implemented NSM.
- [x] Store numspecies in aggregation?
- [x] More comprehensive tests for spatial SSAs are needed. The two current tests only test spatial systems on 1D grids.
- [x] The
CartesianGridstruct needs to be optimized -- probably by storing the number of neighbors for each site sonum_neighborsis O(1) and makingneighborsreturn an iterator instead of allocating an array. - [x] Make
CartesianGriduseCartesianIndicesinstead of doing all the calculations redundantly. - [x] An interface for sampling a neighbor to hop to can be a nice but not necessary perk. For example, using https://docs.julialang.org/en/v1/stdlib/Random/#Random.Sampler.
- [x] Various modes of hopping constants need to be supported. In particular, we should support hopping rates of forms D_{s,i}, D_s * L_{i,j}, L_{s,i,j}, D_{s,i}*L_{i,j}, where D_s is the diffusivity constant of species s and L is the discrete Laplacian matrix. Only the first of these is implemented in the PR. This change will require tweaking the code in
nsm.jl. - [ ] The escaping boundary condition for
CartesianGridneeds to be supported. This can be done with a special sink site: it neighbors all boundary sites and does not get updated if it is the target site of a hop. This change will require tweaking the code innsm.jl. - [x] Constant-rate jumps should be supported. Currently, only mass-action jumps are allowed. This change will require tweaking the code in
nsm.jl. - [x] Implement the flattening SSA: turn all spatial hops into reactions and simulate using a non-spatial SSA. Can modify https://github.com/SciML/DiffEqJump.jl/pull/131 to do that.
- [x] Accept variable rx rates in flattened SSA.
- [x] Optimize allocations in
HopRatesGeneral: switch to using cumulative sums to sample neighbor. - [x] implement spatially varying massaction jump rates, including flattening.
- [x] add docstrings on all structs and functions explaining what things are.
- [x] make a tutorial for setting up a spatial problem in different ways (cartesian grid, arbitrary graph) and solving with different solvers (NSM, DirectCRonDirect, flattening). For example, use A+B <--> C. Make a PR to SciMLTutorials.jl.
- [x] setup a comprehensive benchmarking suite for spatial solvers and make a PR to SciMLBenchmarks.jl
- [x] ~add jump counter to SSAStepper to count jumps.~ This can be easily done with a callback as in the benchmark.
- [ ] Choose a default SSA (probably
DirectCRRSSA). Can also do it for non-spatial solvers. - [ ] compare with other software like urdme to see how the spatial solvers in
DiffEqJumpstack up. - [ ] Optimize
CartesianGrid. It should be faster than aGraph. - [ ] Compute the hopping rates out of the site for each site at the aggregation step of the simulation (instead of essentially recomputing it every time we evaluate the hop rate, e.g. here https://github.com/SciML/DiffEqJump.jl/blob/master/src/spatial/hop_rates.jl#L115). Take, for example, hopping rates of form
Dsi. The quantityhop_rates.hopping_constants[species,site]*outdegree(spatial_system, site)is always the same for any given site. If we precompute this quantity we won't need to store the number of neighbors for each site in the CartesianGrid struct resulting in less memory used. And we would also avoid one multiplication and one pointer access for each update of hopping rates. [EDIT]: maybe store the outdegrees as a dict, where all the interior sites do not appear. This would make the nums_neighbors array O(n^2) instead of O(n^3) where n is linear dimension of the grid. - [x] Add more tests to thoroughly test the spatial functionality.
- [x] Add more spatial solvers. DirectCR-RSSA.
- [ ] (Longer-term) Update the tutorial https://tutorials.sciml.ai/html/jumps/spatial.html. Add more known simple examples with pretty pictures. More visual examples, comparing stochastic solution to deterministic solution. Maybe add a spatial SIR example. Maybe use Catalyst for ease of understanding.
- [ ] (Longer-term) Updating the DifferentialEquations.jl https://diffeq.sciml.ai/stable/tutorials/discrete_stochastic_example/ with some spatial examples.
- [ ] (Longer-term) Add benchmarks of models from other papers like https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005865.
@isaacsas and @ChrisRackauckas might have more TODOs to add to this list.
Additional features to plan for:
- [x] Spatially varying reaction rates (modify MassActionJumps?)
- [ ] Scaling reaction rates by site-dependent factor (e.g. area/volume).
- [x] How to handle spatial
ConstantRateJumps? Assume they take a site index too as input?
Optimizations to try after benchmarking:
-
@inboundsthroughout.
Some more stuff:
- [ ] Spatially distributed reactions and product placement (nearest neighbors, but also neighbors within some stencil)
- [ ] Spatial SSA that follows individual particles or just non-empty voxels (often faster for systems with less particles than spatial sites).
- [ ] Tooling for generating hopping rates by discretization for diffusion, advection-diffusion, and drift-diffusion on structured and unstructured grids.
I am starting to think about SpatialConstantRateJump, analogous to SpatialMassActionJump. We need rate and affect! from each user. I have some questions:
- Should
affect!be allowed to affect other cells? Or only the cell where the reaction happened. I guess there is no real way to stop the user from passing a function that modifies other cells, becauseaffect!takes an integrator as input, and I don't think we should change that interface. - We expect the user to supply the dependency graph if using constant rate jumps, right?
- Should
ratetake the full state vector as input or only the state at the current cell? I am thinking the latter. In that case the function signature is the samerate(u,p,t), whereuis the vector of species in the current cell. If we go with the former, than the signature would berate(u,p,t,i), whereuis now the species-site matrix, andiis the site index.
@isaacsas , what do you think?
We need to provide the location or there would be no way to have a rate that is spatially varying. For example, zero at some sites and non zero at others.