EET next steps
The technical memorandum describing the results of implementation and testing of explicit error terms revealed a number of potential activities that could be pursued to better understand model outcomes and improve computational performance for both Monte Carlo (MC) simulation and simulation with explicit error terms (EET).
- More detailed analysis of changes outside the affected area in the transit scenario. This analysis should initially focus on two key model components, possibly uncovering other issues, and potentially leading to software development activities:
  a. Disaggregate accessibilities: Explore how disaggregate accessibilities are affected by the build scenario and how those accessibilities are merged with households.
  b. Simulation-based constraint mechanism: Explore how the constraint mechanism iterates towards a solution and how the random number sequences used in the mechanism may be better controlled to reduce simulation variance. There are ways to take advantage of explicit error terms in the constraint mechanism that may reduce simulation variance, reduce the number of decision-makers who are made worse off by the introduction of constraints, and potentially even reduce runtime (see issue #815).
- We suggest several avenues to improve software performance. Note that these features may improve the performance of both MC and EET simulation methods.
  a. The current method used to draw random numbers could be replaced with a more modern method. The current method (Mersenne Twister) is based on a Java implementation that is around 20 years old. Newer methods carry much less overhead. This has the potential to improve the speed of both MC and EET simulation.
  b. The method used for sampling of alternatives could be replaced with a simpler method. The current method is referred to as ‘importance-based sampling’, and it is used to select a sample of alternatives that generally follows the utility distribution of the full choice model. It uses a simple destination choice model, where distance is a substitute for a mode choice logsum, to generate this sample. Then mode choice logsums are calculated for each destination in the sample and the full destination choice model is run for this much smaller set of alternatives. Simple random sampling and stratified random sampling are alternative methods. In simple random sampling, the choice set is drawn randomly from all alternatives (considering only alternatives with a positive size term). This could result in some choosers with a set of alternatives whose utilities are all very small. Stratified random sampling ensures that this is unlikely by using a districting system to control the sample. Either method would be much faster than the existing approach.
  c. The user could use MC simulation for sampling alternatives and EET for the full choice model, in order to reduce the runtime associated with sampling of alternatives for EET. However, this may introduce more simulation variance into the results.
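To make the two alternatives in item b concrete, here is a minimal sketch in Python with NumPy. It is an illustration under assumptions, not ActivitySim's sampling code: the function names, the `districts` array, and the toy size terms are all hypothetical.

```python
import numpy as np


def simple_random_sample(size_terms, n, rng):
    """Draw up to n alternatives uniformly from those with a positive size term."""
    eligible = np.flatnonzero(size_terms > 0)
    n = min(n, eligible.size)
    return rng.choice(eligible, size=n, replace=False)


def stratified_random_sample(size_terms, districts, n_per_district, rng):
    """Draw up to n_per_district eligible alternatives from every district."""
    picks = []
    for d in np.unique(districts):
        eligible = np.flatnonzero((districts == d) & (size_terms > 0))
        if eligible.size == 0:
            continue  # nothing to sample in this district
        k = min(n_per_district, eligible.size)
        picks.append(rng.choice(eligible, size=k, replace=False))
    return np.concatenate(picks) if picks else np.array([], dtype=int)


# Toy data (hypothetical): 8 zones grouped into 3 districts.
rng = np.random.default_rng(0)
size_terms = np.array([5.0, 0.0, 2.0, 1.0, 0.0, 3.0, 4.0, 1.0])
districts = np.array([0, 0, 0, 1, 1, 1, 2, 2])

simple = simple_random_sample(size_terms, 3, rng)
stratified = stratified_random_sample(size_terms, districts, 1, rng)
```

The simple sample can land entirely in one part of the region, while the stratified sample is guaranteed to include one zone from every district that has an eligible zone. Neither function evaluates utilities, which is where the runtime saving over importance-based sampling would come from.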
- Test the explicit error term code against the Monte Carlo simulation code for a real-world economic appraisal, as a measure of effectiveness of the code and the software enhancements recommended above.
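For readers comparing the two methods, the difference can be sketched for a single multinomial logit choice. This is a simplified illustration, assuming i.i.d. standard Gumbel error terms and toy utilities; the function names are hypothetical and this is not the code on the EET branch.

```python
import numpy as np


def mnl_probs(utilities):
    """Multinomial logit probabilities (shifted by the max for numerical stability)."""
    z = np.exp(utilities - utilities.max())
    return z / z.sum()


def choose_mc(utilities, n, rng):
    """Monte Carlo: compute probabilities, then pick via one uniform draw per chooser."""
    cum = np.cumsum(mnl_probs(utilities))
    idx = np.searchsorted(cum, rng.random(n))
    return np.minimum(idx, len(utilities) - 1)  # guard against float round-off


def choose_eet(utilities, errors):
    """Explicit error terms: add one Gumbel draw per alternative, take the max."""
    return np.argmax(utilities + errors, axis=1)


rng = np.random.default_rng(1)
n = 20_000
base = np.array([1.0, 0.5, -0.2])           # toy base-scenario utilities
build = base + np.array([0.0, 0.0, 0.8])    # alternative 2 improves in the build

mc_shares = np.bincount(choose_mc(base, n, rng), minlength=3) / n

errors = rng.gumbel(size=(n, 3))            # same error terms reused in both scenarios
base_choice = choose_eet(base, errors)
build_choice = choose_eet(build, errors)
eet_shares = np.bincount(base_choice, minlength=3) / n

switched = base_choice != build_choice      # choosers who changed alternative
```

Both methods reproduce the logit shares in expectation. With the error terms held fixed across scenarios, every chooser who switches moves to the improved alternative, which is the property an appraisal comparison would be expected to benefit from.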
@dhensle, could you verify whether this is the most up-to-date branch for EET? https://github.com/ActivitySim/activitysim/tree/explicit_error_terms
If not, could you please update that branch so it has all of the latest commits from you and Jan?
Once this is done, please ping @sumitbindra. He is planning to do some testing on random number generators to help the consortium with scoping the EET work for Phase 11C.
@bwentl, I can confirm that the link you have there is the most up-to-date branch for EET. (The confusion was because we actually used Jan's branch here to run the tests, but the branches are now synced and have the same commit history.)
@sumitbindra you should be able to use the branch Bo lists above for RNG testing.
Just saw this and the discussion at https://github.com/ActivitySim/meeting-notes/issues/51. Was this test done by simply replacing the current RNG with a newer variant? If so, I am not surprised that there isn't much difference. I believe any meaningful runtime savings would come from not having to constantly re-seed the RNG (which would also do away with any additional shifting to get back to the previous state). This can only be done by keeping a larger number of RNGs in memory and seeding each of them once during construction. This would not be practical with MT, but newer RNGs have a much smaller internal state. See, e.g., https://numpy.org/doc/stable/reference/random/bit_generators/mt19937.html and https://numpy.org/doc/stable/reference/random/bit_generators/pcg64.html#numpy.random.PCG64 regarding internal state, and also https://numpy.org/doc/stable/reference/random/performance.html.
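A rough sketch of the idea, using only NumPy. The stream layout (one generator per `stream_id`, seeded from `[BASE_SEED, stream_id]`) is a made-up assumption for illustration and is not how ActivitySim currently organizes its random streams.

```python
import numpy as np

# Internal state: MT19937 keeps a 624-word uint32 key; PCG64 keeps two
# 128-bit integers (see the NumPy bit generator docs linked above).
mt_state_bytes = np.random.MT19937(0).state["state"]["key"].nbytes
pcg_state = np.random.PCG64(0).state["state"]   # dict with 'state' and 'inc'
pcg_state_bytes = 2 * 16

BASE_SEED = 42  # hypothetical global seed


def make_rng(stream_id):
    """Build a generator for one stream from (BASE_SEED, stream_id)."""
    return np.random.Generator(np.random.PCG64([BASE_SEED, stream_id]))


def draw_reseeded(stream_id, offset, n):
    """Re-seed on every call, then burn `offset` draws to restore the position."""
    rng = make_rng(stream_id)
    rng.random(offset)
    return rng.random(n)


# Alternative: seed every stream once and keep the generators in memory.
streams = {sid: make_rng(sid) for sid in range(1000)}


def draw_persistent(stream_id, n):
    """Continue from wherever this stream's generator currently is."""
    return streams[stream_id].random(n)


first_a = draw_reseeded(7, 0, 3)
second_a = draw_reseeded(7, 3, 2)   # must skip the 3 draws already used
first_b = draw_persistent(7, 3)
second_b = draw_persistent(7, 2)    # no re-seeding, no skipping
```

Both approaches yield the same numbers for a given stream, but the persistent version avoids the construction and fast-forward cost on every call, at the price of holding one small state per stream in memory. Whether that trade-off pays off in practice would need to be measured.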