GATE 9.2 crashes with Water volume
Describe the bug We are trying to migrate from version 9.1 to 9.2 and our simulation crashes with a segmentation violation. I identified that the issue is the Water material. As soon as the material changes to anything else, the simulation succeeds. We are using a standard water definition and the Materials.xml from the 9.2 tag in this repository. The error message looks as follows:
### CAUGHT SIGNAL: 11 ### address: 0x46299c0, signal = SIGSEGV, value = 11, description = segmentation violation. Invalid permissions for mapped object.
Backtrace:
[PID=2123180, TID=-2][0/1]> p
: Segmentation fault (Invalid permissions for mapped object [0x46299c0])
/runGate.sh: line 4: 2123180 Aborted (core dumped) Gate $1
Desktop (please complete the following information):
- OS: Ubuntu 20.04
- Docker image opengatecollaboration/gate:9.2-docker run with Singularity
Minimal example I created a repository with a minimal example and instructions how to run it: https://github.com/alex-a-nerd/gate-9-2-water-issue. Here is the main macro file:
/gate/geometry/setMaterialDatabase GateMaterials.db
/gate/world/geometry/setXLength 3500.0 mm
/gate/world/geometry/setYLength 3500.0 mm
/gate/world/geometry/setZLength 3500.0 mm
/gate/world/daughters/name phantom
/gate/world/daughters/insert box
/gate/phantom/geometry/setXLength 200.0 mm
/gate/phantom/geometry/setYLength 200.0 mm
/gate/phantom/geometry/setZLength 160.0 mm
/gate/phantom/setMaterial Water
/gate/world/daughters/name detectorContainer
/gate/world/daughters/systemType scanner
/gate/world/daughters/insert box
/gate/detectorContainer/geometry/setXLength 450.0 mm
/gate/detectorContainer/geometry/setYLength 450.0 mm
/gate/detectorContainer/geometry/setZLength 450.0 mm
/gate/detectorContainer/placement/setTranslation 0 0 450 mm
/gate/detectorContainer/setMaterial Air
/gate/detectorContainer/daughters/name detector
/gate/detectorContainer/daughters/insert box
/gate/detector/geometry/setXLength 200.0 mm
/gate/detector/geometry/setYLength 200.0 mm
/gate/detector/geometry/setZLength 1.0 mm
/gate/detector/setMaterial Silicon
/gate/detector/attachCrystalSD
/gate/physics/addPhysicsList QBBC_EMZ
/gate/run/initialize
/gate/source/addSource beam PencilBeam
/gate/source/beam/setEnergy 230 MeV
/gate/source/beam/setSigmaEnergy 0 MeV
/gate/source/beam/setPosition 0 0 -500 mm
/gate/source/beam/setSigmaX 2 mm
/gate/source/beam/setSigmaY 2 mm
/gate/source/beam/setSigmaTheta 2.5 mrad
/gate/source/beam/setSigmaPhi 2.5 mrad
/gate/source/beam/setRotationAxis 0 1 0
/gate/source/beam/setRotationAngle 0 rad
/gate/source/beam/setEllipseXThetaEmittance 3 mm*mrad
/gate/source/beam/setEllipseXThetaRotationNorm negative
/gate/source/beam/setEllipseYPhiEmittance 3 mm*mrad
/gate/source/beam/setEllipseYPhiRotationNorm negative
/gate/source/beam/setParticleType proton
/gate/output/allowNoOutput
/gate/application/setTotalNumberOfPrimaries 1
/gate/application/start
Additional context The example succeeds with any of the following changes:
- Run on GATE 9.1
- Remove the Water section from Materials.xml
- Change the material of the "phantom" volume to Air
Thank you for this precise description and minimal example! I think this is a "double free" problem. At some point of time, GateMaterialDatabase::ReadMaterialFromDBFile with materialName="Water" is called and then the following happens:
-
The
G4MaterialPropertiesTable *GateMaterialDatabase::water_MPTis allocated here. -
Two properties are added and then there is a line
material->SetMaterialPropertiesTable(water_MPT);Note that this function (see the code) does not only store
water_MPTinmaterial->fMaterialPropertiesTable, but alsodeletes the pointer that was there before. At this point of time this is not problematic, as the prior value wasNULL. -
But a bit later in this line, there is again a call
material->SetMaterialPropertiesTable(table);So this time,
water_MPTisdelete'd. -
When the destructor of
GateMaterialDatabaseis called, it tries to againdelete water_MPTin this line.
The problem can be mitigated by removing the delete water_MPT statement in the destructor of GateMaterialDatabase. This might leak memory because fMaterialPropertiesTable is not delete'd in the destructor of G4Material (see the code). I think this is acceptable because the function is not called too often, or is it?
A better fix would require to make clear whether a G4Material instance owns the object that its member fMaterialPropertiesTable points to. If it does own it, it must delete it in its destructor. If it does not own it, it cannot delete it in SetMaterialPropertiesTable.
Hi Max and Alex, thanks for reporting. I think the first solution is simpler, can you try with your examples and, if succeeded, propose PR, please ?
Of course any memory issues should be fixed (btw @maxaehle I think that you are doing really awesome work with valgrind!) but apart from that, I remember that some time ago we decided to remove "Water" from the Gate material database because users should use G4_WATER instead, the standard geant4 material. But when I "git pull" the latest version of Gate, I see that "Water" is actually still in the database. Did I forget to remove it? I see that "Air" is definitely gone. Question to @alex-a-nerd : does your macro crash if you use "G4_WATER" instead of "Water"?
It works with G4_WATER. I missed the memo about Water not being usable any longer. I will use this moving forward.
If Water is not supposed to exist, we can see if the memory problem even needs fixing or if that is the solution.
Air is still in Materials.xml, by the way. When removing Water, it should then also be removed from Materials.xml along with Air.
Thanks for your help!
If Water is not supposed to exist, we can see if the memory problem even needs fixing or if that is the solution.
If materialName is G4_WATER, the function GateMaterialDatabase::ReadMaterialFromDBFile returns already in this line and therefore does not reach the if block where water_MPT is set, so there is no "double free" any more.
Actually right now, the body of the if block will never execute for valid input:
- If
materialNameisG4_WATER, the function returns earlier as stated above. -
materialNamebeingWateris an invalid input. - For any other
materialNamethe condition is false.
I wonder if the if block, which sets RINDEX and ABSLENGTH for water, should be moved upwards in front of the return statement. Also for other NIST materials, right now the Materials.xml is not consulted and these quantities are not set.