FESTIM

Update dolfinx 0.8

Open RemDelaporteMathurin opened this issue 1 year ago • 7 comments

This PR just updates the version of dolfinx to 0.8.0

RemDelaporteMathurin avatar Apr 26 '24 20:04 RemDelaporteMathurin

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 98.92%. Comparing base (8ebaedd) to head (4249ef7). Report is 12 commits behind head on fenicsx.

Additional details and impacted files
@@           Coverage Diff            @@
##           fenicsx     #764   +/-   ##
========================================
  Coverage    98.92%   98.92%           
========================================
  Files           36       36           
  Lines         1580     1580           
========================================
  Hits          1563     1563           
  Misses          17       17           


codecov[bot] avatar Apr 26 '24 20:04 codecov[bot]

There is a random error occurring on both conda and Docker when running test_xdmf.py. All the tests in this file seem to pass, though, so I don't really know what is going on here.

[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash.
Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0

RemDelaporteMathurin avatar Apr 26 '24 21:04 RemDelaporteMathurin

The error that we are seeing here might be related to https://github.com/FEniCS/dolfinx/issues/3162

We'll wait for a fix before merging this.

RemDelaporteMathurin avatar Apr 29 '24 12:04 RemDelaporteMathurin

Version 0.8.1 has now been released; not sure if that helps with this bug?

jhdark avatar Apr 30 '24 09:04 jhdark

I've narrowed the error down to using this solver configuration:

my_model.solver.convergence_criterion = "incremental"
ksp = my_model.solver.krylov_solver
opts = PETSc.Options()
option_prefix = ksp.getOptionsPrefix()
opts[f"{option_prefix}ksp_type"] = "cg"
opts[f"{option_prefix}pc_type"] = "gamg"
opts[f"{option_prefix}pc_factor_mat_solver_type"] = "mumps"
ksp.setFromOptions()

Running just test_multispecies_problem gives:

Solving H transport problem: 100%|█████████████████████████████████████████████████████████████████████████| 10.0/10.0 [00:02<00:00, 4.74it/s]
WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There is one unused database option. It is:
Option left: name:-nls_solve_pc_factor_mat_solver_type value: mumps source: code
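
The warning above is consistent with how PETSc handles this option: `pc_factor_mat_solver_type` is read by factorization preconditioners (for example `lu`), so with `pc_type` set to `gamg` it is presumably never consumed. As a minimal sketch, assuming the solver output has been captured as text, the leftover options can be extracted from the log with a small helper. The function name `unused_petsc_options` is hypothetical and is not part of FESTIM or petsc4py.

```python
import re

# Matches lines such as:
#   Option left: name:-nls_solve_pc_factor_mat_solver_type value: mumps source: code
OPTION_LEFT = re.compile(
    r"^Option left: name:(?P<name>\S+)(?: value: (?P<value>\S+))?",
    re.MULTILINE,
)


def unused_petsc_options(log_text):
    """Return {option_name: value} for every 'Option left' line in log_text.

    Hypothetical helper: the value is None when the line carries no value.
    """
    return {m.group("name"): m.group("value") for m in OPTION_LEFT.finditer(log_text)}


log = """WARNING! There are options you set that were not used!
WARNING! could be spelling mistake, etc!
There is one unused database option. It is:
Option left: name:-nls_solve_pc_factor_mat_solver_type value: mumps source: code
"""

print(unused_petsc_options(log))
```

A check like this could be used in a test to flag options that were set but never used, instead of relying on someone reading the warning.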

RemDelaporteMathurin avatar May 01 '24 13:05 RemDelaporteMathurin

MWE to reproduce the random segmentation fault:

import numpy as np
import festim as F
from petsc4py import PETSc

# Build and run the same model twice in the same process
for i in range(2):
    my_model = F.HydrogenTransportProblem()
    my_model.mesh = F.Mesh1D(np.linspace(0, 1, num=1000))

    my_mat = F.Material(D_0=1.9e-7, E_D=0.2, name="my_mat")
    my_subdomain = F.VolumeSubdomain1D(id=1, borders=[0, 1], material=my_mat)
    my_model.subdomains = [my_subdomain]

    my_model.species = [F.Species("H")]

    my_model.temperature = 500

    my_model.settings = F.Settings(atol=1e10, rtol=1e-10, transient=False)

    my_model.initialise()

    # Configure the linear solver through the PETSc options database (CG + GAMG)
    my_model.solver.convergence_criterion = "incremental"
    ksp = my_model.solver.krylov_solver
    opts = PETSc.Options()
    option_prefix = ksp.getOptionsPrefix()
    opts[f"{option_prefix}ksp_type"] = "cg"
    opts[f"{option_prefix}pc_type"] = "gamg"
    opts[f"{option_prefix}pc_factor_mat_solver_type"] = "mumps"
    ksp.setFromOptions()

    my_model.run()

This randomly produces:

[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see https://petsc.org/release/faq/#valgrind and https://petsc.org/release/faq/
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash.
Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
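
Because the crash is intermittent, a single run says little about whether a change fixes it. One possible way to estimate how often it happens is to launch the script many times in fresh processes and count the non-zero exit codes. This is a minimal sketch using only the standard library. The helper name `count_failures` and the filename `mwe.py` are assumptions for illustration.

```python
import subprocess
import sys


def count_failures(cmd, runs=20):
    """Run cmd `runs` times in fresh processes and return the non-zero exit codes.

    On POSIX, a negative return code -N means the child was killed by signal N.
    """
    failures = []
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(result.returncode)
    return failures


# Example usage, assuming the MWE is saved as "mwe.py" (hypothetical filename):
#   failures = count_failures([sys.executable, "mwe.py"], runs=50)
#   print(f"{len(failures)} of 50 runs failed with exit codes {sorted(set(failures))}")
```

Comparing the failure count with and without the PETSc option lines gives a rough check on whether those lines are involved, although a low crash rate needs many runs before the comparison means much.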

RemDelaporteMathurin avatar May 01 '24 17:05 RemDelaporteMathurin

I noticed that removing the lines

my_model.solver.convergence_criterion = "incremental"
ksp = my_model.solver.krylov_solver
opts = PETSc.Options()
option_prefix = ksp.getOptionsPrefix()
opts[f"{option_prefix}ksp_type"] = "cg"
opts[f"{option_prefix}pc_type"] = "gamg"
opts[f"{option_prefix}pc_factor_mat_solver_type"] = "mumps"
ksp.setFromOptions()

removes the random segfault.

RemDelaporteMathurin avatar May 01 '24 17:05 RemDelaporteMathurin