
IndexMap's constructor hangs if there is a mismatch in the destination ranks

Open IgorBaratta opened this issue 4 years ago • 1 comments

Minimum failing example:

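#include <cstdint>
#include <set>
#include <vector>

#include <dolfinx.h>
#include <mpi.h>

using namespace dolfinx;

int main(int argc, char* argv[])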
{
  common::subsystem::init_logging(argc, argv);
  common::subsystem::init_petsc(argc, argv);

  MPI_Comm mpi_comm{MPI_COMM_WORLD};

  int mpi_size, mpi_rank;
  MPI_Comm_size(mpi_comm, &mpi_size);
  MPI_Comm_rank(mpi_comm, &mpi_rank);

  const int size_local = 100;

  // Create some ghost entries on next process
  std::vector<std::int64_t> ghosts(1);
  ghosts[0] = (mpi_rank + 1) % mpi_size * size_local + 1;
  std::vector<int> global_ghost_owner(ghosts.size(), (mpi_rank + 1) % mpi_size);

  // Compute destination edges
  auto dest_edges = dolfinx::MPI::compute_graph_edges(
      MPI_COMM_WORLD,
      std::set<int>(global_ghost_owner.begin(), global_ghost_owner.end()));
  
  // Make the destination ranks inconsistent on rank 0: add an extra edge
  // (or remove one with pop_back)
  if (mpi_rank == 0)
    dest_edges.push_back(1);

  common::IndexMap idx_map(MPI_COMM_WORLD, size_local, dest_edges, ghosts,
                           global_ghost_owner);

  common::subsystem::finalize_petsc();

  return 0;
}

It hangs in compute_owned_shared on the first MPI_Neighbor_alltoall.

IgorBaratta, Feb 25 '21 10:02

Using init_mpi instead of init_petsc, I get the following error message:

An error occurred in MPI_Neighbor_alltoall
*** reported by process [305790977,0]
*** on communicator MPI COMMUNICATOR 4 CREATE FROM 0
*** MPI_ERR_OTHER: known error not in list
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job

IgorBaratta, Feb 25 '21 10:02

Is this caused by user error? Is there a low-cost check?

@IgorBaratta

garth-wells, Oct 25 '23 12:10

I think this can only be caused by user error (inconsistent input). Checking the inputs for correctness requires communication, so it would be expensive.

IgorBaratta, Oct 30 '23 11:10
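
To make the consistency condition discussed above concrete, here is a minimal serial sketch of what such a check would have to verify. It is not DOLFINx code: find_mismatched_edges is a hypothetical helper, and it assumes the destination and ghost-owner (source) lists of all ranks are available in one place, which is exactly the information that would need communication to gather in a real run. The inputs are consistent when rank r lists d as a destination if and only if d lists r as a ghost owner.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of DOLFINx): serial model of the edge
// consistency check. dest[r] lists the ranks that rank r declares as
// destinations; src[r] lists the ranks that own the ghosts of rank r.
// Returns every (sender, receiver) pair that is declared on one side only.
std::vector<std::pair<int, int>>
find_mismatched_edges(const std::vector<std::vector<int>>& dest,
                      const std::vector<std::vector<int>>& src)
{
  assert(dest.size() == src.size());
  const int size = static_cast<int>(dest.size());

  auto contains = [](const std::vector<int>& v, int x)
  { return std::find(v.begin(), v.end(), x) != v.end(); };

  std::vector<std::pair<int, int>> bad;
  for (int r = 0; r < size; ++r)
  {
    // Edge r -> d declared by the sender but not expected by the receiver
    for (int d : dest[r])
      if (d < 0 || d >= size || !contains(src[d], r))
        bad.emplace_back(r, d);

    // Edge s -> r expected by the receiver but not declared by the sender
    for (int s : src[r])
      if (s < 0 || s >= size || !contains(dest[s], r))
        bad.emplace_back(s, r);
  }
  return bad;
}
```

For the failing example on three ranks, the ghost owners are {{1}, {2}, {0}} and the computed destinations are {{2}, {0}, {1}}, which pass the check. After rank 0 appends 1 to its destinations, the helper reports the single edge (0, 1), because rank 1 does not expect anything from rank 0. The sketch does not detect duplicated edges, which is what the extra push_back produces when run on only two ranks.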