H5Fget_access_plist does not return a valid faplid
Summary
When using Cache Vol and Async Vol, it seems that H5Fget_access_plist does not return a valid faplid.
The returned ID is non-negative, but it does not appear to be a property list.
Error Details
% echo $HDF5_VOL_CONNECTOR
cache_ext config=cache_1.cfg;under_vol=512;under_info={under_vol=0;under_info={}}
% mpirun -n 1 ./test
HDF5-DIAG: Error detected in HDF5 (1.13.3-1) MPI-process 0:
  #000: ../../hdf5-dev/src/H5Pfapl.c line 1487 in H5Pget_driver(): can't get driver
    major: Property lists
    minor: Can't get value
  #001: ../../hdf5-dev/src/H5Pfapl.c line 1444 in H5P_peek_driver(): not a file access property list
    major: Property lists
    minor: Inappropriate type
  #002: ../../hdf5-dev/src/H5Pint.c line 4067 in H5P_isa_class(): not a property list
    major: Invalid arguments to routine
    minor: Inappropriate type
Test Program
#include <hdf5.h>
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 10
#define CHECK_ERR(A) {if (A < 0) { printf("Error at line %d: code %d\n", __LINE__, A); }}

int main(int argc, char **argv) {
    herr_t err = 0;
    int mpi_required;
    const char *file_name = "test.h5";
    hid_t fid = -1;     // File ID
    hid_t faplid = -1;  // File Access Property List
    hid_t plist_id = -1;
    hid_t faplid2 = -1;

    // init MPI
    err = MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &mpi_required);
    CHECK_ERR(err);

    // create file
    faplid = H5Pcreate(H5P_FILE_ACCESS);
    CHECK_ERR(faplid);
    H5Pset_fapl_mpio(faplid, MPI_COMM_WORLD, MPI_INFO_NULL);
    fid = H5Fcreate(file_name, H5F_ACC_TRUNC, H5P_DEFAULT, faplid);
    CHECK_ERR(fid);

    // get faplid
    faplid2 = H5Fget_access_plist (fid);
    CHECK_ERR (faplid2);
    plist_id = H5Pget_driver (faplid2); // Error occurs here

    if (fid >= 0)
        H5Fclose(fid);
    if (faplid >= 0)
        H5Pclose(faplid);
    MPI_Finalize();
    return 0;
}
Library Versions (commit hashes)
- HDF5 develop branch: HDFGroup/hdf5@b5598575bb8a2495d6f306233b00d612258ad718
- Argobots main branch: pmodels/argobots@dce6e727ffc4ca5b3ffc04cb9517c6689be51ec5
- AsyncVol develop branch: hpc-io/vol-async@0a92d232ed01ecbb6ab59fbfa4807458c88922a7
- Cache Vol develop branch: hpc-io/vol-cache@f453900b64cfbc5d3197acb5292e6e379ce2ac20
Will this issue be addressed soon?
Hi @wkliao @yzanhua, this is an issue in the HDF5 library. I encountered it when running E3SM-IO, and I had to comment out H5Fget_access_plist in the code to make it run. I mentioned it to Neil before. Maybe report this to HDF5?
I am not sure whether this is an HDF5 issue. @yzanhua tested the small program he provided in this PR with the following configurations. It failed only when using Cache+Async VOLs.
- Cache+Async VOL: fail
- Cache VOL only: success
- Passthrough VOL only: success
- Log VOL only: success
Versions used: HDF5 1.13.3, Cache VOL master branch, Async VOL v1.4.
It also fails when using Async only. It seems like Async VOL (instead of Cache VOL) is not handling faplid correctly.
Yes, it is with Async + HDF5. @houjun, did you encounter this issue before?
Yes, I remember it is related to the future ID when async is used. I'll take another look and check with the HDF people.
The provided test program failed in H5Pget_driver (the line where the invalid faplid2 is first used). I also replaced H5Pget_driver with other H5Pget_xxx calls to see whether the program still fails. The results might help with debugging.
H5Pget_driver_info, H5Pget_fapl_mpio, and H5Pget_fapl_core fail with the same error messages, complaining about "not a property list".
However, H5Pget_fclose_degree and H5Pget_evict_on_close run without a problem.