PSyclone Easier integration of psydata libraries into LFRic build system

At the moment it is rather complicated to use a PSyData library with LFRic. For example, for NaN testing:

Modify the psyclone transformation to apply a psydata transformation.
In the corresponding Makefile(s):
- export IGNORE_DEPENDENCIES= ... nan_test_psy_data_mod
- export EXTERNAL_STATIC_LIBRARIES = ..._nan_test
Copy the corresponding .mod files 'somewhere', i.e. to a directory where the compiler is already looking for include files, or add the corresponding compiler flags so that it looks elsewhere.
Same for the library: the EXTERNAL_STATIC_LIBRARIES definition means that the -l flag is added, but we still need to either copy the library to a directory that is already specified using -L, or add a -L flag so that the linker looks elsewhere.

Suggested use case: when building LFRic, you can specify an environment variable, e.g.: PSYCLONE_PSYDATA=nan (or profile, read_only, extraction). This will then:

exclude the nan (etc.) module from the dependency analysis
add the nan library to be linked in
add a flag to PSyclone command line to apply the transformation (see #929)
sort out the location of the mod and library files.
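
As a rough illustration of the first step, the build system (or a small helper it calls) could read and validate the proposed variable. This is only a sketch: the variable name PSYCLONE_PSYDATA and the four values come from the suggestion above, while the function name `psydata_kind` is made up for this example and is not part of PSyclone or LFRic.

```python
import os

# Values suggested in this issue for PSYCLONE_PSYDATA.
VALID_KINDS = ("nan", "profile", "read_only", "extraction")


def psydata_kind(environ=None):
    """Return the requested PSyData kind, or None if the variable is unset.

    `psydata_kind` is a hypothetical helper, used here for illustration only.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("PSYCLONE_PSYDATA", "").strip()
    if not value:
        # Variable unset or empty: build without any PSyData library.
        return None
    if value not in VALID_KINDS:
        raise ValueError(
            f"PSYCLONE_PSYDATA={value!r} is not one of {VALID_KINDS}")
    return value
```

The remaining steps (dependency exclusion, linking, PSyclone flag) would then all key off this one value.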

Question (@iva, @rupertford, @arporter, @sergisiso): Is an environment variable the best option? I admit I don't have a better idea to add something to the build system that will be visible wherever we need it, but am happy for alternative solutions :)

Once we agree on the mechanism, there are a few different todo items:

[ ] #929 - adding a command-line option (or maybe a config-file option: auto-apply=nan ??) would make it much easier to apply a transformation than having to modify the global.py optimisation scripts (i.e. the scripts could check for the above environment variable ... but that feels very clunky to me). We already have this for profiling, so it should be pretty trivial to support all PSyData transformations instead of only profiling. I think a command-line option would be much better than a config option (otherwise we end up with so many different config files ...).
[ ] We need to find a place to store the compiled libraries. One option might be to store them in the PSyclone installation (it already includes the library in its share directory). The disadvantage is that we might then need several PSyclone installs (for different compilers). Or we add code that copies these libraries into the LFRic build tree (from the PSyclone installation) - that might actually be great, since it could avoid having to exclude the .mod files from the dependency analysis. The more I think about this, the more I like this solution :)
[ ] Potentially modify the build environment ... depending on where the mod and library files are.
[ ] We might want to agree on a 'standard' environment variable that points to the PSyclone installation location (so we can find the wrapper files if we want to copy them as indicated above). This variable would be set in the module that loads PSyclone (I wonder if we can auto-generate the module file somehow?)
[ ] We might also need to fix #1735 - atm we have to modify the Makefile for the psydata wrapper libraries, since they all add a dependency on the infrastructure. This is great for our environment, but bad with LFRic, since it won't compile out-of-the-box: the infrastructure in LFRic does not have a makefile (it points to the working/... directories). We should also take care of paths: the LFRic infrastructure has another level of subdirectories (field/field_mod..., mesh/mesh_mod...), while our copy is flat.

Once we agree, I am happy to open a ticket on LFRic trac to track the required modifications to the build system.

Jul 12 '22 08:07 hiker

Copying the psydata files into the working directory works like a charm - no complaints about dependencies, no issues linking (except https://github.com/stfc/PSyclone/issues/1797 - psydata needs to support logical values now).

Jul 12 '22 16:07 hiker

I've verified the 'copy into source' approach, and it works fine with gungho. I added lfric_trunk/infrastructure/source/psydata/nan/ as a directory with the nan-testing files, and then built gungho with the following modified global.py file (note to self: the order of transformations is important - if nan testing is done too early, other transformations might fail. It might also be because the tree was modified while looping over it; maybe it has to be done in a separate loop? To be investigated):

18,19d17
< from psyclone.psyir.transformations import NanTestTrans
< from psyclone.psyir.nodes import Loop
33d30
<     nan_trans = NanTestTrans()
54d50
<             nan_trans.apply(loop)

No other changes were required: LFRic's dependency analysis picked it up, and it built and linked just fine. Running with PSYDATA_VERBOSE=1 verified that it was properly activated. This leaves the question of the best way to integrate this into LFRic. I think ideally we would copy all PSyData files into the infrastructure directory from the PSyclone module (it shouldn't hurt if they are around and not being used - worst case there are a few ms of increased compile time), but I will discuss this with @TeranIvy when I am doing the LFRic-related changes.
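
For the "separate loop" idea in the note to self above, a minimal sketch of how a script could first collect all loops and only then apply the transformation is shown below. It assumes the `trans(psy)` script entry point and the `psy.invokes.invoke_list` / `invoke.schedule` layout used by PSyclone optimisation scripts at the time. The helper name `apply_in_separate_pass` is made up for this example. Whether this actually avoids the ordering problem is untested, and nested loops (e.g. after colouring) may need extra care.

```python
def apply_in_separate_pass(schedule, node_type, transformation):
    """Collect all matching nodes first, then transform them one by one.

    Hypothetical helper: the list of targets is fixed before the first
    apply() call, so the tree is not being walked while it is modified.
    """
    targets = list(schedule.walk(node_type))
    for node in targets:
        transformation.apply(node)
    return len(targets)


def trans(psy):
    """Apply NaN testing as the last step, after all other transformations."""
    # Imported here so the helper above can be used without PSyclone.
    from psyclone.psyir.nodes import Loop
    from psyclone.psyir.transformations import NanTestTrans

    nan_trans = NanTestTrans()
    for invoke in psy.invokes.invoke_list:
        # ... other transformations from global.py would be applied here ...
        apply_in_separate_pass(invoke.schedule, Loop, nan_trans)
    return psy
```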

Jul 13 '22 02:07 hiker

An environment variable sounds OK to me but perhaps @TeranIvy and @MatthewHambley can comment.

Just to say that the ability to build with support for NaN checking will be really helpful as it will allow the PSyAD test-harness generation to include it.

Jul 15 '22 11:07 arporter

If we are using more than one PSyData library at the same time, each set of source files will have its own copy of psy_data_base.f90, which the dependency code does not like (neither will the compiler, I'd guess), though the various files might actually be identical, assuming they were created from the same jinja template with the same parameters.

Either rename the processed jinja files to have a unique name,
or see if we can have only one copy of otherwise identical files??
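
To find out whether the second option is viable, one could check if the copies really are byte-identical. Below is a small sketch that assumes the psydata/<name>/ directory layout used earlier in this thread. The function name `group_identical_copies` is made up for this example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical_copies(psydata_root, filename="psy_data_base.f90"):
    """Group all copies of `filename` below `psydata_root` by file content.

    Returns a list of groups; each group lists the paths that share the
    same SHA-256 digest. A single group means all copies are identical.
    """
    groups = defaultdict(list)
    for path in sorted(Path(psydata_root).rglob(filename)):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        groups[digest].append(path)
    return list(groups.values())
```

If this returns exactly one group, keeping a single shared copy should be safe. More than one group means the templates were processed with different parameters and renaming is probably needed.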

Jul 19 '22 07:07 hiker

I'm trying to build the gravity-wave mini-app with NaN checking and am having real problems getting it to work. Did you have to rename the psy_data_base.f90 and nan_test_base.f90 files to fit with the LFRic naming convention (i.e. end in _mod.f90)?

Jan 16 '23 11:01 arporter

@arporter, I had to re-check that. It is likely that I never tested the NAN library with the new build approach (just copying the source files into the infrastructure directories); in the past I modified the build system to ignore the psydata libs, and then explicitly linked in the library.

But checking with my current build and the extraction library: no, the files actually do not follow the naming convention (my bad), but I see in the log files:

/home/joerg/work/lfric/trunk/infrastructure/build/tools/DependencyAnalyser \
    -ignore netcdf -ignore MPI -ignore yaxt -ignore pfunit_mod -ignore xios -ignore mod_wait -verbose dependencies.db psydata/extract/psy_data_base.f90
  Scanning psydata/extract/psy_data_base.f90
    Contains module psy_data_base_mod
touch psydata/extract/psy_data_base.t
printf "%s \\033[1mAnalysing\\033[0m %s\n" `date +%H:%M:%S` psydata/extract/kernel_data_netcdf.f90
12:55:15 Analysing psydata/extract/kernel_data_netcdf.f90
/home/joerg/work/lfric/trunk/infrastructure/build/tools/DependencyAnalyser \
    -ignore netcdf -ignore MPI -ignore yaxt -ignore pfunit_mod -ignore xios -ignore mod_wait -verbose dependencies.db psydata/extract/kernel_data_netcdf.f90
  Scanning psydata/extract/kernel_data_netcdf.f90
    Contains module extract_psy_data_mod
    Depends on module extract_netcdf_base_mod
    Depends on module field_r32_mod
    Depends on module field_r32_mod

So the LFRic dependency analyser finds the files just fine.
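
If you want to check this in a long build log without scrolling through it, something like the following sketch could pull out which modules were found in which file. It relies only on the "Scanning ..." and "Contains module ..." lines shown in the output above. The function name `modules_by_file` is made up for this example.

```python
def modules_by_file(log_text):
    """Map each scanned source file to the modules the analyser found in it.

    Only the 'Scanning <file>' and 'Contains module <name>' lines of the
    verbose DependencyAnalyser output are used; everything else is skipped.
    """
    result = {}
    current = None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Scanning "):
            current = stripped[len("Scanning "):]
            result[current] = []
        elif stripped.startswith("Contains module ") and current is not None:
            result[current].append(stripped[len("Contains module "):])
    return result
```

Any psydata file that is missing from the result was not picked up by the build system.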

Can you tell me what the problems are? Here is what I did (for extraction, but if anything, NAN should be simpler):

Create the f90 files from jinja in PSyclone (that's typically just make in the right directory, but then ignore the .o/.mod/.a files, you only need the .f90 files).
Copy the created f90 files to lfric-trunk/infrastructure/source/psydata/extract/ (feel free to rename the directory to nan, but it actually shouldn't matter). I did need the extract directory level; just putting the files into psydata caused problems later.
Do a make clean (unfortunate because of the required time, but again I had some issues where the build system would not pick up the files properly).
Do a VERBOSE=1 make build | tee log (or so), and check for the handling of the nan etc. files. You should see output similar to the output above.
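
The copy step can be scripted. Here is a minimal sketch that copies only the generated .f90 files (skipping the .o/.mod/.a files) into infrastructure/source/psydata/<kind>/ below the LFRic checkout. The function name `copy_psydata_sources` and its arguments are made up for this example; you still need to run make in PSyclone first and make clean in LFRic afterwards.

```python
import shutil
from pathlib import Path


def copy_psydata_sources(generated_dir, lfric_root, kind):
    """Copy the generated .f90 files into the LFRic infrastructure tree.

    Files go to <lfric_root>/infrastructure/source/psydata/<kind>/; the
    extra <kind> directory level is kept, as described in the steps above.
    Returns the names of the copied files.
    """
    dest = Path(lfric_root) / "infrastructure" / "source" / "psydata" / kind
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # Only the Fortran sources are needed; .o/.mod/.a files are ignored.
    for src in sorted(Path(generated_dir).glob("*.f90")):
        shutil.copy2(src, dest / src.name)
        copied.append(src.name)
    return copied
```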

Jan 16 '23 13:01 hiker

Additional todo item:

[ ] Rename the PSyData modules to follow LFRic naming conventions.

Nov 09 '23 02:11 hiker