gpufort
WiP: Support derived types and make kernels look more like original Fortran code
TBC
Major changes:
- New `GPUFORTRT` runtime, replaces `gpufort_acc_runtime`:
  - Complete C++ rewrite of the `gpufort_acc_runtime`.
  - Adds C header file `gpufortrt_api.h` and a Fortran module `gpufortrt_api.f90`, so that it can be accessed from C/C++ and Fortran.
  - Adds logging with 5+ log levels (controllable via environment variable).
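
To illustrate the environment-controlled logging, here is a minimal Python sketch of the general idea. It is not the C++ runtime code: the variable name `EXAMPLE_LOG_LEVEL`, the helper names, and the "higher number means more verbose" convention are all assumptions made for this example.

```python
import os

# Hypothetical variable name, chosen for this sketch only; the actual
# variable read by the runtime is not stated in this description.
LOG_LEVEL_ENV_VAR = "EXAMPLE_LOG_LEVEL"


def read_log_level(environ=os.environ, default=0):
    """Return the integer log level found in `environ`, or `default`.

    Unset or non-numeric values fall back to `default`; negative
    values are clamped to 0.
    """
    raw = environ.get(LOG_LEVEL_ENV_VAR, "")
    try:
        level = int(raw)
    except ValueError:
        return default
    return max(level, 0)


def log(level, message, configured_level):
    """Print `message` if `level` does not exceed `configured_level`.

    Returns True if the message was printed, False otherwise.
    """
    if level <= configured_level:
        print(f"[log level {level}] {message}")
        return True
    return False
```

Reading the level once at startup and passing it around keeps the filtering logic easy to test without touching the real process environment.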
Other changes:
- `bin/gpufort` and `bin/gpufortfc`: Add options to convert a file only partially, e.g. to convert only OpenACC compute constructs while leaving all other OpenACC directives untranslated.
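
The distinction between compute constructs and other directives can be sketched as follows. This is a rough illustration, not the converter's actual logic: the function names are hypothetical, only free-form `!$acc` sentinels are handled, and continuation lines are ignored. The keyword list (`parallel`, `kernels`, `serial`) follows the OpenACC specification's compute constructs.

```python
import re

# Compute constructs per the OpenACC specification. Combined forms such as
# "parallel loop" start with the same keyword, so matching the first
# keyword after the sentinel is enough for this sketch.
COMPUTE_CONSTRUCTS = ("parallel", "kernels", "serial")

# Free-form sentinel only; an optional "end" precedes the construct keyword.
_ACC_DIRECTIVE = re.compile(r"^\s*!\$acc\s+(end\s+)?(\w+)", re.IGNORECASE)


def is_compute_construct(line):
    """True if `line` opens or closes an OpenACC compute construct."""
    m = _ACC_DIRECTIVE.match(line)
    return bool(m) and m.group(2).lower() in COMPUTE_CONSTRUCTS


def select_for_conversion(lines, only_compute=True):
    """Return the directive lines a partial conversion would touch.

    With `only_compute=True`, data, update, and other non-compute
    directives are left out and would stay untranslated.
    """
    selected = []
    for line in lines:
        if _ACC_DIRECTIVE.match(line) is None:
            continue  # not an OpenACC directive at all
        if not only_compute or is_compute_construct(line):
            selected.append(line)
    return selected
```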
Changes (unsorted):
- Rewrite and break up monolithic templates into individual macros. Use code generation to create Python render methods from the macros. This makes it possible to:
  - use templates more flexibly.
  - test template-based code generation more easily.
  - use templates in other GPUFORT Python modules.
  - use GPUFORT functionality in other Python apps.
- Anticipate new kernel code generation backends:
  - Split `fort2hip` into generic `fort2x` part and `fort2x.hip` part.
  - Put abstract codegen base classes and generators into `fort2x` folder and HIP-specific parts into subfolder `fort2x/hip`.
- Add module `fort2x.hip` that provides routines for creating HIP code generators based on string input (inputs: Fortran declaration list snippet and annotated Fortran loop snippet).
- Replace `hip_auto` launcher by `hip_ps` (ps: problem size), where the first argument is a problem size `dim3` that is derived from the range of the translated loop nest.
- Improve performance of linemapper's preprocessor by using Python string and regex features instead of `pyparsing` when evaluating macros or expressions.
- CUDA Fortran:
  - Support of fixed size device and pinned arrays in programs and procedures.
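
As a rough illustration of how a problem size can be derived from the range of a loop nest, consider the Python sketch below. The helper names are hypothetical. The trip-count formula is the standard Fortran `do`-loop rule. The final step, turning the problem size into a grid by ceiling division with the block size, is an assumption about how a launcher might use the value and is not taken from the generated code.

```python
def trip_count(lower, upper, step=1):
    """Number of iterations of a Fortran `do i = lower, upper, step` loop."""
    if step == 0:
        raise ValueError("step must be non-zero")
    # Fortran rule: max((upper - lower + step) / step, 0). Floor division
    # agrees with truncation here once negative results are clamped to zero.
    return max((upper - lower + step) // step, 0)


def problem_size(loop_ranges):
    """Map up to three loop ranges to an (x, y, z) problem size.

    `loop_ranges` holds (lower, upper) or (lower, upper, step) tuples,
    innermost loop first; missing dimensions are padded with 1.
    """
    if not 1 <= len(loop_ranges) <= 3:
        raise ValueError("expected between one and three loop ranges")
    counts = [trip_count(*r) for r in loop_ranges]
    return tuple(counts + [1] * (3 - len(counts)))


def grid_size(problem, block):
    """Blocks needed per dimension to cover `problem` (ceiling division)."""
    return tuple((p + b - 1) // b for p, b in zip(problem, block))
```

For example, the nest `do j = 0, 9, 2` around `do i = 1, 100` gives `problem_size([(1, 100), (0, 9, 2)]) == (100, 5, 1)`, and with a block of `(32, 4, 1)` the sketch yields a grid of `(4, 2, 1)`.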