Weird behaviour while loading EESSI with multiple modulefiles
Hi,
I get a very strange error while trying to load EESSI (source /cvmfs/software.eessi.io/versions/2023.06/init/lmod/bash):
stack traceback:
[C]: in function 'next'
...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:341: in function 'l_readCacheFile'
...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:561: in function 'build'
...mpat/linux/x86_64/usr/share/Lmod/libexec/ModuleA.lua:685: in function 'singleton'
...compat/linux/x86_64/usr/share/Lmod/libexec/MName.lua:183: in function 'l_lazyEval'
...compat/linux/x86_64/usr/share/Lmod/libexec/MName.lua:262: in function 'sn'
...6/compat/linux/x86_64/usr/share/Lmod/libexec/Hub.lua:312: in function 'load'
.../linux/x86_64/usr/share/Lmod/libexec/MainControl.lua:1055: in function 'load'
.../linux/x86_64/usr/share/Lmod/libexec/MainControl.lua:1031: in function 'load_usr'
...pat/linux/x86_64/usr/share/Lmod/libexec/cmdfuncs.lua:550: in function 'l_usrLoad'
...pat/linux/x86_64/usr/share/Lmod/libexec/cmdfuncs.lua:578: in function 'Load_Usr'
...pat/linux/x86_64/usr/share/Lmod/libexec/cmdfuncs.lua:711: in function 'Reset'
...pat/linux/x86_64/usr/share/Lmod/libexec/cmdfuncs.lua:867: in function 'cmd'
...3.06/compat/linux/x86_64/usr/share/Lmod/libexec/lmod:514: in function 'main'
...3.06/compat/linux/x86_64/usr/share/Lmod/libexec/lmod:585: in main chunk
[C]: ?
Basically, we have a lot of modulefiles elsewhere, and they get overridden when this file is sourced (that probably belongs in a separate issue).
module --version
Modules based on Lua: Version 8.7.23 2023-03-29 17:19 -05:00
by Robert McLay [email protected]
We are using OpenHPC on Rocky Linux. Contents of /etc/os-release:
NAME="Rocky Linux"
VERSION="9.4 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.4 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2032-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.4"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.4"
Any idea would be helpful.
Before you run the source command, which modules are loaded? Can you share the output of module list?
Hi @boegel
[dernatr@io-login-02 ~]$ module list
No modules loaded
[dernatr@io-login-02 ~]$ type module
module is a function
module ()
{
if [ -z "${LMOD_SH_DBG_ON+x}" ]; then
case "$-" in
*v*x*)
__lmod_sh_dbg='vx'
;;
*v*)
__lmod_sh_dbg='v'
;;
*x*)
__lmod_sh_dbg='x'
;;
esac;
fi;
if [ -n "${__lmod_sh_dbg:-}" ]; then
set +$__lmod_sh_dbg;
echo "Shell debugging temporarily silenced: export LMOD_SH_DBG_ON=1 for Lmod's output" 1>&2;
fi;
eval "$($LMOD_CMD shell "$@")" && eval "$(${LMOD_SETTARG_CMD:-:} -s sh)";
__lmod_my_status=$?;
if [ -n "${__lmod_sh_dbg:-}" ]; then
echo "Shell debugging restarted" 1>&2;
set -$__lmod_sh_dbg;
fi;
unset __lmod_sh_dbg;
return $__lmod_my_status
}
Thank you!
In case it helps:
[root@io-login-02 ~]# rpm -qa |grep -E "lua|lmod"
lua-libs-5.4.4-4.el9.x86_64
lua-srpm-macros-1-6.el9.noarch
lua-posix-35.0-8.el9.x86_64
lua-5.4.4-4.el9.x86_64
lua-filesystem-1.8.0-5.el9.x86_64
lmod-ohpc-8.7.53-320.ohpc.3.1.x86_64
mod_lua-2.4.57-11.el9_4.1.x86_64
Can you also share the output of ml --config?
It seems like there's a problem with reading an Lmod spider cache (though it's unclear which one).
In the meantime, you can set up the EESSI environment the other way, by sourcing the non-Lmod init script instead:
source /cvmfs/software.eessi.io/versions/2023.06/init/bash
Thanks for your answer, @boegel!
Indeed, if I clear the cache, it seems to work fine (when sourcing the non-Lmod file).
Output of ml --config:
Modules based on Lua: Version 8.7.53 2024-10-12 19:57 -05:00
by Robert McLay [email protected]
Description Value
----------- -----
Allow root to use Lmod (LMOD_ALLOW_ROOT_USE) yes
Allow TCL modulefiles (LMOD_ALLOW_TCL_FILES) yes
Auto swapping (LMOD_AUTO_SWAP) no
Avail Style (LMOD_AVAIL_STYLE) <system>
Case Independent Sorting (LMOD_CASE_INDEPENDENT_SORTING) no
Colorize Lmod (LMOD_COLORIZE) no
Configuration dir (LMOD_CONFIG_DIR) /etc/lmod
Disable Same Name AutoSwap (LMOD_DISABLE_SAME_NAME_AUTOSWAP) no
Display Extension w/ avail (LMOD_AVAIL_EXTENSIONS) yes
Use ~/.config dir only (LMOD_USE_DOT_CONFIG_ONLY) no
Downstream Module Conflicts (LMOD_DOWNSTREAM_CONFLICTS) no
Allow duplicate paths (LMOD_DUPLICATE_PATHS) no
Dynamic Spider Cache (LMOD_DYNAMIC_SPIDER_CACHE) yes
Require Exact Match/no defaults (LMOD_EXACT_MATCH) no
Export the module command (LMOD_EXPORT_MODULE) yes
Allow extended default (LMOD_EXTENDED_DEFAULT) yes
Use attached TCL over system call (LMOD_FAST_TCL_INTERP) yes
Is fast TCL interp available (LMOD_USING_FAST_TCL_INTERP) yes
File ignore patterns (LMOD_FILE_IGNORE_PATTERNS) {"%.version[-._].*", "%.modulerc[-._].*"}
Use italic instead of dim (LMOD_HIDDEN_ITALIC) no
KSH Support (LMOD_KSH_SUPPORT) no
Language used for err/msg/warn (LMOD_LANG) en
Site message file (LMOD_SITE_MSG_FILE) <empty>
LD_LIBRARY_PATH at config time (LMOD_LD_LIBRARY_PATH) <empty>
LD_PRELOAD at config time (LMOD_LD_PRELOAD) <empty>
LuaFileSystem version 1.8.0
Lmod version 8.7.53
Lmod branch (LMOD_BRANCH) main
lmod_config.lua location (LMOD_CONFIG_LOCATION) no
Lua Version Lua 5.4
LUA_CPATH /usr/lib64/lua/5.4/?.so;/usr/lib64/lua/5.4/loadall.so;
LUA_PATH /usr/share/lua/5.4/?.lua;/usr/share/lua/5.4/?/init.lua;/usr/lib64/lua/5.4/?.lua;/usr/lib64/lua/5.4/?/init.lua
System lua-term (LMOD_HAVE_LUA_TERM) no
Active lua-term true
Modules Auto Handling (MODULES_AUTO_HANDLING) no
MODULERC (LMOD_MODULERC) /opt/ohpc/admin/lmod/etc/rc -> <empty>
avail: Include modulepath dir (LMOD_MPATH_AVAIL) no
MODULEPATH_INIT (LMOD_MODULEPATH_INIT) /opt/ohpc/admin/lmod/lmod/init/.modulespath -> <empty>
MODULEPATH_ROOT (MODULEPATH_ROOT) /opt/ohpc/admin/modulefiles
NAG File (LMOD_ADMIN_FILE) /opt/ohpc/admin/lmod/etc/admin.list
number of cache dirs 0
OS Name Rocky Linux 9.4 (Blue Onyx)
Pager (LMOD_PAGER) /usr/bin/more
Pager Options (LMOD_PAGER_OPTS) -XqMREF
Path to HashSum (LMOD_HASHSUM_PATH) /usr/bin/sha1sum
Path to Lua /usr/bin/lua
Pin Versions in restore (LMOD_PIN_VERSIONS) no
Pkg Class name Pkg
Lmod prefix /opt/ohpc/admin
Site controlled prefix (SITE_CONTROLLED_PREFIX) no
Prepend order (LMOD_PREPEND_BLOCK) normal
LMOD_RC (LMOD_RC) <empty>
Redirect to stdout (LMOD_REDIRECT) yes
Supporting Full Settarg Use (LMOD_SETTARG_FULL_SUPPORT) no
User shell bash
Site Name (LMOD_SITE_NAME) <empty>
Site Pkg location standard
Ignore Cache (LMOD_IGNORE_CACHE) no
Cached loads (LMOD_CACHED_LOADS) no
System Default Modules (LMOD_SYSTEM_DEFAULT_MODULES) <empty>
System Name (LMOD_SYSTEM_NAME) <empty>
SYSHOST (cluster name) (LMOD_SYSHOST) <empty>
TCL Version 8.6.10
Terse Decorations (LMOD_TERSE_DECORATIONS) yes
User cache valid time(sec) (LMOD_ANCIENT_TIME) 86400
Write cache after (sec) (LMOD_SHORT_TIME) 2
Threshold (sec) (LMOD_THRESHOLD) 1
Tmod find first rule (LMOD_TMOD_PATH_RULE) no
Tmod prepend PATH Rule (LMOD_TMOD_PATH_RULE) no
Tracing (LMOD_TRACING) no
uname -a Linux io-login-02.io.internal 5.14.0-427.42.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Oct 31 14:01:51 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
User Cache Directory /root/.cache/lmod
Admin file /opt/ohpc/admin/lmod/etc/admin.list
Changes from Default Configuration
----------------------------------
Name Where Set Default Value
---- --------- ------- -----
LFS_VERSION D 1.6.3 1.8.0
LMOD_AUTO_SWAP C yes no
LMOD_COLORIZE E yes no
LMOD_PACKAGE_PATH D nil <empty>
LMOD_PAGER C less /usr/bin/more
LMOD_REDIRECT C no yes
LMOD_SITEPACKAGE_LOCATION Other /opt/ohpc/admin/lmod/8.7.53/libexec/SitePackage.lua <srctree>
LMOD_SYSTEM_DEFAULT_MODULES D __unknown__ <empty>
LMOD_TCLSH C tclsh /usr/bin/tclsh
MODULEPATH_ROOT C /opt/ohpc/admin/modulefiles
PATH_TO_LUA C lua /usr/bin/lua
Where Set -> D: default, E: environment, C: configuration
lmod_cfg: lmod_config.lua SitePkg: SitePackage StdPkg: StandardPackage
Other: Set somewhere outside of normal locations
Active RC file(s):
------------------
/opt/ohpc/admin/lmod/8.7.53/init/lmodrc.lua
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Lmod Property Table (LMOD_RC):
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
propT = {
arch = {
displayT = {
gpu = {
color = "red",
doc = "built for GPU",
full_color = false,
long = "(g)",
short = "(g)",
},
["gpu:mic"] = {
color = "red",
doc = "built natively for MIC and GPU",
full_color = false,
long = "(g,m)",
short = "(gm)",
},
["gpu:mic:offload"] = {
color = "red",
doc = "built natively for MIC and GPU and offload to the MIC",
full_color = false,
long = "(g,m,o)",
short = "(@)",
},
mic = {
color = "blue",
doc = "built for host and native MIC",
full_color = false,
long = "(m)",
short = "(m)",
},
["mic:offload"] = {
color = "blue",
doc = "built for host, native MIC and offload to the MIC",
full_color = false,
long = "(m,o)",
short = "(*)",
},
offload = {
color = "blue",
doc = "built for offload to the MIC only",
full_color = false,
long = "(o)",
short = "(o)",
},
},
validT = {
gpu = 1,
mic = 1,
offload = 1,
},
},
lmod = {
displayT = {
sticky = {
color = "red",
doc = "Module is Sticky, requires --force to unload or purge",
long = "(S)",
short = "(S)",
},
},
validT = {
sticky = 1,
},
},
state = {
displayT = {
experimental = {
color = "blue",
doc = "Experimental",
long = "(E)",
short = "(E)",
},
obsolete = {
color = "red",
doc = "Obsolete",
long = "(O)",
short = "(O)",
},
testing = {
color = "green",
doc = "Testing",
long = "(T)",
short = "(T)",
},
},
validT = {
experimental = 1,
obsolete = 1,
testing = 1,
},
},
status = {
displayT = {
active = {
color = "yellow",
doc = "Module is loaded",
long = "(L)",
short = "(L)",
},
},
validT = {
active = 1,
},
},
}
If I source the non-Lmod file without clearing the cache, the sourcing itself works, but module av fails with:
/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/lua5.1: ...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:341: bad argument #1 to 'next' (table expected, got boolean)
stack traceback:
[C]: in function 'next'
...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:341: in function 'l_readCacheFile'
...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:561: in function 'build'
...mpat/linux/x86_64/usr/share/Lmod/libexec/ModuleA.lua:685: in function 'singleton'
...6/compat/linux/x86_64/usr/share/Lmod/libexec/Hub.lua:1134: in function 'avail'
...pat/linux/x86_64/usr/share/Lmod/libexec/cmdfuncs.lua:145: in function 'cmd'
...3.06/compat/linux/x86_64/usr/share/Lmod/libexec/lmod:514: in function 'main'
...3.06/compat/linux/x86_64/usr/share/Lmod/libexec/lmod:585: in main chunk
Would there be a better way, either inside EESSI or outside of it, to check the cache and clear it before loading? And does this mean we cannot mix several module configurations?
EDIT: Loading EESSI works after clearing the cache, but the modules themselves cannot be loaded afterwards...
module load PyTorch/2.1.2-foss-2023a
/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/lua5.1: ...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:341: bad argument #1 to 'next' (table expected, got boolean)
stack traceback:
[C]: in function 'next'
...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:341: in function 'l_readCacheFile'
...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:561: in function 'build'
...mpat/linux/x86_64/usr/share/Lmod/libexec/ModuleA.lua:685: in function 'singleton'
...compat/linux/x86_64/usr/share/Lmod/libexec/MName.lua:183: in function 'l_lazyEval'
...compat/linux/x86_64/usr/share/Lmod/libexec/MName.lua:262: in function 'sn'
...6/compat/linux/x86_64/usr/share/Lmod/libexec/Hub.lua:312: in function 'load'
.../linux/x86_64/usr/share/Lmod/libexec/MainControl.lua:1055: in function 'load'
.../linux/x86_64/usr/share/Lmod/libexec/MainControl.lua:1031: in function 'load_usr'
...pat/linux/x86_64/usr/share/Lmod/libexec/cmdfuncs.lua:550: in function 'l_usrLoad'
...pat/linux/x86_64/usr/share/Lmod/libexec/cmdfuncs.lua:578: in function 'cmd'
...3.06/compat/linux/x86_64/usr/share/Lmod/libexec/lmod:514: in function 'main'
...3.06/compat/linux/x86_64/usr/share/Lmod/libexec/lmod:585: in main chunk
[C]: ?
That cache issue is really weird. This time, I am able to load the modules (tested with the gnuplot and PyTorch modules). I think I just have to make sure my environment is clean and the cache is cleared before using it...
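For reference, a minimal sketch of clearing the per-user spider cache before sourcing the EESSI init script. It assumes the cache lives in ~/.cache/lmod (the location listed as "User Cache Directory" in the ml --config output) and that only the spiderT* files need to be removed. The function name clear_user_spider_cache is hypothetical; it is not part of Lmod or EESSI.

```shell
# Hypothetical helper: delete Lmod's per-user spider cache files.
# The default directory is an assumption taken from the "User Cache Directory"
# line of `ml --config`; pass another directory as $1 if your site differs.
clear_user_spider_cache() {
    cache_dir="${1:-$HOME/.cache/lmod}"
    # Nothing to do if the directory does not exist.
    [ -d "$cache_dir" ] || return 0
    # Only spiderT* files are removed; anything else in the directory is kept.
    rm -f -- "$cache_dir"/spiderT*
}
```

Calling clear_user_spider_cache with no arguments right before source /cvmfs/software.eessi.io/versions/2023.06/init/lmod/bash should let EESSI's Lmod start without a leftover user cache. Lmod is expected to write a new user cache when it needs one.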
@remyd1 Can you try running Lmod in debug mode when the crash happens, capture the output, and share the file (it'll be quite big)? Something like:
module -DDD load example 2>&1 | tee lmod-debug.out
Hi @boegel,
Sorry for the delay, I was on vacation.
Description Value
----------- -----
Allow root to use Lmod (LMOD_ALLOW_ROOT_USE) yes
Allow TCL modulefiles (LMOD_ALLOW_TCL_FILES) yes
Auto swapping (LMOD_AUTO_SWAP) yes
Avail Style (LMOD_AVAIL_STYLE) <system>
Case Independent Sorting (LMOD_CASE_INDEPENDENT_SORTING) yes
Colorize Lmod (LMOD_COLORIZE) no
Configuration dir (LMOD_CONFIG_DIR) /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod
Disable Same Name AutoSwap (LMOD_DISABLE_SAME_NAME_AUTOSWAP) no
Display Extension w/ avail (LMOD_AVAIL_EXTENSIONS) yes
Use ~/.config dir only (LMOD_USE_DOT_CONFIG_ONLY) no
Allow duplicate paths (LMOD_DUPLICATE_PATHS) no
Dynamic Spider Cache (LMOD_DYNAMIC_SPIDER_CACHE) yes
Require Exact Match/no defaults (LMOD_EXACT_MATCH) no
Export the module command (LMOD_EXPORT_MODULE) yes
Allow extended default (LMOD_EXTENDED_DEFAULT) yes
Use attached TCL over system call (LMOD_FAST_TCL_INTERP) yes
Is fast TCL interp available (LMOD_USING_FAST_TCL_INTERP) yes
Use italic instead of dim (LMOD_HIDDEN_ITALIC) no
KSH Support (LMOD_KSH_SUPPORT) yes
Language used for err/msg/warn (LMOD_LANG) en
Site message file (LMOD_SITE_MSG_FILE) <empty>
LD_LIBRARY_PATH at config time (LMOD_LD_LIBRARY_PATH) <empty>
LD_PRELOAD at config time (LMOD_LD_PRELOAD) <empty>
LuaFileSystem version 1.8.0
Lmod version 8.7.23
lmod_config.lua location (LMOD_CONFIG_LOCATION) no
Lua Version Lua 5.1
LUA_CPATH /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/lua/5.1/?.so;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/lua/5.1/loadall.so
LUA_PATH /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/lua/5.1/?.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/lua/5.1/?/init.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/lua/5.1/?.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/lua/5.1/?/init.lua
System lua-term (LMOD_HAVE_LUA_TERM) yes
Active lua-term true
MODULERCFILE (LMOD_MODULERCFILE) /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/../etc/rc -> <empty>
avail: Include modulepath dir (LMOD_MPATH_AVAIL) no
MODULEPATH_INIT (LMOD_MODULEPATH_INIT) /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/init/.modulespath -> <empty>
MODULEPATH_ROOT (MODULEPATH_ROOT) /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/etc/modulefiles
number of cache dirs 2
OS Name Gentoo Linux
Pager (LMOD_PAGER) /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/less
Pager Options (LMOD_PAGER_OPTS) -XqMREF
Path to HashSum (LMOD_HASHSUM_PATH) /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/sha1sum
Path to Lua /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/lua5.1
Pin Versions in restore (LMOD_PIN_VERSIONS) no
Pkg Class name Pkg
Lmod prefix /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod
Site controlled prefix (SITE_CONTROLLED_PREFIX) yes
Prepend order (LMOD_PREPEND_BLOCK) normal
LMOD_RC (LMOD_RC) <empty>
Redirect to stdout (LMOD_REDIRECT) no
Supporting Full Settarg Use (LMOD_SETTARG_FULL_SUPPORT) no
User shell bash
Site Name (LMOD_SITE_NAME) Gentoo
Site Pkg location /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/SitePackage.lua
Ignore Cache (LMOD_IGNORE_CACHE) no
Cached loads (LMOD_CACHED_LOADS) yes
System Default Modules (LMOD_SYSTEM_DEFAULT_MODULES) <empty>
System Name (LMOD_SYSTEM_NAME) <empty>
SYSHOST (cluster name) (LMOD_SYSHOST) Gentoo
TCL Version 8.6.13
User cache valid time(sec) (LMOD_ANCIENT_TIME) 86400
Write cache after (sec) (LMOD_SHORT_TIME) 2
Threshold (sec) (LMOD_THRESHOLD) 1
Tmod find first rule (LMOD_TMOD_PATH_RULE) no
Tmod prepend PATH Rule (LMOD_TMOD_PATH_RULE) no
Tracing (LMOD_TRACING) no
uname -a Linux io-login-01.io.internal 5.14.0-427.42.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Oct 31 14:01:51 UTC 2024 x86_64 AMD EPYC-Genoa Processor AuthenticAMD GNU/Linux
User Cache Directory /home/dernatr/.cache/lmod
Admin file /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/../etc/admin.list
Changes from Default Configuration
----------------------------------
Name Where Set Default Value
---- --------- ------- -----
LFS_VERSION D 1.6.3 1.8.0
LMOD_CACHED_LOADS D no yes
LMOD_CASE_INDEPENDENT_SORTING C no yes
LMOD_COLORIZE E yes no
LMOD_CONFIG_DIR E /etc/lmod /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod
LMOD_HAVE_LUA_TERM C no yes
LMOD_KSH_SUPPORT C no yes
LMOD_PACKAGE_PATH D nil /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod
LMOD_PAGER C less /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/less
LMOD_SITEPACKAGE_LOCATION Other /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/libexec/SitePackage.lua /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/SitePackage.lua
LMOD_SITE_NAME C false Gentoo
LMOD_SYSHOST C false Gentoo
LMOD_SYSTEM_DEFAULT_MODULES D __unknown__ <empty>
LMOD_TCLSH C tclsh /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/tclsh
MODULEPATH_ROOT C /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/etc/modulefiles
PATH_TO_LUA C lua /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/lua5.1
SITE_CONTROLLED_PREFIX C no yes
Where Set -> D: default, E: environment, C: configuration
lmod_cfg: lmod_config.lua SitePkg: SitePackage StdPkg: StandardPackage
Other: Set somewhere outside of normal locations
Active RC file(s):
------------------
/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/libexec/../init/lmodrc.lua
/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/lmodrc.lua
Cache Directory Time Stamp File
--------------- ---------------
/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/etc/lmod_cache/spider_cache /cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/etc/lmod_cache/system.txt
/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/cache /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/cache/timestamp
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Lmod Property Table:
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
propT = {
arch = {
displayT = {
gpu = {
color = "red",
doc = "built for GPU",
full_color = false,
long = "(g)",
short = "(g)",
},
["gpu:mic"] = {
color = "red",
doc = "built natively for MIC and GPU",
full_color = false,
long = "(g,m)",
short = "(gm)",
},
["gpu:mic:offload"] = {
color = "red",
doc = "built natively for MIC and GPU and offload to the MIC",
full_color = false,
long = "(g,m,o)",
short = "(@)",
},
mic = {
color = "blue",
doc = "built for host and native MIC",
full_color = false,
long = "(m)",
short = "(m)",
},
["mic:offload"] = {
color = "blue",
doc = "built for host, native MIC and offload to the MIC",
full_color = false,
long = "(m,o)",
short = "(*)",
},
offload = {
color = "blue",
doc = "built for offload to the MIC only",
full_color = false,
long = "(o)",
short = "(o)",
},
},
validT = {
gpu = 1,
mic = 1,
offload = 1,
},
},
lmod = {
displayT = {
sticky = {
color = "red",
doc = "Module is Sticky, requires --force to unload or purge",
long = "(S)",
short = "(S)",
},
},
validT = {
sticky = 1,
},
},
state = {
displayT = {
experimental = {
color = "blue",
doc = "Experimental",
long = "(E)",
short = "(E)",
},
obsolete = {
color = "red",
doc = "Obsolete",
long = "(O)",
short = "(O)",
},
testing = {
color = "green",
doc = "Testing",
long = "(T)",
short = "(T)",
},
},
validT = {
experimental = 1,
obsolete = 1,
testing = 1,
},
},
status = {
displayT = {
active = {
color = "yellow",
doc = "Module is loaded",
long = "(L)",
short = "(L)",
},
},
validT = {
active = 1,
},
},
}
lmod(-DDD load PyTorch/2.1.2-foss-2023a){
Date: Fri May 2 09:50:44 2025
Hostname: io-login-01.io.internal
System: Linux 5.14.0-427.42.1.el9_4.x86_64
Version: #1 SMP PREEMPT_DYNAMIC Thu Oct 31 14:01:51 UTC 2024
Lua Version: 5.1
Lmod Version: 8.7.23 2023-03-29 17:19 -05:00
package.path: /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/?.lua;/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/?/init.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/libexec/?.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/libexec/../tools/?.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/libexec/../tools/?/init.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/libexec/../shells/?.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/libexec/?/init.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/lua/5.1/?.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/lua/5.1/?/init.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/lua/5.1/?.lua;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/lua/5.1/?/init.lua
package.cpath: /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/../lib/?.so;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/share/Lmod/libexec/../lib/?.so;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/lua/5.1/?.so;/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/lib64/lua/5.1/loadall.so
lmodPath: /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod
LOADEDMODULES: nil
shellNm: bash, Shell:name(): bash
Calling Hub:singleton(checkMPATH) w checkMPATH: true
Hub:singleton(safe: true){
s_hub: table: 0x55efe062a3c0, safe: true
} Hub:singleton
cmd name: load
Load_Usr(PyTorch/2.1.2-foss-2023a){
FrameStk:l_new(){
MT:singleton(){
getMT: Sz: 2
getMT: nm:_ModuleTable001_, v: X01vZHVsZVRhYmxlXyA9IHsKTVR2ZXJzaW9uID0gMywKY19yZWJ1aWxkVGltZSA9IGZhbHNlLApjX3Nob3J0VGltZSA9IGZhbHNlLApkZXB0aFQgPSB7fSwKZmFtaWx5ID0ge30sCm1UID0ge30sCm1wYXRoQSA9IHsKIi9jdm1mcy9zb2Z0d2FyZS5lZXNzaS5pby9ob3N0X2luamVjdGlvbnMvMjAyMy4wNi9zb2Z0d2FyZS9saW51eC94ODZfNjQvYW1kL3plbjQvbW9kdWxlcy9hbGwiLCAiL2N2bWZzL3NvZnR3YXJlLmVlc3NpLmlvL3ZlcnNpb25zLzIwMjMuMDYvc29mdHdhcmUvbGludXgveDg2XzY0L2FtZC96ZW40L21vZHVsZXMvYWxsIiwgIi9ldGMvc2NsL21vZHVsZWZpbGVzIiwgIi9vcHQvb2hwYy9wdWIvbW9kdWxlZmlsZXMiCiwgIi90cmluaXR5L3NoYXJlZC9tb2R1bGVm
getMT: nm:_ModuleTable002_, v: aWxlcy9tb2R1bGVncm91cHMiLCAiL3RyaW5pdHkvc2hhcmVkL21vZHVsZWZpbGVzL0NWLXN0YW5kYXJkIiwgIi90cmluaXR5L3NoYXJlZC9tb2R1bGVmaWxlcy9sb2NhbCIsCn0sCnN5c3RlbUJhc2VNUEFUSCA9ICIvb3B0L29ocGMvcHViL21vZHVsZWZpbGVzIiwKfQo=
MT l_new(s,restoreFn:nil){
currentMPATH: /cvmfs/software.eessi.io/host_injections/2023.06/software/linux/x86_64/amd/zen4/modules/all:/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/modules/all:/etc/scl/modulefiles:/opt/ohpc/pub/modulefiles:/trinity/shared/modulefiles/modulegroups:/trinity/shared/modulefiles/CV-standard:/trinity/shared/modulefiles/local
} MT l_new
s_mt = {
MTversion = 3,
c_rebuildTime = false,
c_shortTime = false,
depthT = {},
family = {},
mT = {},
mpathA = {
"/cvmfs/software.eessi.io/host_injections/2023.06/software/linux/x86_64/amd/zen4/modules/all", "/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/modules/all", "/etc/scl/modulefiles"
, "/opt/ohpc/pub/modulefiles", "/trinity/shared/modulefiles/modulegroups", "/trinity/shared/modulefiles/CV-standard", "/trinity/shared/modulefiles/local",
},
systemBaseMPATH = "/opt/ohpc/pub/modulefiles",
}
} MT:singleton
} FrameStk:l_new
l_usrLoad(argA, check_must_load: true){
Setting mcp to MC_Load
MainControl:load_usr(mA={PyTorch/2.1.2-foss-2023a}){
l_registerUserLoads(mA){
userName: PyTorch/2.1.2-foss-2023a
} l_registerUserLoads
MainControl:load(mA={PyTorch/2.1.2-foss-2023a}){
Hub:singleton(safe: nil){
s_hub: table: 0x55efe062a3c0, safe: true
} Hub:singleton
Hub:load(mA={PyTorch/2.1.2-foss-2023a}){
Hub:load i: 1, userName: PyTorch/2.1.2-foss-2023a
Cache:singleton(){
Cache:l_new(){
ReadLmodRC:singleton(){
} ReadLmodRC:singleton
#scDescriptT: 2
Adding: dir: /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/cache, timestamp: 1746164815
} Cache:l_new
s_cache.buildCache: nil
spiderDirT[/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/modules/all]: false
spiderDirT[/etc/scl/modulefiles]: false
spiderDirT[/opt/ohpc/pub/modulefiles]: false
spiderDirT[/trinity/shared/modulefiles/modulegroups]: false
spiderDirT[/trinity/shared/modulefiles/local]: false
} Cache:singleton
Cache:build(fast=nil){
self.buildCache: true
buildFresh: false
Cache l_readCacheFile(mpathA, spiderTFnA){
#spiderTFnA: 1
cacheFile found: /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen4/.lmod/cache/spiderT.luac_5.1
valid: true, timeDiff: 1
} Cache l_readCacheFile
Cache l_readCacheFile(mpathA, spiderTFnA){
#spiderTFnA: 1
Did not find: /home/dernatr/.cache/lmod/spiderT.x86_64_Linux.luac_5001
cacheFile found: /home/dernatr/.cache/lmod/spiderT.x86_64_Linux.lua
valid: true, timeDiff: 80899.428221941
/cvmfs/software.eessi.io/versions/2023.06/compat/linux/x86_64/usr/bin/lua5.1: ...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:341: bad argument #1 to 'next' (table expected, got boolean)
stack traceback:
[C]: in function 'next'
...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:341: in function 'l_readCacheFile'
...compat/linux/x86_64/usr/share/Lmod/libexec/Cache.lua:561: in function 'build'
...mpat/linux/x86_64/usr/share/Lmod/libexec/ModuleA.lua:685: in function 'singleton'
...compat/linux/x86_64/usr/share/Lmod/libexec/MName.lua:183: in function 'l_lazyEval'
...compat/linux/x86_64/usr/share/Lmod/libexec/MName.lua:262: in function 'sn'
...6/compat/linux/x86_64/usr/share/Lmod/libexec/Hub.lua:312: in function 'load'
.../linux/x86_64/usr/share/Lmod/libexec/MainControl.lua:1055: in function 'load'
.../linux/x86_64/usr/share/Lmod/libexec/MainControl.lua:1031: in function 'load_usr'
...pat/linux/x86_64/usr/share/Lmod/libexec/cmdfuncs.lua:550: in function 'l_usrLoad'
...pat/linux/x86_64/usr/share/Lmod/libexec/cmdfuncs.lua:578: in function 'cmd'
...3.06/compat/linux/x86_64/usr/share/Lmod/libexec/lmod:514: in function 'main'
...3.06/compat/linux/x86_64/usr/share/Lmod/libexec/lmod:585: in main chunk
[C]: ?
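The debug trace above shows the user cache being accepted as valid with a timeDiff of about 80899 seconds, just under the 86400-second LMOD_ANCIENT_TIME from the configuration report. A rough way to check from the shell whether a given cache file still falls inside that window is sketched below. It assumes timeDiff is simply the file's age based on its modification time, and it relies on GNU stat. cache_is_fresh is a hypothetical helper, not an Lmod command.

```shell
# Hypothetical helper: succeed if FILE was modified less than MAX_AGE seconds ago.
# Assumption: Lmod's "timeDiff" for a user cache is (now - file mtime).
cache_is_fresh() {
    file="$1"
    max_age="${2:-86400}"   # default mirrors LMOD_ANCIENT_TIME in the report
    [ -f "$file" ] || return 1
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 1   # GNU coreutils stat
    [ $((now - mtime)) -lt "$max_age" ]
}
```

For example, cache_is_fresh ~/.cache/lmod/spiderT.x86_64_Linux.lua && echo "still inside the validity window" would indicate that a cache written less than a day ago, by whichever Lmod wrote it last, can still be picked up.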
BTW, I can ignore the cache (and then it works) with:
source /cvmfs/software.eessi.io/versions/2023.06/init/lmod/bash
LMOD_IGNORE_CACHE=yes module {spider,available,load,...}
Another possibly useful piece of information: this is not a vanilla OpenHPC distro but a StackHPC Slurm appliance based on Rocky Linux with OpenHPC repositories.
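If the LMOD_IGNORE_CACHE workaround above turns out to be needed regularly, it could be wrapped in a small shell function so the variable does not have to be typed each time. This is only a sketch: eessi_module is a hypothetical name, and it assumes the module function has already been defined by sourcing the EESSI Lmod init script.

```shell
# Hypothetical wrapper: run any module subcommand with the spider cache ignored.
# Assumes `module` is already defined, e.g. after
#   source /cvmfs/software.eessi.io/versions/2023.06/init/lmod/bash
eessi_module() {
    # The variable is set only for this one invocation of `module`.
    LMOD_IGNORE_CACHE=yes module "$@"
}
```

Usage would look like eessi_module load PyTorch/2.1.2-foss-2023a or eessi_module spider gnuplot. Ignoring the cache makes Lmod walk the module tree directly, so commands such as spider and avail may be noticeably slower.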
@remyd1 We ran into a similar issue on our HPC-UGent systems, where the underlying cause was missing cache files for cascadelake and icelake. Could that be related here?
See https://gitlab.com/eessi/support/-/issues/167 for more details.