Add NO_DEFAULT_PATH to find_program (modcc)
When installing arbor to a custom location next to an arbor install at a default location, arbor-build-catalogue will still use the modcc from the default location.
E.g.:

```sh
# install default location
pip3 install arbor
# install custom location
git clone --depth 1 --recursive --branch=SDE 'https://github.com/boeschf/arbor'
cd arbor ; mkdir build ; cd build
cmake -DCMAKE_INSTALL_PREFIX=$(realpath ~/prefix) -DARB_USE_BUNDLED_LIBS=ON -DARB_WITH_PYTHON=ON ..
make -j 12 && make install
```
Now, calling ~/prefix/bin/arbor-build-catalogue on a mechanism containing a WHITE_NOISE source will fail because the wrong modcc is referenced. arbor-build-catalogue actually finds the right modcc in Python, but CMake then ends up finding the wrong path.
This is fixed by adding the NO_DEFAULT_PATH option to find_program.
It would be nice if there were some kind of 'search the given paths first, then the default paths' option to prevent that, but I can't find one.
Maybe we can have something like

```cmake
find_program(modcc NAMES modcc PATHS {exec_path} NO_DEFAULT_PATH)
if(NOT modcc)
  find_program(modcc NAMES modcc)
endif()
```

?
That sounds like a good idea!
So I just tried the fallback suggestion. However, the python code already makes it impossible for a-b-c to ever hit the fallback case, due to this code:
```python
for path in [
    exec_path / "modcc",
    data_path / "BuildModules.cmake",
    pack_path / "arbor-config.cmake",
]:
    if not path.exists():
        print(f"Could not find required tool: {path}. Please check your installation.")
        exit(-1)
```
So if we want a fallback to a default-path modcc, we would have to remove this Python check, which does not make much sense in my opinion.
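If a fallback is wanted anyway, the check could instead be relaxed so that only the CMake helpers are hard requirements and a missing modcc merely warns, letting CMake's second `find_program` call search the default paths. This is only a hypothetical sketch; the function and path names are assumptions, not Arbor's actual code:

```python
import sys
from pathlib import Path

def check_tools(exec_path, data_path, pack_path):
    # Hard requirements: there is no fallback for these CMake helpers.
    required = [
        data_path / "BuildModules.cmake",
        pack_path / "arbor-config.cmake",
    ]
    for path in required:
        if not path.exists():
            print(f"Could not find required tool: {path}. Please check your installation.")
            sys.exit(-1)
    # modcc is optional here: CMake's fallback find_program can still
    # locate a system-wide modcc if this one is missing.
    modcc = exec_path / "modcc"
    if not modcc.exists():
        print(f"Warning: {modcc} not found; falling back to modcc on PATH.")
        return None
    return modcc
```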
Sure, go for it, but note that after #1958 the exec_path is sourced from arbor.config()
@llandsmeer could you merge master and
- See if the problem persists
- If yes, make this work with the new a-b-c
Thanks!
Merged. Tried:

```sh
mkdir build-abc
cd build-abc
cmake .. -DCMAKE_INSTALL_PREFIX=$(realpath prefix) -DARB_USE_BUNDLED_LIBS=ON -DARB_WITH_PYTHON=ON
make -j 8
make install
cd prefix/bin
cp -r ../../../mechanisms/default .
PYTHONPATH=../lib/python3.10/site-packages ./arbor-build-catalogue default default
```
(I found it confusing that I could build a-b-c without the python module, but that a-b-c also really doesn't work without the python module. And by default it imports arbor from the wrong location.. I know, these should have been PR comments)
Yet, the problem still persists: the wrong modcc is called, leading to a crash (my system modcc is pretty old).
The fix is still adding the NO_DEFAULT_PATH option:

```cmake
find_program(modcc NAMES modcc PATHS {bindir} NO_DEFAULT_PATH)
```
> I found it confusing that I could build a-b-c without the python module, but that a-b-c also really doesn't work without the python module.

Good point. Adding defaults to the defaults might fix it, or requiring Python but making it built-in.
> And by default it imports arbor from the wrong location..

How so?
> I know, these should have been PR comments)

Hindsight is 20/20, I guess, but better to notice it now rather than after the release.
> And by default it imports arbor from the wrong location..

> How so?
`import arbor` will just import the system default (probably pip-installed) arbor. If you used a custom prefix, you'll import the wrong arbor. Probably something like

```python
import os, sys, site

# Prepend the site-packages directory relative to this script's install prefix.
sys.path.insert(0, os.path.abspath(os.path.join(__file__, '../..' + site.USER_SITE.removeprefix(site.USER_BASE))))
import arbor
```
solves that. It will still fall back to system-arbor if you didn't install the python package / moved a-b-c to a different location.
It'll use whatever is in scope. I was thinking that might be the behaviour 90% of users expect. Shall we add a 'look here' parameter?
pip has its own rules for resolving packages (and package locations). We have learned that trying to outsmart pip leads to unintended consequences (for which we're now on the hook).
Using a custom prefix, and indeed using CMake instead of pip, is in principle at the user's peril. By default, a-b-c looks for modcc in the same prefix, which is probably what the user wants, and certainly if they installed through pip. Finding out whether or not a-b-c was installed to a custom prefix
- The above requires `modcc` is indeed built and installed. Is that a guarantee? If not, add.
- The above does not mean advanced use could not use a configurable modcc. Is that what you need (in earlier discussion I understood that, but memory...)? If so, it can just be added as a parameter to CMake, right?
- `a-b-c` the script has no dependency on the Python Arbor module. Should it? The script builds a library object that can be loaded by Arbor, and it ingests `.mod`, `.cpp` or `.cu`, correct? Where does the Python dep come in?
> It'll use whatever is in scope. I was thinking that might be the behaviour 90% of users expect.

It's not the behaviour I expect when I install arbor to a different prefix. Then I expect it to use the modcc from that prefix without having to think about Python import paths. Luckily it fails loudly on my system, because the system arbor has no 'prefix' attribute in the config, so it's clear what's happening; but at some later point it will just silently start using the wrong modcc. If I'm alone in expecting this behaviour, we can just keep things this way, of course.
> By default, a-b-c looks for modcc in the same prefix

No, this is what changed in the a-b-c PR. It now uses `import arbor` to get a config of where everything is installed.
And in both the old and the new version, the wrong modcc will still be called even if it finds the right prefix, because of the missing NO_DEFAULT_PATH.
https://github.com/arbor-sim/arbor/blob/master/scripts/arbor-build-catalogue#L3

```python
from pathlib import Path

import arbor as A

config = A.config()
prefix = Path(config['prefix'])
```
Whelp, I was looking at a pre-#1958 a-b-c, sorry! You are right: `import arbor` can resolve to an entirely unrelated build. What's the reason this was switched from picking modcc in the a-b-c script dir? In which case is that the wrong assumption?
See description of #1958, but if needed:
Before, we splatted in the paths at configure time. That breaks for some files (e.g. CMake scripts) under skbuild. We never actually used a-b-c/../modcc; also, those might be in different prefixes. Further, we cannot move a-b-c then, if its location is tied to that of modcc.
So, now, we peek into the environment the user has active. Assumption being that this is what they want to use.
If a custom prefix is used and not set up in PYTHONPATH there's all kinds of disaster waiting to happen, no?
The current situation will also lead to 'disasters', the problem being that these are mostly silent (only noticeable if modcc gained new features in the prefix). Maybe we should just print a warning if the Python and a-b-c prefixes don't match, but nothing more. I don't think I've ever updated PYTHONPATH to point to arbor; I always modify sys.path.
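Such a warning could be as small as comparing the script's own location against the prefix the imported arbor module reports. A sketch only: the helper name and the assumption that a-b-c lives in `<prefix>/bin` are mine, and `arbor.config()['prefix']` is taken from the snippet quoted above:

```python
import sys
from pathlib import Path

def warn_on_prefix_mismatch(script_path, arbor_prefix):
    """Warn if the a-b-c script and the imported arbor module
    live under different installation prefixes."""
    # Assumption: a-b-c is installed to <prefix>/bin, so its prefix is two levels up.
    script_prefix = Path(script_path).resolve().parent.parent
    if script_prefix != Path(arbor_prefix).resolve():
        print(f"Warning: arbor-build-catalogue lives in {script_prefix}, "
              f"but the imported arbor module reports prefix {arbor_prefix}. "
              "modcc may be picked up from the wrong installation.",
              file=sys.stderr)
        return True
    return False
```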
You should probably be doing neither and use venvs instead. Just saying. :P
Question in all honesty though: how do you expect binaries/tools to be discovered if you put them in some off-the-PATH (!) location? So, assume you have
- /usr/local/bin/modcc
- ~/my-projects/arbor/bin/modcc
Usually, we would pick /usr/local/bin/modcc, unless you added your custom prefix to PATH. Thus I'd conclude that this should be the order of preference:
- Use a venv for custom installs
- Put modcc/arbor in your default location.
- Use PYTHONPATH
- Use sys.path
This seems also to be the opinion of the wider public.
How about this: the script gets a user prompt for the modcc path, pre-filled with what it gathers from arbor.config().
Besides fixing a-b-c, shall we fix ARB_MODCC in this PR?
> How about this: the script gets a user prompt for the modcc path, pre-filled with what it gathers from arbor.config().

a-b-c is called automatically in both my models and nmlcc output, so that wouldn't work.
-> In that case, the safest way would be a Python method `arbor.a_b_c(args)`, which by default uses the right arbor module (because that was explicitly imported by the user), thereby finds the right pythonpath/prefix/a-b-c/modcc, and might even abstract away the recompile-if-changed loop that all these scripts have.
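Such a wrapper might look roughly like the sketch below. Everything here is hypothetical: the function name, the mtime-based rebuild check, and the assumption that `<prefix>/bin/arbor-build-catalogue` is the script to shell out to are illustrations of the proposal, not an existing API:

```python
import subprocess
from pathlib import Path

def build_catalogue(name, mech_dir, prefix, output_dir="."):
    """Hypothetical arbor.a_b_c(): rebuild a catalogue only if its
    .mod sources are newer than the existing library."""
    mech_dir = Path(mech_dir)
    lib = Path(output_dir) / f"{name}-catalogue.so"
    sources = list(mech_dir.glob("*.mod"))
    # Recompile-if-changed: skip the build when the library is newer
    # than every .mod source.
    if lib.exists() and all(s.stat().st_mtime <= lib.stat().st_mtime for s in sources):
        return lib
    abc = Path(prefix) / "bin" / "arbor-build-catalogue"
    subprocess.run([str(abc), name, str(mech_dir)], check=True)
    return lib
```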
> shall we fix ARB_MODCC in this PR?

Sounds good, but it could be a different PR as well (it's different files).
```python
for arg in vars(args):
    print(arg, getattr(args, arg))
```

Will print all args and the values they were set to; in general great for making configuration visible and loggable.
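A self-contained illustration of that pattern; the flag names here are made up for the demo and are not a-b-c's actual options:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--prefix", default="/usr/local")   # made-up flag
parser.add_argument("--verbose", action="store_true")   # made-up flag
args = parser.parse_args(["--prefix", "/home/me/prefix"])

# Print every argument and the value it resolved to,
# whether set explicitly or by default.
for arg in vars(args):
    print(arg, getattr(args, arg))
# prints:
#   prefix /home/me/prefix
#   verbose False
```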
In the future (next release?) it might be a good idea to move a-b-c into the Python module. It is already dependent on it. Then a user could directly call `arbor.arbor_build_catalogue.main()` without having to worry about importing the wrong arbor. I can write a PR for that later.
@llandsmeer why do you add this printing step? Did you take a look at --help? It reports all the active defaults.
> `for arg in vars(args): print(arg, getattr(args, arg))` Will print all args and the value they were set to; in general great to make configuration visible and loggable.

How about passing --help and seeing the active defaults?
@thorstenhater it does not look like --help shows settings passed by the user, only built-in defaults. Also, I like to have parameters printed without specifically requesting it. Makes these kinds of things easier to log/catch.
Yes, I do expect people to remember which parameters they passed. Tricky are the implicitly set defaults; those are shown when you ask for help. BTW, those are not built-in but sourced from the arbor module in scope.
I suggest making tools as little spammy as possible. If the user asks for it (there's -v for that precise reason), different story. The reasoning that it's 'easier to catch' is a fallacy; it's just hoping that a user picks up a subtle clue in an output they've seen (and ignored) 100x before. I'd get mad at a tool that regurgitates all my explicit parameters back to me every single time instead of just doing its job.
I had a chat w/ @llandsmeer and we decided on a different policy:
- do the right thing, i.e. use the Arbor in scope, but
- version modcc, arbor, and a-b-c
- if versions are incompatible, tell the user and bail
- make a-b-c a library, or more likely integrate it into modcc
- fix @llandsmeer's 'special' setup ;)
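The "tell the user and bail" step of that policy could be as simple as the sketch below; the helper and the idea that each component can report a comparable version string are assumptions for illustration:

```python
import sys

def check_versions(arbor_version, modcc_version, abc_version):
    """Return False (and complain) if arbor, modcc, and a-b-c
    disagree on their version; True otherwise."""
    versions = {"arbor": arbor_version, "modcc": modcc_version, "a-b-c": abc_version}
    if len(set(versions.values())) > 1:
        detail = ", ".join(f"{k}={v}" for k, v in versions.items())
        print(f"Incompatible versions ({detail}); "
              "please make sure all components come from the same install.",
              file=sys.stderr)
        return False
    return True
```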
That said, it's also pretty obvious we'll need another pass on this to polish it up to a level I am happy with.
- ~~All path arguments are tacked onto `prefix` unless the argument is an absolute path, then use that verbatim~~ TIL that's what `os.path.join` does. Nice.
- Specialise paths to what they are actually used for, e.g. binary_path -> modcc, and so on.
For its final form:
3. Roll a-b-c into modcc.
4. Make that spit out the BuildModules.cmake in its standalone version (gets rid of the pathing/installation/versioning of this annoying CMake file).
5. Although, what I'd truly want here would be to get rid of the dependency on CMake when building catalogues. We should be OK here with just calling `cxx -I... -L...`. One less weird indirection just to find a lib dir.
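Dropping CMake in favour of a direct compiler call could look roughly like this; the flags and the `<prefix>/include` / `<prefix>/lib` layout are assumptions for the sketch, not Arbor's actual build line:

```python
from pathlib import Path

def build_compile_command(name, cpp_sources, prefix, cxx="c++"):
    """Assemble a direct compiler invocation for a loadable catalogue,
    with include/lib dirs taken from the install prefix. The returned
    list would be handed to subprocess.run(cmd, check=True)."""
    prefix = Path(prefix)
    out = f"{name}-catalogue.so"
    return [cxx, "-shared", "-fPIC",
            f"-I{prefix / 'include'}",   # headers from the install prefix
            f"-L{prefix / 'lib'}",       # link against the matching libarbor
            "-larbor",
            "-o", out,
            *map(str, cpp_sources)]
```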
> BTW, those are not built-in but sourced from the arbor module in scope.

And not any parameters passed, which gave rise to this issue (modcc). That should definitely be in the solution then.
> Yes, I do expect people to remember which parameters they passed. [...] I'd get mad at a tool that regurgitates all my explicit parameters back to me every single time instead of just doing its job.

I hear seasoned dev opinions ;) That's not our only target. Placing this output under -v is fine.
> And not any parameters passed, which gave rise to this issue (modcc). That should definitely be in the solution then.
What? No. The defaults are pulled from arbor.__config__, if you pass an argument, that takes precedence.
Right! My point was that any flags passed overriding those defaults should be printed as such. The user should be able to know what the eventual config is.
How's this PR hanging?
I just ran into another issue that might be addressed by a change here: if a-b-c and arbor become more tied, it would be awfully practical if users did not have to shell out to build mechanisms, i.e. `arbor.mech_cat_from_dir(mydir)`. This would help in environments where users are confined to Jupyter interfaces. Also, certain environments seem not to propagate $PATH, which makes usage of a-b-c very impractical: https://gitlab.ebrains.eu/technical-coordination/project-internal/devops/platform/ebrains-spack-builds/-/issues/14
Yeah, that's exactly one of the steps we outlined above: build-catalogue becomes part of the arbor module and can be used from within Python and/or C++.