arbor icon indicating copy to clipboard operation
arbor copied to clipboard

Add NO_DEFAULT_PATH to find_program (modcc)

Open llandsmeer opened this issue 3 years ago • 4 comments

When installing arbor to a custom location next to a arbor install at a default location, arbor-build-catalogue will still use the modcc from the default location.

Eg:

# install default location
pip3 install arbor

# install custom location
git clone --depth 1 --recursive --branch=SDE 'https://github.com/boeschf/arbor'
cd arbor ; mkdir build ; cd build
cmake -DCMAKE_INSTALL_PREFIX=$(realpath ~/prefix) -DARB_USE_BUNDLED_LIBS=ON -DARB_WITH_PYTHON=ON ..
make -j 12 && make install

Now, calling ~/prefix/bin/arbor-build-catalogue on a mechanism containing WHITE_NOISE source will fail because the wrong modcc is refered. Arbor-build-catalogue actually finds the right modcc in python, but then CMake ends up finding the wrong path

This is fixed by adding a NO_DEFAULT_PATH option to find_program

llandsmeer avatar Sep 28 '22 10:09 llandsmeer

It would be nice if there was some kind of 'SEARCH THE GIVEN PATHS FIRST THEN THE DEFAULT PATHS' options to prevent that, but I can't find it.

Maybe, we can have something like

find_program(modcc NAMES modcc PATHS {exec_path} NO_DEFAULT_PATH)

if (NOT modcc)
    find_program(modcc NAMES modcc)
endif()

?

llandsmeer avatar Oct 07 '22 13:10 llandsmeer

That sounds like a good idea!

thorstenhater avatar Oct 10 '22 06:10 thorstenhater

So I just tried the fallback suggestion. However, the python code already makes it impossible for a-b-c to ever hit the fallback case, due to this code:

for path in [
    exec_path / "modcc",
    data_path / "BuildModules.cmake",
    pack_path / "arbor-config.cmake",
]:
    if not path.exists():
        print(f"Could not find required tool: {path}. Please check your installation.")
        exit(-1)

So if we want a fallback to DEFAULT path modcc, we should remove this python code. Which does not make much sense in my opinion

llandsmeer avatar Oct 11 '22 11:10 llandsmeer

Sure, go for it, but note that after #1958 the exec_path is sourced from arbor.config()

thorstenhater avatar Oct 18 '22 07:10 thorstenhater

@llandsmeer could you merge master and

  1. See if the problem persists
  2. If yes, make this work with the new a-b-c

Thanks!

thorstenhater avatar Oct 27 '22 09:10 thorstenhater

Merged. Tried:

mkdir build-abc
cd build-abc
cmake .. -DCMAKE_INSTALL_PREFIX=$(realpath prefix) -DARB_USE_BUNDLED_LIBS=ON -DARB_WITH_PYTHON=ON
make -j 8
make install
cd prefix/bin
cp ../../../mechanisms/default . -r
PYTHONPATH=../lib/python3.10/site-packages ./arbor-build-catalogue default default

(I found it confusing that I could build a-b-c without the python module, but that a-b-c also really doesn't work without the python module. And by default it imports arbor from the wrong location.. I know, these should have been PR comments)

Yet, the problem still persist, the wrong modcc is called leading to a crash (my system modcc is pretty old)

Fix is still adding the NO_DEFAULT_PATH option

find_program(modcc NAMES modcc PATHS {bindir} NO_DEFAULT_PATH)

llandsmeer avatar Oct 28 '22 19:10 llandsmeer

I found it confusing that I could build a-b-c without the python module, but that a-b-c also really doesn't work without the python module.

Good point. Adding defaults to the defaults might fix it, or requiring Python but be built-in.

And by default it imports arbor from the wrong location..

How so?

I know, these should have been PR comments)

Hindsight is 20/20, I guess, but better notice it now, rather than after the release.

thorstenhater avatar Oct 29 '22 18:10 thorstenhater

And by default it imports arbor from the wrong location..

How so?

import arbor will just import the system default (probably pip installed) arbor. If you used a custom prefix, you'll import the wrong arbor. Probably something like

import os, sys, site
sys.path.insert(0, os.path.abspath(os.path.join(__file__, '../..' + site.USER_SITE.removeprefix(site.USER_BASE))))
import arbor

solves that. It will still fall back to system-arbor if you didn't install the python package / moved a-b-c to a different location.

llandsmeer avatar Nov 01 '22 14:11 llandsmeer

It'll use what ever is in scope. I was thinking that might the behaviour 90% of users expect. Shall we add a 'look here' parameter?

thorstenhater avatar Nov 02 '22 12:11 thorstenhater

pip has its own rules for resolving packages (and package locations). We have learned that trying to outsmart pip leads to unintended consequences (for which we're now on the hook).

Using a custom prefix, and indeed using CMAKE instead of pip, is in principle at the user's peril. By default, a-b-c looks for modcc in the same prefix, which is probably what the user wants, and certainly if they installed through pip. Finding out whether or not a-b-c was installed to a custom prefix

  • The above requires modcc is indeed built and installed. Is that a guarantee? If not, add.
  • The above does not mean advanced use could not use a configurable modcc. Is that what you need (in earlier discussion I understood that, but memory...). If so, it can just be added as a parameter to CMAKE, right?
  • a-b-c the script has no dependency on the Python Arbor module. Should it? The script builds a library object that can be loaded by Arbor, and it ingests .mod, .cpp or .cu, correct? Where does the Python dep come in?

brenthuisman avatar Nov 02 '22 12:11 brenthuisman

It'll use what ever is in scope. I was thinking that might the behaviour 90% of users expect.

It's not the behavior I expect when I install arbor to a different prefix. Then I expect it to use the modcc from that prefix without having to think about python import paths. Luckily it fails loudly on my system because the system arbor has no 'prefix' attribute in the config so it's clear what's happening, but at some later point it will just silently start using the wrong modcc. If I'm alone in expecting this behaviour we can just keep things this way of course

By default, a-b-c looks for modcc in the same prefix

No this is what changed in the a-b-c PR. It now uses import arbor to get a config of where everything is installed

And in both the old and new version, still the wrong modcc will be called even if it finds the right prefix because of the missing NO_DEFAULT_PATH

https://github.com/arbor-sim/arbor/blob/master/scripts/arbor-build-catalogue#L3

import arbor as A
config = A.config()
prefix = Path(config['prefix'])

llandsmeer avatar Nov 02 '22 13:11 llandsmeer

Whelp, I was looking at a pre #1958 a-b-c-, sorry! You are right: import arbor can resolve to an entirely unrelated build. What's the reason this was switch from picking modcc in the a-b-c script dir? In which case is that the wrong assumption?

brenthuisman avatar Nov 02 '22 14:11 brenthuisman

See description of #1958, but if needed:

Before we splatted in the paths during configure time. That'll break for some files (eg CMake scripts) under skbuild. We never actually used a-b-c/../modcc; also those might be in different prefixes. Further, we cannot move a-b-c then, if its location is tied to that of modcc.

So, now, we peek into the environment the user has active. Assumption being that this is what they want to use. If a custom prefix is used and not set up in PYTHONPATH there's all kinds of disaster waiting to happen, no?

thorstenhater avatar Nov 02 '22 16:11 thorstenhater

The current situation will also lead to 'disasters', the problem being that these are mostly silent (only noticable if modcc gained new features in the prefix). Maybe we should just print a warning if the python and a-b-c prefix don't match but nothing more. I don't think I've ever updated PYTHONPATH to point to arbor, I always modify sys.path

llandsmeer avatar Nov 02 '22 17:11 llandsmeer

You should probably be doing neither and use venvs instead. Just saying. :P

Question in all honesty though: How do you expect binaries/tool are discovered if you put them in some off-the-path (!) location? So, assume you have

  • /usr/local/bin/modcc
  • ~/my-projects/arbor/bin/modcc

Usually, we would pick /usr/local/bin, unless you added your custom prefix to PATH. Thus I'd conclude that this should be the order of preference

  • Use a venv for custom installs
  • Put modcc/arbor in your default location.
  • Use PYTHONPATH
  • Use sys.path

This seems also to be the opinion of the wider public.

thorstenhater avatar Nov 02 '22 17:11 thorstenhater

How about this: the script gets a user prompt for the modcc path, pre-filled with what it gathers from arbor.config().

Besides fixing a-b-c, shall we fix ARB_MODCC in this PR?

brenthuisman avatar Nov 03 '22 07:11 brenthuisman

How about this: the script gets a user prompt for the modcc path, pre-filled with what it gathers from arbor.config().

a-b-c is called automatically in both my models and nmlcc output so that wouldn't work

-> In that case the most 'safe' way would be a python method arbor.a_b_c(args) that thus by default uses the right arbor module (because that was explicitly imported by the user) and thereby finds the right pythonpath/prefix/a-b-c/modcc and might even abstract away the recompile-if-changed loop that all these scripts have.

shall we fix ARB_MODCC in this PR?

Sounds good, but could be a different PR as well (it's different files)

llandsmeer avatar Nov 03 '22 09:11 llandsmeer

for arg in vars(args):
    print arg, getattr(args, arg)

Will print all args and the value they were set to; in general great to make configuration visible and loggable.

brenthuisman avatar Nov 03 '22 09:11 brenthuisman

In the future (next release?) it might be a good idea to move a-b-c into the python module. It is already dependent on it. Then a user could directly call arbor.arbor_build_catalogue.main() without having to worry about importing the wrong arbor. Can write a PR for that later.

llandsmeer avatar Nov 03 '22 10:11 llandsmeer

In the future (next release?) it might be a good idea to move a-b-c into the python module. It is already dependent on it. Then a user could directly call arbor.arbor_build_catalogue.main() without having to worry about importing the wrong arbor. Can write a PR for that later.

llandsmeer avatar Nov 03 '22 10:11 llandsmeer

@llandsmeer why do you add this printing step? Did you take a look at --help? It reports all the active defaults.

thorstenhater avatar Nov 03 '22 12:11 thorstenhater

for arg in vars(args):
    print arg, getattr(args, arg)

Will print all args and the value they were set to; in general great to make configuration visible and loggable.

How about passing --help and seeing the active default?

thorstenhater avatar Nov 03 '22 12:11 thorstenhater

@thorstenhater it does not look like --help shows settings passed by the user, only built-in defaults. Also, I like to have parameters printed without specifically requesting it. Makes these kinds of things easier to log/catch.

brenthuisman avatar Nov 04 '22 10:11 brenthuisman

Yes, I do expect people to remember which parameters they passed. Tricky are the implicitly set defaults, those are shown when you ask for help. BTW, those are not builtin but sourced from the arbor module in scope.

I suggest to make tools as little spammy as possible. If the users asks for it (there's -v for that precise reason), different story. The reasoning that it's 'easier to catch' is a fallacy, it's just hoping that a user picks up a subtle clue in an output they've seen (and ignored) 100x before. I'd get man at a tool that regurgitates all my explicit parameters back to me every single instead of just doing its job.

I had a chat w/ @llandsmeer and we decided on a different policy

  1. do the right thing, ie use Arbor in scope, but
  2. version modcc, arbor, and a-b-c
  3. if versions are incompatible, then tell the user and bail.
  4. Make a-b-c a library, or more likely integrate into modcc.
  5. Fix @llandsmeer 'special' setup ;)

thorstenhater avatar Nov 04 '22 10:11 thorstenhater

That said, it's also pretty obvious we'll need another pass on this to polish it up to a level I am happy with.

  1. ~All path arguments are tacked onto prefix unless the argument is an absolute path, then use that verbatim~ TIL that's what os.path.join does. Nice.
  2. Specialise paths to what they are actually used for, eg binary_path -> modcc, and so on

For its final form 3. Roll a-b-c into modcc 4. make that spit out the buildmodule.cmake in its standalone version (gets rid of the pathing/installation/versioning of this annoying cmake file) 5. Although, what I'd truly want here would be to get rid of the dependency on CMake when building catalogues. We should be ok here with just calling cxx -I... -L.... One less weird indirection just to find a lib dir.

thorstenhater avatar Nov 04 '22 11:11 thorstenhater

BTW, those are not builtin but sourced from the arbor module in scope.

And not any parameters passed, which gave rise to this issue (modcc). That should definitely be in the solution then.

Yes, I do expect people to remember which parameters they passed. [...] I'd get man at a tool that regurgitates all my explicit parameters back to me every single instead of just doing its job.

I hear seasoned dev opinions ;) That's not our only target. Placing this output under -v is fine.

brenthuisman avatar Nov 07 '22 11:11 brenthuisman

And not any parameters passed, which gave rise to this issue (modcc). That should definitely be in the solution then.

What? No. The defaults are pulled from arbor.__config__, if you pass an argument, that takes precedence.

thorstenhater avatar Nov 07 '22 11:11 thorstenhater

Right! My point was that any flags passed overriding those defaults should be printed as such. The user should be able to know what the eventual config is.

brenthuisman avatar Nov 07 '22 11:11 brenthuisman

How's this PR hanging?

I just ran into another issue that might be addressed by a change here: if a-b-c and arbor become more tied, it would be awfully practical if users would not have to shell out to build mechanisms. i.e. arbor.mech_cat_from_dir(mydir). This would help in environments where users are confined to Jupiter interfaces. Also, certain environments seem not to propagate $PATH, which makes usage of a-b-c very impractical: https://gitlab.ebrains.eu/technical-coordination/project-internal/devops/platform/ebrains-spack-builds/-/issues/14

brenthuisman avatar Nov 23 '22 11:11 brenthuisman

Yeah, that's exactly one of the steps we outlined above: build-catalogue becomes part of the arbor module and can be used from with Python and/or C++.

thorstenhater avatar Nov 23 '22 11:11 thorstenhater