java-smt icon indicating copy to clipboard operation
java-smt copied to clipboard

Allow for multiple different CPU architectures to be used

Open baierd opened this issue 4 years ago • 13 comments

Since we have a working MathSAT5 binary for ARM, we should incorporate the possibility to use this automatically. This means that our build systems either need to detected the architecture and only pull the binary for the used architecture, or find another solution, like for example check in JavaSMT while all binaries are available for all architectures.

baierd avatar Apr 28 '21 09:04 baierd

Pro renaming and downloading all binaries:

CPAchecker uses different systems (VCloud, Ubuntu w 64bit, Raspberry Pi etc.).
Clear and easy to understand for users.

Contra renaming:

User has increased memory consumption because of the unused binaries.

Unknown at this time:

Scaling of new binaries? How many archs will be supported? How many new files will be added per arch?

Pro Ivy-Configuration:

Easiest for user

Contra Ivy-Configuration:

Will most likely become difficult to resolve what arch needs what binary.
Could be very chaotic very fast and therefore difficult to extend/change.

baierd avatar Apr 28 '21 09:04 baierd

JavaSMT comes with SMT solvers for several architectures, including Linux (x86_64) and Windows (x86_64). With the latest MathSAT5 version, we also could add support for Raspi4 (armv8/aarch64).

Problem

The naming of libmathsat5j.so for Linux is not sufficient, because both architectures, x86_64 and armv8, use the same naming scheme. It is not possible to ship one application providing the possibility to run on both architectures. In case of CPAchecker, the native solver binaries are stored in lib/runtime/ and there is currently no possibility to differ between different architectures.

Up to now, the problem was not critical, because the operating system was different and we used the file ending (so/dll/dylib) for different files. This problem also affects other native solvers, such as Z3, maybe not now, but in the future, as soon as they support armv8 or other upcoming CPU architectures.

There are two main ideas how to solve this:

Solution 1

We could encode the intended architecture in the filename of the library and provide a libmathsat5j-armv8.so, for example.

Pro: This would be relatively simple to implement.

Cons: All binaries for all supported architectures are installed at the user's system, e.g., also dll-files would be available on a Raspi. This might be only a few MB for JavaSMT, but increases with every new supported architecture.

Solution 2

We could provide IVY configurations for the JavaSMT library, depending on the architecture. The user can then choose which version with which SMT solver should be installed. Currently we only provide configurations depending on solvers, such as runtime-mathsat and runtime-smtinterpol. We could extend the list with all usefull combinations from -WINDOWS64, -LINUXx64_86, -LINUXarmv8, etc.

Pro: The user has full control over the installed components, i.e., can combine solver and architecture.

Cons: The IVY configurations are perhaps ugly to maintain. For CPAchecker, this causes additional effort, if we want to have one version of CPAchecker (provided by a user/developer/student) running on all workers in the VCloud, including x86_64 and armv8 in parallel. Additionally, CPAchecker would need to keep the binaries for architectures in distinct directories.

Solution 1 and 2 can also be partially combined, such that we encode the architecture in the filename and we provide an IVY configuration per solver and architecture plus meta configurations per solver (all architectures).

kfriedberger avatar Apr 29 '21 13:04 kfriedberger

The desire to have one application in one directory and be able to execute it directly from this directory on different architectures seems like a niche use case that is not common and is also not supported by most other applications anyway. For example, for every normal binary application, one cannot do this. So while it might be nice to support this if possible without too much effort, I would suggest to not give major weight to it when deciding on a solution, if it would produce disadvantages for the standard use cases. The only requirement that would follow from this is that we provide an opportunity for the user to overwrite any (automatically detected) architecture, such that one can download/build/package an application for a different architecture than the current one. This can be done with an ant property for example.

Furthermore, our way of loading libraries does support (and always has) multiple architectures quite fine, even if one wants to have all binaries for all possible architectures installed at the same time: Before looking for a library file in the same directory as the JAR for Sosy-Lab Common, it looks in a directory named native/$arch-$os (the code in NativeLibraries does not yet support ARM, but of course this can be trivially added).

So the problem is not in supporting this in JavaSMT, nor is there a problem for users who manually put the files in the necessary location, the problem is only because we want to provide a ready-to-use Ivy-based solution that does the right thing automatically for users.

This means for example that there are more possible solutions, like making the build system putting the files in the correct location where the code already looks for them.

Regarding the solutions proposed above:

I would not recommend to invent new non-standard naming schemes for libraries. This would be inconsistent with older JavaSMT versions and with other applications, probably forces other JavaSMT users that do not use Ivy to also rename everything, and makes it more difficult to use JavaSMT with system-wide installed libraries (which is currently possible).

The Linux standard is also to put different binaries for different architectures into different directories, for example when using a mixture of 32-bit and 64-bit packages.

I don't fully understand what is proposed regarding the addition of Ivy configurations, but I think it needs to be clear that downloading and building JavaSMT-based applications should work out of the box by just running the build tool, e.g., ant, without having to manually edit Ivy configurations to select the desired architecture.

But to discuss this further I think it would be good to have more concrete proposals of how the solutions would look like and work in practice.

PhilippWendler avatar Apr 29 '21 14:04 PhilippWendler

Use Case 1:

A developer on Windows uses CPAchecker and then want to use the VCloud for benchmarking. He will be happy, if no changes are required. The same might apply for Raspis, as soon as there are more of them. Thus, I would not see this as a niche in our group with lots of students.

Use Case 2:

JavaSMT might want to publish the MathSAT bindings for ARMv8 into Ivy. There is no architecture flag for libraries. Thus commiting it is not possible. How do we want to solve this? Renaming would help here. The renaming would not appear for often used architectures (currently: x86_64), but only for new ones. We can also just rename the dependency itself, such that there is a configuration depending on runtime-mathsat and another configuration depending on runtime-armv8-mathsat. Installing both dependencies in CPAchecker will then not work, but the user could choose there.

In case of renaming, of course, the loading process needs to look for the right library, e.g., by directory and by name. For CPAchecker, the decision was made years ago to put all transitive dependency libraries into lib/java/runtime/, which worked nicely since. But is that decision future-proof?

kfriedberger avatar Apr 29 '21 18:04 kfriedberger

A developer on Windows uses CPAchecker and then want to use the VCloud for benchmarking. He will be happy, if no changes are required. The same might apply for Raspis, as soon as there are more of them. Thus, I would not see this as a niche in our group with lots of students.

  • Most of these developers will use the VerifierCloud service that builds CPAchecker for them.
  • As mentioned, no other application will support this use case as well. So as soon as they want to compare CPAchecker against another tool, they have to be prepared to work with such a situation anyway.
  • I did say that working with e.g. the Linux binaries on a Windows system needs to be possible. But it could require a different directory, just like when cross-compiling.

The renaming would not appear for often used architectures (currently: x86_64), but only for new ones.

This would make the naming scheme inconsistent and confusing.

There is no architecture flag for libraries.

Are you sure? First, there is the type attribute for artifacts. Second, according to the documentation it seems possible to define arbitrary additional attributes.

Installing both dependencies in CPAchecker will then not work, but the user could choose there.

As mentioned, I would not consider it acceptable for CPAchecker that users on certain architectures have to manually fiddle around with some config file just in order to make it work. Whatever way is chosen, the correct architecture needs to be determined automatically (with possibility of manual override).

For CPAchecker, the decision was made years ago to put all transitive dependency libraries into lib/runtime/, which worked nicely since.

No, this decision was never made. CPAchecker has always searched for its binary libraries in lib/native/$arch-$os/. Several years ago, an additional fallback was added to also look for binary libraries in the same directory where JAR dependencies are.


Important note: To me it seems that there are two separate questions here: How to publish the artifacts in Ivy and how to put them on the user's disk. So far in the solutions written above this seems completely mixed, but it should not be.

PhilippWendler avatar Apr 30 '21 07:04 PhilippWendler

The issue contains the publication via Ivy and also the retrieval via Ivy, but those topics are related and need both be considered. For the publication via Ivy, the hint on extra attributes 1 is good and will considered. So that part of the issue seems already to be solved in theory.

For JavaSMT (developement), the libraries are currently located in lib/java/runtime-$solver/, without mentioning $arch and $os. This, there might be identical filenames in the same directory. Should we change this to lib/java/runtime-$solver-$arch-$os/ to be unique? This change would not effect the user and can be automatically done by the build process.

For CPAchecker, all libraries are currently automatically stored under lib/java/runtime. Should we change the build process to somehow store the native binaries in lib/native/$arch-$os/? Would that be the better way? This would require that the build process of CPAchecker knows about JavaSMT, native libraries and architecture. Would the build process then download libraries for all architectures by default or choose automatically which ones are required, depending on the current system?

kfriedberger avatar Apr 30 '21 08:04 kfriedberger

For CPAchecker, all libraries are currently automatically stored under lib/java/runtime. Should we change the build process to somehow store the native binaries in lib/native/$arch-$os/? Would that be the better way? This would require that the build process of CPAchecker knows about JavaSMT, native libraries and architecture.

As mentioned, CPAchecker already uses lib/native, just not for Ivy. And CPAchecker's build process already knows about JavaSMT, otherwise it could not download it.

I would see the question more as: How difficult is this and how much complexity would such a solution force JavaSMT users to add to their build system?

Would the build process then download libraries for all architectures by default or choose automatically which ones are required, depending on the current system?

One could provide both options? Or even have something like ant install-contrib where a one-time command enables downloading of the respective artifacts for the future.

PhilippWendler avatar Apr 30 '21 13:04 PhilippWendler

CPAchecker does not use lib/native for transitive dependencies like MathSAT5 or Z3, but only direct dependencies of CPAchecker itself or user-provided libraries. CPAchecker only knows about JavaSMT, not about its dependencies. Thus, all transitive dependencies are stored in lib/java/runtime, including solver binaries.

When it comes to complexity, the simplest solution (from a user's point of view) is the encoding of the architecture in the name of the library, then let JavaSMT internally try different namings (in all the directories provided by NativeLibraries), and store all transitive dependencies in lib/java/runtime (as it is done now).

Currently, Ivy does not distinguish between different architectures. One possibility is a direct check for the available system and then act correspondingly and download ony a subset of the solver binaries. Do we want to have such code in the build-scripts? Otherwise Ivy could simply download all of them and store them at correct collisionfree locations.

kfriedberger avatar Apr 30 '21 13:04 kfriedberger

One possibility is a direct check for the available system and then act correspondingly and download ony a subset of the solver binaries. Do we want to have such code in the build-scripts?

Depends on how complex that would be, right?

PhilippWendler avatar Apr 30 '21 13:04 PhilippWendler

Without testing it, I could imagine something like: set a property $arch and then use it as part of the dependency management. Should not be that complex.

This however, might have consequences for other build steps, e.g., ant dist should provide a Windows and Linux (and ARM?) release. And for benchmarking in the VCloud, one needs to override this property with the targeted worker architecture.

kfriedberger avatar Apr 30 '21 14:04 kfriedberger

Hi guys, I just ran into the issue of running Z3 via JavaSMT on a new Mac (M2 ARM processor). We use maven to get the Z3 dependencies, but this ends up getting only the x86_64 binaries:

<artifactItem>
  <groupId>org.sosy-lab</groupId>
  <artifactId>javasmt-solver-z3</artifactId>
  <type>dylib</type>
  <classifier>libz3</classifier>
  <destFileName>libz3.dylib</destFileName>
</artifactItem>

Getting the correct binaries directly from Z3's repo, and replacing the one's downloaded via maven works: JavaSMT with Z3 runs fine. maven cleanup will re-download the x86 binaries though, so that solution is not a permanent one. Can I somehow change the maven config to get the ARM binaries? And if not, are there reasonable workarounds? I thought of placing the correct binaries in Library/Java/Extension (a default location for loading binaries) and making sure that this path is checked first for the binaries. I would like to avoid this, if there is a better and more automatic way of doing it though. Any ideas?

ThomasHaas avatar Apr 19 '23 15:04 ThomasHaas

Hi @ThomasHaas , our knowledge about Macs is limited, because we do not have such advanced technology :-) Well, for your case, there are several options:

  • we can provide more releases of Z3, somehow matching the architecture. (We would need to take a look if everything is also named "dylib" for ARM-based MacOS and how we can name/identify the artifact.)
  • you can remove the Maven dependency for Z3, provide your wanted Z3 files somewhere else (in Library/Java/Extension), and then use a LibraryLoader to access it. See here for documentation: SolverContextFactory. A LibraryLoader is a function that takes a string argument (which is the library name to be loaded) and loads one or more libraries.

kfriedberger avatar Apr 19 '23 19:04 kfriedberger

I did the following workaround for now:

  • I added a maven profile for MacOS-aarch64 that skips downloading the Z3 x86 binaries.
  • I manually downloaded and placed the correct binaries in Library/Java/Extension. LibraryLoader is not necessary, because it is a default location.

we can provide more releases of Z3, somehow matching the architecture. (We would need to take a look if everything is also named "dylib" for ARM-based MacOS and how we can name/identify the artifact.)

The Z3 binaries for all architectures have identical names. If there were alternative arm versions in the maven repository, we could easily download the correct one (and even rename it if needed) using maven profiles.

ThomasHaas avatar Apr 21 '23 10:04 ThomasHaas