Problems following the "Packaging compiled projects" document

Open pfmoore opened this issue 2 months ago • 13 comments

I've just been pointed to this document, so I thought I'd give it a try. I'm rusty when it comes to writing C extensions for Python, but I'm familiar with packaging for Python modules, so I wanted to test out how easy it was to get a working wheel with newer build backends.

I hit the following problems:

  1. (Minor) The supplied pyproject.toml doesn't specify a version, so the build fails.
  2. (Minor) The documentation doesn't mention that you need a README.md file and a LICENSE file.
  3. The document doesn't explain that you'll be building a module called package._core. It's possible to find out that the module will be called _core by looking at the source (or in my case the other way round - once I worked out that I needed to import _core, I could see where that name was used in the code). But I don't see how to know what to change if I want to call my import module (as opposed to my distribution) something other than package. Doing a search for package in the source suggests that the project name is used for the import package name, which is not an unreasonable default, but it's not immediately obvious.
  4. When I tried the meson backend, the build failed with an error "Dist currently only works with Git or Mercurial repos". I haven't put my project under VCS, as it's just a test. A note clearly saying that meson needs you to put your project under VCS (and why - I got into a mess trying to patch over the error[^1] and ending up with critical files missing from the sdist) would have helped a lot here.
  5. Most importantly, the instructions don't say that your C compiler needs to be on your PATH. That may seem obvious, but for someone used to distutils (and setuptools), the expected behaviour is to automatically locate an installed copy of Visual Studio. And it's not unreasonable for someone to expect that when they install Visual C, it will be ready for use in a Python extension. This was particularly bad in my case, because the build backend found a copy of gcc that I'd installed ages ago, and used that rather than failing with an error saying "Can't find C compiler". The resulting build worked, but failed on import as it didn't have access to the C++ runtime DLLs. That's a particularly difficult failure to debug (I was lucky, I've seen it before a number of times, so the symptoms were familiar to me), so it would be really useful to have an explicit note in the document about it.
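
For item 5, one quick way to see which compilers a build is likely to pick up is to check PATH before building. This is a minimal sketch, assuming the usual executable names (`cl`, `gcc`, `g++`, `clang`, `ninja`); it is not what any build backend actually runs, just a sanity check you can do yourself:

```python
import shutil

# Executable names are assumptions for illustration; adjust for your toolchain.
CANDIDATES = ("cl", "gcc", "g++", "clang", "ninja")


def tools_on_path(names=CANDIDATES):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in tools_on_path().items():
        print(f"{name:6} {path or 'not on PATH'}")
```

If `gcc` or `g++` shows a path and `cl` does not, a build that silently uses GCC (as described above) is a plausible outcome.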

But overall, the process was very smooth. The guide is clear and easy to follow, and any questions I have seem to be mostly around details of the backends themselves - which isn't what this guide is trying to cover. I feel like there could be a little more coverage of the differences between scikit-build-core and meson (most notably that meson uses VCS metadata to decide what files to put in the sdist) to help people form an initial view of which they prefer, but that's just a "nice to have".

[^1]: Personally, I really dislike backends that rely on VCS to populate the sdist. That's not relevant for this document, but may explain why I didn't simply commit the right files to git first time.

pfmoore • Nov 28 '25 21:11

Also, the maturin example doesn't mention that the source code needs to be named src/lib.rs. Or that the built module will be imported as _core._core (which to be honest, strikes me as a weird name to use for the example...).

pfmoore • Nov 28 '25 21:11

  1. Fixed, thanks! This is one part that's not rendered from the cookiecutter, since I haven't implemented selecting a portion of a toml file.
  2. Only meson requires those. Scikit-build-core is friendlier for beginners, IMO.
  3. TODO
  4. Only meson requires actual git; scikit-build-core is more like hatchling, and will use the .gitignore if available, along with some defaults. In the future, there will also be a configuration for file selection. scikit-build-core is friendlier for beginners, IMO.
  5. Only meson requires the correct path, CMake can find MSVC; scikit-build-core is friendlier for beginners, IMO.
  6. (Filename) TODO

scikit-build-core is friendlier for beginners, IMO - but I'm just a little bit biased there. ;)

I've addressed the easiest ones in #711, I can try to address the others a bit more in a followup. Thanks for the experienced+beginner review!

henryiii • Nov 29 '25 20:11

Only meson requires those. Scikit-build-core is friendlier for beginners, IMO.

The pyproject.toml references them, so they should be present for all backends, even if the backend doesn't enforce it.

Only meson requires actual git

Yeah, I worked that out (and for me, it's a strong argument for scikit-build-core, as I don't like requiring git).

scikit-build-core is friendlier for beginners, IMO

I think I agree. The CMake syntax (and to be honest, the language as a whole) is less friendly, but a beginner is probably just going to copy the boilerplate and be happy. So that won't matter, at least in the short term.

CMake can find MSVC

I think I hit the problem with CMake, but my situation is unusual. I have MSVC, but I also have a copy of gcc and that is on my PATH. So I suspect what happened is that CMake preferred gcc, because it's on my path, but in fact gcc is the wrong choice, because it links in libstdc++, which isn't supplied with Python (on Windows). I'm happy to accept that's an "advanced" issue, but if there's a way to tell CMake "prefer MSVC if there's a choice", that might be worth adding to the example CMakeLists.txt, simply because the issue is hard for a beginner to debug if they hit it.

Thanks for the replies - and for the document!

On a somewhat related note, would you be willing to contribute this document to the packaging user guide? I think it would be a good fit there. I'd be willing to help with the process if you want.

pfmoore • Nov 29 '25 21:11

The pyproject.toml references them

Oh, it does, yes. :facepalm:

The CMake syntax (and to be honest, the language as a whole) is less friendly

Modern CMake isn't bad - the syntax does look a lot less modern (more bash-like, some familiarity there), but the target-based language works very similarly to Meson. It's just you also have the old directory-based language and globals too, and lots of copy-paste examples of that. My modern CMake tries to avoid bad examples. ;)

For more advanced uses, like a lot of scientific code, CMake really shines; meson doesn't let you define functions, it has opinions on the meanings of things like -system, etc. For a multi-million line codebase with over a thousand authors, most of whom are physicists that just program because they have to, being able to give them a custom function so they can just write 7 lines of CMake to add their algorithm is really, really nice.

To be clear, I do understand and appreciate Meson's approach; forcing the configuration to not hide what it's doing also has benefits. Meson does a great job avoiding bad practices that CMake still allows. Both CMake and Meson are better than manually building everything yourself (which is basically what setuptools forces you to do unless it's a very simple, single threaded build of a C only no dependency extension). Both are way, way better than autotools, too!

my situation is unusual

On GHA, CMake (and scikit-build-core) will find MSVC even though the runners also have GCC. Meson (and I think meson-python), by contrast, will find the GCC copy and require you to run something to set up the MSVC environment before it finds MSVC. That's my point of reference.

would you be willing to contribute this document to the packaging user guide

I wrote quite a bit of the current tutorial on pure Python there, so this should look a bit similar. My main concern is that I don't want to add new pages; there are already too many pages and many of them are unmaintained. I think https://packaging.python.org/en/latest/guides/packaging-binary-extensions/ should be updated - "last reviewed 2013" isn't quite accurate (I've updated some things on that page in the last few years), but that's what happens when you have too many pages. (At least it is not a page about specific things that will eventually stop being relevant, like the 2->3 transition, or one of the setuptools-specific pages.)

I'd be willing to work on this if I can get a reviewer to support it. IIRC, the main issue with updating that page in the past was that no one was very willing to review binary packaging updates.

henryiii • Dec 02 '25 15:12

On GHA, CMake (and scikit-build-core) will find MSVC even though the runners also have GCC.

Odd. Maybe I misremembered. Let me re-check and see if I can work out what's going on. I'll report back.

I think https://packaging.python.org/en/latest/guides/packaging-binary-extensions/ should be updated

Yes, that's a good place for it. IMO, the current page there is pretty bad - it spends a bunch of time discussing alternatives and approaches, but never actually gives any usable advice 🙁

I'd be willing to work on this if I can get a reviewer to support it

I'm happy to review it - it'll be more from the POV of an inexperienced user wanting to get started packaging up some C code, so I won't spot technical issues (unless they break things!). But IMO that should be a good start.

I'm not sure who has commit access on that repo. I'm not in the editors group (although I can see who is in that group, meaning I presumably have some sort of access via my PyPA admin status?) so we'd need to find someone willing to commit any changes. I don't know how active the editors are these days - that may be another part of the problem, TBH. But we can cross that bridge when we get to it...

pfmoore • Dec 02 '25 17:12

Or that the built module will be imported as _core._core

It seems to just be package._core? The tests in the template pass:

import package._core as m


def test_add():
    assert m.add(2, 3) == 5
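
If there is any doubt about the import name, the built wheel can be inspected directly, since a wheel is a zip file. A minimal sketch, assuming compiled modules end in `.pyd` or `.so` and sit at their final package path inside the wheel (the helper name `extension_import_names` is made up for this example):

```python
import zipfile
from pathlib import PurePosixPath

# File suffixes treated as compiled extension modules (an assumption for this sketch).
EXT_SUFFIXES = (".pyd", ".so")


def extension_import_names(wheel_path):
    """Return the dotted import names of compiled modules found in a wheel."""
    names = []
    with zipfile.ZipFile(wheel_path) as wheel:
        for entry in wheel.namelist():
            path = PurePosixPath(entry)
            if path.suffix not in EXT_SUFFIXES:
                continue
            # "_core.cp314-win_amd64.pyd" -> "_core": keep the part before the first dot.
            module = path.name.split(".", 1)[0]
            names.append(".".join((*path.parent.parts, module)))
    return sorted(names)
```

For a wheel containing `package/_core.cp314-win_amd64.pyd`, this reports `package._core`, which matches the import used in the test above.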

henryiii • Dec 03 '25 21:12

Again, I don't think that's what I saw, but I'll re-check.

pfmoore • Dec 03 '25 22:12

On GHA, CMake (and scikit-build-core) will find MSVC even though the runners also have GCC.

Odd. Maybe I misremembered. Let me re-check and see if I can work out what's going on. I'll report back.

Nope, it does pick up gcc rather than MSVC:

❯ uv run --with build python -m build
      Built package @ file:///C:/Work/Scratch/docs_testing
Installed 2 packages in 34ms
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
  - pybind11
  - scikit-build-core
* Getting build dependencies for sdist...
* Building sdist...
*** scikit-build-core 0.11.6 (sdist)
* Building wheel from sdist
* Creating isolated environment: venv+pip...
* Installing packages in isolated environment:
  - pybind11
  - scikit-build-core
* Getting build dependencies for wheel...
* Building wheel...
*** scikit-build-core 0.11.6 using CMake 4.2.0 (wheel)
*** Configuring CMake...
loading initial cache file C:\Users\Gustav\AppData\Local\Temp\tmpk4204vs6\build\CMakeInit.txt
-- Building for: Ninja
-- The CXX compiler identification is GNU 15.2.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: C:/Users/Gustav/scoop/apps/mingw-winlibs-ucrt/current/bin/c++.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Python: C:/Users/Gustav/AppData/Local/Temp/build-env-rz2lfjwt/Scripts/python.exe (found suitable version "3.14.0", minimum required is "3.8") found components: Interpreter Development.Module Development.Embed
-- pybind11::lto disabled (problems with undefined symbols for MinGW for now)
-- pybind11::thin_lto disabled (problems with undefined symbols for MinGW for now)
-- Found pybind11: C:/Users/Gustav/AppData/Local/Temp/build-env-rz2lfjwt/Lib/site-packages/pybind11/include (found version "3.0.1")
-- Configuring done (2.1s)
-- Generating done (0.0s)
-- Build files have been written to: C:/Users/Gustav/AppData/Local/Temp/tmpk4204vs6/build
*** Building project with Ninja...
[2/2] Linking CXX shared module _core.cp314-win_amd64.pyd
*** Installing project into wheel...
-- Install configuration: "Release"
-- Installing: C:\Users\Gustav\AppData\Local\Temp\tmpk4204vs6\wheel\platlib/package/_core.cp314-win_amd64.pyd
*** Making wheel...
*** Created package-0.1-cp314-cp314-win_amd64.whl
Successfully built package-0.1.tar.gz and package-0.1-cp314-cp314-win_amd64.whl

Note the line "The CXX compiler identification is GNU 15.2.0".
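
That line is easy to miss in a long log. As a rough sketch, a few lines of Python can pull the compiler identification out of captured build output; the regular expression assumes the message format shown in the log above, and `compiler_ids` is a name invented for this example:

```python
import re

# Matches lines such as "-- The CXX compiler identification is GNU 15.2.0",
# allowing leading indentation as added by some front ends.
COMPILER_ID = re.compile(
    r"^[ \t]*-- The (C|CXX) compiler identification is (.+)$", re.MULTILINE
)


def compiler_ids(log_text):
    """Return {language: identification} found in CMake configure output."""
    return {lang: ident.strip() for lang, ident in COMPILER_ID.findall(log_text)}
```

A result like `{"CXX": "GNU 15.2.0"}` on Windows is the warning sign here: the build used GCC rather than MSVC.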

And g++ is on my PATH:

❯ (gcm g++).source
C:\Users\Gustav\scoop\apps\mingw-winlibs-ucrt\current\bin\g++.exe
❯ $env:PATH -split ';'
...
C:\Users\Gustav\scoop\apps\mingw-winlibs-ucrt\current\bin
...

MSVC is not on my path:

❯ cl /?
cl: The term 'cl' is not recognized as a name of a cmdlet, function, script file, or executable program.
Check the spelling of the name, or if a path was included, verify that the path is correct and try again.

but it is installed:

❯ winget list | grep -i "Visual Studio"
Microsoft Visual Studio Installer          ARP\Machine\X64\{6F320B93-EE3C-4826-85E0-… 3.12.2320.19252
Visual Studio Professional 2019            Microsoft.VisualStudio.2019.Professional   16.11.10         16.11.53  winget
Visual Studio Community 2022               Microsoft.VisualStudio.2022.Community      < 17.14.6        17.14.21  winget
Microsoft Visual Studio Code (User)        Microsoft.VisualStudioCode                 1.106.1          1.106.3   winget

Again, I don't think that's what I saw, but I'll re-check.

I can't reproduce the issue of having to import _core._core. I must have been doing something wrong in my testing. Sorry about that.

(On an unrelated note, uv installs the package in editable mode, and meson's editable mechanism fails horribly. It seems to be trying to rebuild the package, for some reason - I have no idea why as I've not changed anything. I guess that's just one more reason to avoid meson... Scikit-build's editable mode seems to work fine.)

pfmoore • Dec 03 '25 23:12

If you run cmake --help, what generator has a * on it? We use that one currently. Scikit-build (classic) inspected the registry for MSVC installs, we don't do that in scikit-build-core, instead taking CMake's default selection, which is usually good at finding MSVC. I thought CMake did that too, but maybe it does require it on the path, just not fully set up via vcvarsall.bat, while meson-python requires full setup?

henryiii • Dec 04 '25 15:12

If you run cmake --help, what generator has a * on it?

Ninja.

If I remove the directory containing gcc from my PATH, I get "Visual Studio 17 2022". So to that extent, I guess it is down to my unusual environment. Sigh, maybe it's time for me to set up some sort of "pick a C compiler" shell script...
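
The same check can be scripted. This is a small sketch that reads the default generator from `cmake --help`, assuming the generator list marks the default with a leading `*` and separates name and description with `=`; the function names are made up for this example:

```python
import re
import subprocess


def default_generator(help_text):
    """Return the generator marked with '*' in `cmake --help` output, or None."""
    for line in help_text.splitlines():
        match = re.match(r"^\*\s+(.+?)\s*=", line)
        if match:
            return match.group(1)
    return None


def cmake_default_generator():
    """Run `cmake --help` (cmake must be on PATH) and report its default generator."""
    result = subprocess.run(
        ["cmake", "--help"], capture_output=True, text=True, check=True
    )
    return default_generator(result.stdout)
```

Running `cmake_default_generator()` before and after changing PATH shows the switch between "Ninja" and "Visual Studio 17 2022" described above.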

Also, does that imply that I need my own copy of CMake installed for scikit-build-core? Because cmake is on PyPI, I assumed the build backend would automatically install a fresh copy for me.

As far as the default is concerned, how do I change that? The only option I can see is to set a CMAKE_GENERATOR environment variable, which feels a bit clunky to me (and not ideal for a beginner to have to set up config like that). Given that the standard python.org builds of Python use MSVC, and that's the officially supported compiler on Windows, I think the documentation (and the build system) should prioritise that. This isn't the right place to discuss scikit-build-core's behaviour, but I do think the documentation should lead the user to reliably get MSVC based builds.
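
For reference, the environment-variable route can at least be kept out of the shell profile by setting it only for the build process. A hedged sketch: `build_environment` is a hypothetical helper, and the generator name is hardcoded to the one reported above, so it would need changing for other Visual Studio versions:

```python
import os
import sys


def build_environment(generator="Visual Studio 17 2022", platform=sys.platform, base=None):
    """Return a copy of the environment with CMAKE_GENERATOR set on Windows.

    An existing CMAKE_GENERATOR value is left alone so a user's own choice wins.
    """
    env = dict(os.environ if base is None else base)
    if platform == "win32":
        env.setdefault("CMAKE_GENERATOR", generator)
    return env
```

It could then be passed to the build, for example `subprocess.run([sys.executable, "-m", "build"], env=build_environment(), check=True)`. That is still clunky for a beginner, which is the point being made here.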

My preference would be something like this:

In order to build a C/C++ extension, you need a C compiler. On Unix, (something here that I can't comment on). On Windows, Visual Studio is the officially supported compiler, which is available as a free download from Microsoft. You can use other compilers, with care, but if you are just starting out, you should stick with Visual Studio. To ensure the build backend locates the correct C compiler (if you happen to have more than one installed) you should ...

The simplest thing to put after "you should" is "ensure that your preferred compiler is on your PATH", with a note that for Visual Studio this means using a MSVC developer prompt, or run the vcvarsall script. But I personally find using the developer prompt frustrating, because it means starting a new terminal window, and hence losing all the context I have in my main development window[^1]. So advice on how to set the preference without needing a new window would be a definite advantage. But I accept that it's somewhat secondary to the main purpose of the document, which is to get people up and running.

[^1]: To the extent that needing to do that counts as a significant downside when choosing a build backend.

pfmoore • Dec 04 '25 16:12

Ninja

Ah, that explains it. I believe if ninja wasn't on your path, MSVC would be used. Ninja supports both MSVC and GCC, but it needs setup for MSVC. If you remove ninja, I think it will pick up MSVC. There's a chance the copy of ninja distributed with MSVC is on the path, though. That's a CMake default, not sure I like it.

You can enforce MSVC in scikit-build-core:

[[tool.scikit-build.overrides]]
if.platform-system = "win32"
cmake.args = ["-GVisual Studio 17 2022"]

(Untested) I don't think there's a way to avoid hardcoding the VS version there, though. I've avoided adding yet another configuration option for generator, but this might be a reason to look into options here. If there's a way to get the ninja generator to select MSVC if installed, that would be best.

Hmm, I wonder if this works, actually? It might still require vcvarsall.bat, but maybe not?

[[tool.scikit-build.overrides]]
if.platform-system = "win32"
cmake.define.CMAKE_C_COMPILER = "cl.exe"
cmake.define.CMAKE_CXX_COMPILER = "cl.exe"

does that imply that I need my own copy of CMake

No, external copies are only used if they satisfy your version requirements (read from your CMakeLists, or 3.15+ by default, configurable); it will request cmake from PyPI otherwise. On platforms without wheels, it's important to be able to pick up an external CMake. Same goes for ninja on non-Windows platforms. Meson-python does the same thing with ninja (probably the largest bit of work I've contributed to meson-python, actually).
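
The decision described here can be illustrated roughly as follows. This is not scikit-build-core's actual code, just a sketch of the rule under the assumption that versions are plain dotted numbers; both function names are invented for the example:

```python
def parse_version(text):
    """Turn a plain dotted version such as "3.15" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def needs_cmake_from_pypi(found_version, minimum="3.15"):
    """Decide whether cmake should be requested from PyPI.

    found_version is the version of the external cmake, or None if none was found.
    """
    if found_version is None:
        return True
    return parse_version(found_version) < parse_version(minimum)
```

With the CMake 4.2.0 seen in the logs above, `needs_cmake_from_pypi("4.2.0")` is false, so the external copy would be used.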

henryiii • Dec 04 '25 19:12

By the way, scikit-build-core also has a getting started page, where the tabs cover a collection of common binding methods instead: https://scikit-build-core.readthedocs.io/en/latest/guide/getting_started.html

henryiii • Dec 04 '25 20:12

Ah, that explains it. I believe if ninja wasn't on your path, MSVC would be used.

Ah, I didn't even know I had ninja installed. Looks like it comes with the distribution of gcc I'm using. As you may have worked out by now, I'm a bit of a "dabbler" when it comes to C build systems. I don't do much C, and every time I do, I get frustrated at having to do my compiles by hand, so I look at build systems and end up hating them all pretty much equally. Same with compilers (I have all of MSVC, gcc and clang installed in one place or another). But the only compiler I think of as "default" is MSVC, because it's the only one that's properly installed as a Windows application (the others are all just binaries, unzipped and added to my PATH at some point).

Hmm, I wonder if this works, actually? It might still require vcvarsall.bat, but maybe not?

Nope, it definitely needs cl.exe to be on PATH (i.e., it needs vcvarsall), and if I'm messing with PATH, removing gcc/ninja is probably just as easy.

❯ uv run --with build python -m build
Using CPython 3.14.0 interpreter at: C:\Users\Gustav\AppData\Local\Python\pythoncore-3.14-64\python.exe
Creating virtual environment at: .venv
  × Failed to build `package @ file:///C:/Work/Scratch/docs_testing/skb`
  ├─▶ The build backend returned an error
  ╰─▶ Call to `scikit_build_core.build.build_editable` failed (exit code: 1)

      [stdout]
      *** scikit-build-core 0.11.6 using CMake 4.2.0 (editable)
      *** Configuring CMake...
      loading initial cache file C:\Users\Gustav\AppData\Local\Temp\tmpxyh2nml9\build\CMakeInit.txt
      -- Building for: Ninja
      -- The CXX compiler identification is unknown
      -- Configuring incomplete, errors occurred!

      [stderr]
      CMake Error at CMakeLists.txt:2 (project):
        The CMAKE_CXX_COMPILER:

          cl.exe

        is not a full path and was not found in the PATH.  Perhaps the extension is
        missing?

        Tell CMake where to find the compiler by setting either the environment
        variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
        to the compiler, or to the compiler name if it is in the PATH.



      *** CMake configuration failed

      hint: This usually indicates a problem with the package or the build environment.

Overriding cmake.args does work without vcvarsall, though.

By the way, scikit-build-core also has a getting started page, where the tabs cover a collection of common binding methods

Nice!

pfmoore • Dec 04 '25 20:12