
Get module load time down

Open cjdoris opened this issue 4 years ago • 3 comments

Currently about 7 seconds, 80% compilation

cjdoris avatar Oct 02 '21 08:10 cjdoris

Now down to 3.5 seconds, 70% compilation, after switching to a custom meta format instead of JSON.

cjdoris avatar Oct 19 '21 18:10 cjdoris

Now 2.5 seconds, 80% compilation, I think from lowering the optimisation level in CondaPkg and MicroMamba.

cjdoris avatar Jan 24 '22 19:01 cjdoris

I get:

julia> @time using PythonCall
  6.009497 seconds (4.98 M allocations: 296.350 MiB, 1.55% gc time, 89.52% compilation time)

That's with the package, CondaPkg, and MicroMamba all at their latest versions (I didn't check all the dependencies). Is your machine simply that much faster?

I get down to 1 sec with --compile=min (I know you don't want that, understandable).
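
For reference, that measurement can be reproduced from the command line (a sketch; --compile=min mostly runs the interpreter instead of generating native code, which is why it is not a general fix):

$ julia --compile=min -e '@time using PythonCall'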

PallHaraldsson avatar Feb 18 '22 00:02 PallHaraldsson

This is what I get with 1.8.3 on linux:

julia> @time using PythonCall
  8.890879 seconds (7.62 M allocations: 441.547 MiB, 5.14% gc time, 88.10% compilation time: 72% of which was recompilation)

Not a blocker for transitioning Plots' matplotlib backend to PythonPlot in https://github.com/JuliaPlots/Plots.jl/pull/4542 (see https://github.com/JuliaPlots/Plots.jl/pull/4542#issuecomment-1326595934), but reducing the load time to reach parity with PyPlot / PyCall would be nice, since I intend to make the pythonplot backend supersede the pyplot one.

t-bltg avatar Nov 25 '22 12:11 t-bltg

I'm not at the computer, but I think for me the import time is currently about 3.5 seconds - so it's gone up, but apparently still faster than on your machine. I'm on Windows with an SSD; maybe that's the difference?

Load time is low on my personal priority list, but I investigate it now and then. Help investigating the slow bits would be appreciated.
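
For anyone wanting to dig in, a low-effort starting point is the per-dependency breakdown from the standard library (a sketch; @time_imports needs Julia 1.8+):

julia> @time_imports using PythonCall   # per-package load times and compilation percentages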

cjdoris avatar Dec 01 '22 16:12 cjdoris

linux, ssd too:

julia> versioninfo()
Julia Version 1.8.3
Commit 0434deb161e (2022-11-14 20:14 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 32 × Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, sandybridge)
  Threads: 1 on 32 virtual cores

t-bltg avatar Dec 01 '22 16:12 t-bltg

Wait, I think the previous timings were wrong: I might have had JULIA_PYTHONCALL_EXE=@PyCall set in my environment :/
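
For context, that variable is read when PythonCall is loaded, so it has to be set before the import, e.g. (a sketch):

julia> ENV["JULIA_PYTHONCALL_EXE"] = "@PyCall"   # reuse PyCall's Python; must be set before `using PythonCall`
julia> @time using PythonCall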

With JULIA_PYTHONCALL_EXE=@PyCall:

julia> @time @time_imports using PythonCall
     53.7 ms  MacroTools
      0.2 ms  DataValueInterfaces
      1.7 ms  DataAPI
      0.1 ms  IteratorInterfaceExtensions
      0.1 ms  TableTraits
     11.6 ms  OrderedCollections
     21.6 ms  Tables
      0.4 ms  Requires
      0.3 ms  Scratch
     35.7 ms  Preferences
      0.5 ms  JLLWrappers
      5.5 ms  micromamba_jll 90.29% compilation time
      0.3 ms  MicroMamba
      0.1 ms  SnoopPrecompile
     87.6 ms  Parsers 12.07% compilation time
     14.6 ms  StructTypes
     83.4 ms  JSON3
      0.4 ms  Pidfile
     21.6 ms  CondaPkg 87.25% compilation time
     14.9 ms  UnsafePointers
     19.9 ms  VersionParsing
     24.7 ms  JSON
      9.0 ms  Conda
    945.2 ms  PyCall 78.96% compilation time (6% recompilation)
   8548.2 ms  PythonCall 92.25% compilation time (74% recompilation)
  9.182004 seconds (8.43 M allocations: 482.731 MiB, 5.13% gc time, 88.68% compilation time: 71% of which was recompilation)

With JULIA_PYTHONCALL_EXE=[...]/bin/python, the result is closer to your timings (my Xeon CPU is not recent):

julia> @time @time_imports using PythonCall
     42.9 ms  MacroTools
      0.1 ms  DataValueInterfaces
      1.4 ms  DataAPI
      0.1 ms  IteratorInterfaceExtensions
      0.1 ms  TableTraits
      9.2 ms  OrderedCollections
     16.8 ms  Tables
      0.3 ms  Requires
      0.2 ms  Scratch
     27.9 ms  Preferences
      0.4 ms  JLLWrappers
      4.0 ms  micromamba_jll 87.60% compilation time
      0.3 ms  MicroMamba
      0.1 ms  SnoopPrecompile
     90.1 ms  Parsers 12.72% compilation time
     14.4 ms  StructTypes
     83.8 ms  JSON3
      0.5 ms  Pidfile
     21.7 ms  CondaPkg 87.59% compilation time
     14.9 ms  UnsafePointers
   3341.8 ms  PythonCall 85.17% compilation time (<1% recompilation)
  3.905634 seconds (3.16 M allocations: 210.015 MiB, 2.20% gc time, 78.55% compilation time: 0% of which was recompilation)

This is a bit peculiar because the numbers do not simply add up, since the impact of loading PyCall on its own is only around 1.5s (all measurements made in a fresh session, of course):

julia> @time @time_imports using PyCall
     23.0 ms  VersionParsing
      0.2 ms  SnoopPrecompile
    105.4 ms  Parsers 12.46% compilation time
     22.7 ms  JSON
      8.2 ms  Conda
     38.6 ms  MacroTools
    894.5 ms  PyCall 80.50% compilation time
  1.360677 seconds (1.69 M allocations: 127.477 MiB, 3.09% gc time, 70.89% compilation time: 1% of which was recompilation)

We would expect about 3.9s + 1.4s ~= 5.3s, not 9.2s. Maybe some invalidations are occurring?
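
One way to test the invalidation hypothesis would be SnoopCompile (a sketch, not something used in this thread so far), run in the same configuration:

julia> using SnoopCompileCore
julia> invs = @snoopr using PythonCall;   # record invalidations triggered while loading
julia> using SnoopCompile
julia> invalidation_trees(invs)           # show which methods were invalidated, and by what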

t-bltg avatar Dec 01 '22 16:12 t-bltg

Ah interesting. Maybe invalidations yeah, not sure what though.

Or maybe it's Requires.jl, which is used to check compatibility with PyCall when it's loaded - but that check probably requires compilation when it runs.
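
For reference, the Requires.jl mechanism looks roughly like this (a sketch of the general pattern, not PythonCall's exact source); the glue expression only runs, and hence only gets compiled, at the moment PyCall is loaded into the session:

using Requires   # provides @require

function __init__()
    # init_pycall(PyCall) runs (and gets compiled) only once PyCall is actually loaded
    @require PyCall="438e738f-606a-5dbb-bf0a-cddfbfd45ab0" init_pycall(PyCall)
end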

cjdoris avatar Dec 01 '22 17:12 cjdoris

Maybe Requires, yes, since we also use it in Plots to load glue code for each backend.

The following is disappointing: without setting JULIA_PYTHONCALL_EXE=@PyCall, the timings in https://github.com/JuliaPlots/Plots.jl/pull/4542#issuecomment-1326595934 remain very similar, with a ~6s overhead for pythonplot w.r.t. pyplot.

I'll try to investigate.

EDIT: Ok, so PyCall is still loaded, and that is weird, because it shouldn't be.

julia> using Plots
julia> @time pythonplot()
     75.2 ms  MacroTools
      0.2 ms  DataValueInterfaces
      0.2 ms  IteratorInterfaceExtensions
      0.1 ms  TableTraits
     37.8 ms  Tables
      0.5 ms  micromamba_jll
      0.3 ms  MicroMamba
     23.0 ms  StructTypes
    161.1 ms  JSON3
      0.5 ms  Pidfile
     83.3 ms  CondaPkg 96.11% compilation time (78% recompilation)
     25.9 ms  UnsafePointers
   4636.6 ms  PythonCall 84.21% compilation time (32% recompilation)
     17.3 ms  VersionParsing
      9.5 ms  Conda
    979.3 ms  PyCall 70.39% compilation time (6% recompilation)
      1.5 ms  LaTeXStrings
   4092.8 ms  PythonPlot 77.86% compilation time (36% recompilation)
 22.286516 seconds (23.90 M allocations: 1.338 GiB, 3.07% gc time, 86.22% compilation time: 43% of which was recompilation)

Narrowed down to PythonPlot loading PyCall (but again, why?):

julia> @time @time_imports using PythonPlot
     49.9 ms  MacroTools
      0.2 ms  DataValueInterfaces
      2.3 ms  DataAPI
      0.1 ms  IteratorInterfaceExtensions
      0.1 ms  TableTraits
     23.1 ms  Tables
      0.4 ms  Requires
      0.3 ms  Scratch
     35.7 ms  Preferences
      0.5 ms  JLLWrappers
      4.8 ms  micromamba_jll 88.75% compilation time
      0.3 ms  MicroMamba
      0.1 ms  SnoopPrecompile
     93.4 ms  Parsers 11.61% compilation time
     16.2 ms  StructTypes
    233.8 ms  JSON3
      0.5 ms  Pidfile
     22.0 ms  CondaPkg 86.64% compilation time
     17.2 ms  UnsafePointers
   3400.1 ms  PythonCall 86.29% compilation time (<1% recompilation)
     15.6 ms  VersionParsing
     21.9 ms  JSON
      8.7 ms  Conda
    880.6 ms  PyCall 77.36% compilation time (6% recompilation)
     94.8 ms  FixedPointNumbers
      0.2 ms  Reexport
    118.4 ms  ColorTypes 6.93% compilation time
    366.3 ms  Colors
      1.1 ms  LaTeXStrings
   4195.4 ms  PythonPlot 79.97% compilation time (5% recompilation)
 13.508075 seconds (13.46 M allocations: 753.768 MiB, 5.51% gc time, 80.11% compilation time: 34% of which was recompilation)

t-bltg avatar Dec 01 '22 17:12 t-bltg

A question: why would you want to @require PyCall="438e738f-606a-5dbb-bf0a-cddfbfd45ab0" init_pycall(PyCall) if JULIA_PYTHONCALL_EXE is explicitly set by the user?

t-bltg avatar Dec 01 '22 17:12 t-bltg

My bad, probably a stale precompilation cache:

julia> using PythonCall, PythonPlot
julia> rm.(Base.find_all_in_cache_path(Base.module_keys[PythonCall]));
julia> rm.(Base.find_all_in_cache_path(Base.module_keys[PythonPlot]));
# restart
julia> using PythonPlot  # precompiles
# restart
julia> using Plots
julia> @time pythonplot();
 16.612395 seconds (15.54 M allocations: 944.050 MiB, 2.86% gc time, 84.67% compilation time: 31% of which was recompilation)

Parity with pyplot (~16.1s) :tada:

t-bltg avatar Dec 01 '22 17:12 t-bltg

I just downloaded a Julia 1.10 master build, but it's a "2 days old master". I think it may be too old to test; I believe an important recently merged PR is missing from my build. I'll test again with the next updated master, and hopefully the problem will then be fixed to a large degree, making this less urgent to work on.

Anyway I get 5.977690 seconds on my (somewhat loaded) machine.

PallHaraldsson avatar Dec 02 '22 17:12 PallHaraldsson

Getting better, down to:

julia> @time using PythonCall
  2.297766 seconds (1.46 M allocations: 91.242 MiB, 5.38% gc time, 1.60% compilation time)

You can get 1.81 seconds with -O1, which I think is OK to set at the module level, or even lower (1.332 seconds) with --compile=min, though you likely don't want to set that at the module level.
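
For the module-level option, the standard mechanism is Base.Experimental.@optlevel (a sketch; whether it is appropriate for PythonCall itself is a separate question):

module SomeHeavyModule
# compile this module's own code at -O1 regardless of the global flag,
# trading runtime speed of that code for a faster load
Base.Experimental.@optlevel 1
# ... module contents ...
end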

julia> @time_imports using PythonCall  # these timings always show lower; assuming they are correct, the total includes time for the dependencies. Note the run below was with julia -O1, so it is lower still:
     17.8 ms  MacroTools
      0.2 ms  DataValueInterfaces
      1.3 ms  DataAPI
      0.2 ms  IteratorInterfaceExtensions
      0.1 ms  TableTraits
      9.8 ms  OrderedCollections
     39.6 ms  Tables
      0.4 ms  Requires
      0.3 ms  Scratch
     17.4 ms  Preferences
      0.4 ms  JLLWrappers
      2.6 ms  micromamba_jll 83.14% compilation time
      0.3 ms  MicroMamba
      0.2 ms  SnoopPrecompile
     77.5 ms  Parsers
     10.5 ms  StructTypes
     77.3 ms  JSON3
      0.4 ms  Pidfile
     18.6 ms  CondaPkg 33.29% compilation time
     16.1 ms  UnsafePointers
   1396.3 ms  PythonCall

Would adding the weakdeps feature to Project.toml possibly help?
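
For reference, the weak-dependency mechanism (Julia 1.9+) would look roughly like this; the extension name here is made up for illustration:

# Project.toml (sketch)
[weakdeps]
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"

[extensions]
PythonCallPyCallExt = "PyCall"

# ext/PythonCallPyCallExt.jl (sketch)
module PythonCallPyCallExt
using PythonCall, PyCall
# the PyCall glue would live here; it is loaded and precompiled only when
# both PythonCall and PyCall are present in the same session
end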

PallHaraldsson avatar Apr 09 '23 16:04 PallHaraldsson