
Why is yt-dlp built on Python 3.14.0 slower than 3.13.9? Why more bytes read?

Open barkoder opened this issue 2 months ago • 7 comments

I built yt-dlp with --onedir using both Python 3.13.9 and 3.14.0, in separate venvs with identical options.

The py3.14.0 yt-dlp.exe consistently performs worse than the py3.13.9 yt-dlp.exe.

$ for i in $(seq 1 10) ; \
do echo -n PY314_ONEDIR- ; \
time C:/yt-dlp314/yt-dlp314.exe --no-config -v 2>&1 | grep real ; \
echo -n PY313_ONEDIR- ; \
time C:/yt-dlp313/yt-dlp313.exe --no-config -v 2>&1 | grep real ; \
echo '----'  ; \
done

PY314_ONEDIR-real       0m 2.28s
PY313_ONEDIR-real       0m 2.06s
----
PY314_ONEDIR-real       0m 2.13s
PY313_ONEDIR-real       0m 2.05s
----
PY314_ONEDIR-real       0m 2.13s
PY313_ONEDIR-real       0m 2.06s
----
PY314_ONEDIR-real       0m 2.13s
PY313_ONEDIR-real       0m 2.06s
----
PY314_ONEDIR-real       0m 2.12s
PY313_ONEDIR-real       0m 2.05s
----
PY314_ONEDIR-real       0m 2.14s
PY313_ONEDIR-real       0m 2.06s
----
PY314_ONEDIR-real       0m 2.12s
PY313_ONEDIR-real       0m 2.06s
----
PY314_ONEDIR-real       0m 2.13s
PY313_ONEDIR-real       0m 2.07s
----
PY314_ONEDIR-real       0m 2.12s
PY313_ONEDIR-real       0m 2.06s
----
PY314_ONEDIR-real       0m 2.12s
PY313_ONEDIR-real       0m 2.07s
----
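
For anyone reproducing this outside a Unix-like shell, the loop above can be approximated in Python. This is only a minimal sketch and not the harness used for the numbers in this thread: `time_command` and `summarize` are hypothetical helper names, and it measures wall-clock time around the whole process launch, which is roughly what the `real` figure reports.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=10):
    """Run cmd `runs` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard output so that terminal rendering does not affect the timing.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, minimum, population standard deviation) of the durations."""
    return statistics.mean(durations), min(durations), statistics.pstdev(durations)


# Example call, using the executable path from the shell loop above:
#   summarize(time_command(["C:/yt-dlp314/yt-dlp314.exe", "--no-config", "-v"]))
```

Alternating the two builds within one loop, as the shell version does, helps keep disk-cache effects comparable between them.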

So I fired up FileActivityWatch and logged what was being read. I've attached the logs: yt-dlp314_vs._313-FILE_ACTIVITY-report.html.zip

yt-dlp313.exe (just the process) registers only ~22 MiB of reads, despite having more files overall.

yt-dlp314.exe (just the process) registers almost 80 MiB of reads. Why?

I don't mind attaching my binaries, if required.

Thanks for any insight.

barkoder avatar Nov 09 '25 03:11 barkoder

Out of curiosity, I also built yt-dlp --onedir using py3.12.12, py3.11.14 and py3.10.19.

Here are the benchmarks.

PY314_ONEDIR-real       0m 2.11s
PY313_ONEDIR-real       0m 2.04s
PY312_ONEDIR-real       0m 2.03s
PY311_ONEDIR-real       0m 2.04s
PY310_ONEDIR-real       0m 1.93s
----
PY314_ONEDIR-real       0m 2.11s
PY313_ONEDIR-real       0m 2.04s
PY312_ONEDIR-real       0m 2.06s
PY311_ONEDIR-real       0m 2.04s
PY310_ONEDIR-real       0m 1.92s
----
PY314_ONEDIR-real       0m 2.15s
PY313_ONEDIR-real       0m 2.03s
PY312_ONEDIR-real       0m 2.06s
PY311_ONEDIR-real       0m 2.00s
PY310_ONEDIR-real       0m 1.93s
----
PY314_ONEDIR-real       0m 2.12s
PY313_ONEDIR-real       0m 2.05s
PY312_ONEDIR-real       0m 2.04s
PY311_ONEDIR-real       0m 2.05s
PY310_ONEDIR-real       0m 1.94s
----
PY314_ONEDIR-real       0m 2.13s
PY313_ONEDIR-real       0m 2.02s
PY312_ONEDIR-real       0m 2.02s
PY311_ONEDIR-real       0m 2.03s
PY310_ONEDIR-real       0m 1.94s
----
PY314_ONEDIR-real       0m 2.13s
PY313_ONEDIR-real       0m 2.05s
PY312_ONEDIR-real       0m 2.02s
PY311_ONEDIR-real       0m 2.04s
PY310_ONEDIR-real       0m 1.92s
----
PY314_ONEDIR-real       0m 2.12s
PY313_ONEDIR-real       0m 2.03s
PY312_ONEDIR-real       0m 2.03s
PY311_ONEDIR-real       0m 2.02s
PY310_ONEDIR-real       0m 1.92s
----
PY314_ONEDIR-real       0m 2.09s
PY313_ONEDIR-real       0m 2.03s
PY312_ONEDIR-real       0m 2.02s
PY311_ONEDIR-real       0m 2.02s
PY310_ONEDIR-real       0m 1.93s
----
PY314_ONEDIR-real       0m 2.10s
PY313_ONEDIR-real       0m 2.02s
PY312_ONEDIR-real       0m 2.02s
PY311_ONEDIR-real       0m 2.00s
PY310_ONEDIR-real       0m 1.93s
----
PY314_ONEDIR-real       0m 2.12s
PY313_ONEDIR-real       0m 2.05s
PY312_ONEDIR-real       0m 2.02s
PY311_ONEDIR-real       0m 2.02s
PY310_ONEDIR-real       0m 1.93s
----

YT-DLP ONEDIR PY310 AVERAGE- 1.929s
YT-DLP ONEDIR PY311 AVERAGE- 2.026s
YT-DLP ONEDIR PY312 AVERAGE- 2.032s
YT-DLP ONEDIR PY313 AVERAGE- 2.036s
YT-DLP ONEDIR PY314 AVERAGE- 2.118s
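
Per-build averages like these can be recomputed from the raw `time` output rather than by hand. The sketch below assumes the log lines have the `<label>-real <minutes>m <seconds>s` shape shown in the benchmarks above. `average_times` is a hypothetical helper name, not part of yt-dlp or any tool mentioned here.

```python
import re
from collections import defaultdict

# Matches lines such as "PY314_ONEDIR-real       0m 2.28s".
LINE_RE = re.compile(r"^(?P<label>\S+?)-real\s+(?P<min>\d+)m\s*(?P<sec>\d+(?:\.\d+)?)s$")


def average_times(log_text):
    """Return {label: mean seconds} over every '<label>-real <m>m <s>s' line."""
    samples = defaultdict(list)
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # separator lines ("----") and blank lines are skipped
        seconds = int(match["min"]) * 60 + float(match["sec"])
        samples[match["label"]].append(seconds)
    return {label: sum(vals) / len(vals) for label, vals in samples.items()}
```

Feeding it the saved benchmark output gives one mean per label, so that adding more runs or more builds does not require redoing the arithmetic.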

310

  • is the fastest, as is to be expected due to its smaller size.
  • It reads 7 files on first run.
  • yt-dlp310.exe(process) registers 13,813.7 KiB of read.

311

  • is noticeably slower than 310 (about 0.1s, or 5%, on average).
  • It also reads 7 files on first run.
  • yt-dlp311.exe(process) registers 18,982.1 KiB of read.

312

  • is close enough in speed to 311.
  • It also reads 7 files on first run.
  • yt-dlp312.exe(process) registers 17,680.2 KiB of read.

313

  • is close enough in speed to 312 and 311.
  • It reads a whopping 64 files on first run. Remarkable that 313's performance is still close enough to 312.
  • yt-dlp313.exe(process) registers 22,542.5 KiB of read.

314

  • is noticeably slower again (about 0.08s, or 4%, behind 313 on average).
  • It reads 36 files on first run.
  • yt-dlp314.exe(process) registers a weirdly HUGE 80,549.9 KiB of read.
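
To put the read figures side by side, the jump from one version to the next can be expressed as a ratio. This is a small illustrative calculation using only the KiB values reported above. `growth_vs_previous` is a hypothetical helper name.

```python
# KiB read by each process, copied from the figures reported above.
read_kib = {
    "310": 13813.7,
    "311": 18982.1,
    "312": 17680.2,
    "313": 22542.5,
    "314": 80549.9,
}


def growth_vs_previous(values):
    """Return {version: ratio of its value to the preceding version's value}."""
    versions = list(values)
    return {
        cur: values[cur] / values[prev]
        for prev, cur in zip(versions, versions[1:])
    }


for version, ratio in growth_vs_previous(read_kib).items():
    mib = read_kib[version] / 1024
    print(f"{version}: {ratio:.2f}x the previous version ({mib:.1f} MiB)")
```

On these numbers, 3.14 reads roughly three and a half times as much as 3.13, whereas each earlier step changes the volume by well under a factor of two.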

File activity reports for 310, 311, 312 - yt-dlp310_vs._311_vs._312-FILE_ACTIVITY-report.html.zip

See OP for reports of 313 and 314

barkoder avatar Nov 09 '25 19:11 barkoder

None of my patches would explain an increase in amount read in Python 3.14.0, so this is likely due to some upstream changes in Python 3.14. I would be curious to know whether the same results hold true for the official Python releases.

adang1345 avatar Nov 12 '25 01:11 adang1345

I built a yt-dlp onedir using the official 3.14 python on Windows 7.

I had to VxKex-NEXT the official python installer, the installed python.exe, and the built yt-dlp.exe to get it working on Windows 7x64.

yt-dlp314OFFICIAL.exe(process) registers ~73,426 KiB of read. FileActivityWatch report - yt-dlp314OFFICIAL_FILE_ACTIVITY-report.html.zip

I also tried building without upx.exe. Same result.

Appears to be an issue with 3.14.0 upstream, but I can't be 100% sure about it since I'm on Windows 7, and I had to do a lot of VxKex-ing.

Ideally this test would be run on an officially supported operating system.

barkoder avatar Nov 12 '25 17:11 barkoder

had to VxKex-NEXT the official python installer,

I don't recommend that fork of VxKex, it's focused on porting it to Chinese rather than any actual features, and its development is heavily AI-driven and full of issues because of that. (Personally, I believe it was forked just for the purpose of trying to get some donations off a project the forker didn't do.)

Try either the venerable i486 archive of the original effort, which is very reliable and works very well...: https://github.com/i486/VxKex

... Or the fork of the sharpest, latest developer that took over the mantle since, "dotexe", from his website (source code available over FTP, too): http://thrashnet.org/projects/vxkex/ (It used to be on GitHub, too, at https://github.com/dotexe1337/VxKex , but that page has since been taken down to keep the effort centralized on the personal site.)

You should retry your Python 3.14.0 tests with either of these 2 versions of VxKex.

donotsdubba avatar Nov 16 '25 09:11 donotsdubba

@donotsdubba

I don't recommend that fork of VxKex, it's focused on porting it to Chinese rather than any actual features,

It's the only version that runs the latest versions of deno (3.0.0-rc.0) and node (v26.0.0-nightly). See the off-topic hidden discussion here: https://github.com/yt-dlp/yt-dlp/issues/14847#issuecomment-3476613607 . I've confirmed that the other VxKex forks fail. Please read the entire off-topic discussion in that link before reading this comment further.

and its development is heavily AI-driven

What is your evidence for this?

and full of issues because of that.

I'm aware of the other VxKex versions, as you might have gathered from the discussion I've linked above. It's just that they are no longer fit for my particular use cases.

For example, I had created an issue in dotexe1337's repo, https://github.com/dotexe1337/VxKex/issues/46 , titled "[Not working] Firefox 140 portable + VxKex dotexe version 20250531". At the time I was using Tor Browser 14.5.9 (based on Firefox 128.0), which worked with dotexe1337's VxKex version 20250531. To fix Firefox 140 not working, dotexe1337 released an update, which also broke my working copy of Tor Browser 14.5.9. So I downgraded back to 20250531 and commented there to tell dotexe1337 that their update had broken my working Firefox 128.0. I didn't hear back from dotexe1337. Eventually, dotexe1337 deleted their account and VxKex fork repo from github. Making it only available outside of github(source code only available over FTP no less.)

Also, just to confirm, I just tested the latest version of dotexe1337's fork (20251018) from the thrashnet website. It still crashes with Deno versions >2.0.6, and it crashes Tor Browser 14.5.9 (Firefox 128.0). VxKex-NEXT, by contrast, works with all Deno versions (I've tested up to 3.0.0-rc.0) and with Tor Browser 14.5.9 (Firefox 128.0). VxKex-NEXT doesn't work with Firefox 140.0 either, but that is a known issue.

(Personally, I believe it was forked just for the purpose of trying to get some donations off a project the forker didn't do.)

If this was the case, then Deno versions >2.0.6 wouldn't work with VxKex-NEXT. But it does.

Or the fork of the sharpest, latest developer that took over the mantle since, "dotexe", from his website (source code available over FTP, too):

Making the source code difficult to get(only available over FTP) does not instill confidence. There's absolutely no reason why dotexe1337's VxKex source code cannot be made available over plain HTTP. The binaries are available over plain http. Why not the source code too?

You should retry your Python 3.14.0 tests with either of these 2 versions of VxKex.

At this point, I'm pretty sure that this is an issue with upstream Python. I just need someone to confirm it on native Win10 or above using official CPython, and I'll open a ticket on the upstream issue tracker.


Based on my testing of all available VxKex forks, VxKex-NEXT(tested till v1.1.3.1763 at the time of writing) is the only viable solution at the moment. That may well change in the future, and I sure hope it does.

barkoder avatar Nov 18 '25 01:11 barkoder

What is your evidence for this?

The developer himself: he, vxiiduu, i486 and dotexe (AKA Kryptic), among others, are all part of the same Windows 7/Vista/XP-centric Discord server. Dotexe and vxiiduu both have provided help and support to the developer of VxKex-NEXT whenever he needed help (though not always through Discord AFAIK), and that's how we all know.

By the way, the "vxiiduu" account and repo you linked to in the other thread you just told me to check is NOT the same as the current account and GitHub repo you linked to as being him and his on the other thread, now deleted anyway, as per warning in i486's GitHub mirror (which, incidentally, is not a fork per se, but just a reliable archive of the original effort, contrary to what you said).

Eventually, dotexe1337 deleted their account and VxKex fork repo from github.

Yeah, just like the original developer of VxKex did, vxiiduu. GitHub itself can be a PITA, people on GitHub can also be a PITA, and maintaining 2 locations to upload your work for each and every update is also a PITA, hence the deletion AFAIK.

If this was the case, then Deno versions >2.0.6 wouldn't work with VxKex-NEXT. But it does.

How so? One thing doesn't nullify the other: it IS the case the primary focus of VxKex-NEXT is Chinese localization, and it IS the case it is AI-driven and that the developer doesn't know all that well what exactly he's doing, and it IS the case that the GitHub page has MASSIVELY-BIG donation QR Codes at the bottom:

[Image: donation QR codes at the bottom of the VxKex-NEXT GitHub page]

Nothing wrong with making donations available to you for your work, but it does perfectly corroborate exactly what I said. By contrast, vxiiduu (the real one, not the fake account that i486 warned about), dotexe and i486 did not seek them (which they could have, especially when they are either the creator of VxKex or the biggest contributors to VxKex).

Making it only available outside of github(source code only available over FTP no less.)

Making the source code difficult to get(only available over FTP) does not instill confidence. There's absolutely no reason why dotexe1337's VxKex source code cannot be made available over plain HTTP. The binaries are available over plain http. Why not the source code too?

Sorry, but in what way, shape or form is a public FTP site hosting a file very straightforwardly "difficult to get"? You could even rightfully argue it's EASIER to get than over HTTP, because it doesn't require anywhere near as heavy a program (a modern bloated web browser) and doesn't have to render a bunch of worthless data (visuals, pictures, telemetry, tracking, other unrelated links etc.).

Additionally, picking a location that is NOT GitHub is a plus on top of that, as that has major restrictions and overhead that don't apply to a personal page (let alone public FTP) that benefit no one in terms of accessibility (won't work on certain older browsers, for instance, plus heavy telemetry all throughout the website, being Microsoft property etc.).

The fact that, at least according to you, some things worked in VxKex-NEXT that did not with others is excellent, and does potentially give a bit of a purpose to the AI-driven fork, despite all its issues. Just remember whatever success there is to it is owed to both dotexe and vxiiduu themselves as direct contributors even for that fork, including but not limited to helping in cleaning the AI broken+messy code and getting updates to work, especially if you distrust dotexe over something like... FTP?

Anyway, it is true, however, that it is weird that Release/Debug binaries are available over HTTP at all when the source is FTP-only (both Release/Debug binaries can also be fetched over FTP, by the way). Maybe you can try e-mailing him directly to ask him as to why that is the case, and tell him it inspires less confidence in the nature of the work when it is that way. Maybe it's just 1 thing less for him to upload/update to his HTTP website, and that people generally just don't care to go and get the latest update each time, but who knows. Gotta ask him to find out.

Based on my testing of all available VxKex forks, VxKex-NEXT(tested till v1.1.3.1763 at the time of writing) is the only viable solution at the moment.

Don't forget to consider forks and other tweaks to make things work in Windows 7 without any add-ons such as VxKex, such as Node.js forks, which are an even more viable solution, at least for things that have these options.

Also keep in mind if something DOES need VxKex, and it will work across multiple VxKex forks, VxKex-NEXT is the least efficient and least stable fork as a rule of thumb, again due to its AI-driven development, even hackier nature and its e-begging incentive.

Of course, if absolutely nothing else works for a use case, but VxKex-NEXT just so happens to work, then by all means feel free to go for it. That's all the advice I can give to us Windows 7 users worldwide.

donotsdubba avatar Nov 18 '25 08:11 donotsdubba

@donotsdubba Would you mind posting screenshots? I have never had or used Discord, as I don't agree with its TOS, and also because Discord chats cannot be archived by the Wayback Machine. I'll respond to the rest of your comment afterwards, as my reply is contingent on the veracity of your claim of 'heavily AI driven' development. So I shall respond after you post screenshots of the developer self-disclosing that they use AI. We (non-Discord users) would all appreciate it. Thanks.

barkoder avatar Nov 18 '25 18:11 barkoder