Feature request: TUI feature to set `prefer` at the currently observed path
For example, I wanted to see the paths a file shares data with when its represented size is 0 (visible using --expert mode). Relaunching btdu with the --prefer=FOLDER_WITH_FILE option does the job: btdu shows the files sharing data, because the file is now represented in the preferred path. However, it's quite tedious to copy the path I'm interested in, paste it into the command line and launch again...
Unfortunately we cannot retarget the representative samples post-hoc; the information necessary to do so is lost after ingestion, and retaining it would require considerably more memory usage. We could reset all sample counts thus making samples collected onwards reflect the new priority, which would technically work but is suboptimal UX.
Would you mind providing more detail about the use case (e.g. your filesystem layout, your ultimate goal, and how you'd like btdu to help you achieve it)? I would like to think about how to approach solving this problem from a higher level.
We could reset all sample counts thus making samples collected onwards reflect the new priority
Yeah, this is what I would expect, avoiding application restart, making it more interactive.
Would you mind providing more detail about the use case (e.g. your filesystem layout, your ultimate goal, and how you'd like btdu to help you achieve it)? I would like to think about how to approach solving this problem from a higher level.
Sure.
I have a folder WarCraft3 (at the bottom) whose actual size is definitely larger than its represented size (~100 MiB):
╦═══ /<SINGLE>/<DATA>/home/alex/.wine.nix ════════════════════════════════════════════════════════
║ ~493.5 MiB [####? ] /AgeOfEmpires
║ ~206.7 MiB [#? ] /Chkdraft
║ ~193.4 MiB [#? ] /StarCraft-1.16
║ ~5.698 GiB [####################################################?????] /StarCraft-Program-Files
║ ~1.651 GiB [##############??? ] /StarCraft.32
║ ~740.2 MiB [######?? ] /StarCraft.64
║ ~100.0 MiB [# ] /WarCraft3
═══ Selected: WarCraft3 ════════════════════════════════════╣
│ Size │ Samples ║
────────────┼───────────────────────┼─────────── ║
Represented │ ~100.0 MiB ±35.79 MiB │ 30 ║
Distributed │ ~551.8 MiB │ 165.500 ║
Exclusive │ ~ 0.0 B ±0 │ 0 ║
Shared │ ~1.849 GiB │ 568 ║
║
Full path: //home/alex/.wine.nix/WarCraft3 ║
Average query duration: 0.0000425 seconds ║
║
Latest offsets (represented samples): ║
║
n │ Physical │ Logical ║
────────┼──────────┼────────────── ║
# 29 │ - │ 739425185642 ║
# 28 │ - │ 809597068688 ║
# 27 │ - │ 679330739525 ║
│ │ • • • ║
The represented size is indeed expected to be lower than the actual size (~2 GiB), because the folder's contents were either originally copied using cp with --reflink (CoW clone) or "snapshotted" afterwards.
I want to find the other filesystem locations of files that share data with the contents of the WarCraft3 folder.
When I navigate to the biggest file, war3.mpq, in the WarCraft3 folder, its represented size is 0:
╦═══ /<SINGLE>/<DATA>/home/alex/.wine.nix/WarCraft3/drive_c/Program Files/WarCraft III ═
║ ~ 0.0 B [ ] /Maps
║ ~ 0.0 B [ ] /Movies
║ ~ 0.0 B [ ] War3Patch.mpq
║ ~ 0.0 B [ ] game.dll
║ ~ 0.0 B [ ] /redist
║ ~4.147 MiB [?????????????????????????????????????????????????????????] /save
║ ~ 0.0 B [ ] war3.mpq
And shared data paths information is not available:
═══ Selected: war3.mpq ═════════════════════════════════════╣
│ Size │ Samples ║
────────────┼───────────────────────┼─────────── ║
Represented │ ~ 0.0 B ±0 │ 0 ║
Distributed │ ~139.6 MiB │ 67.333 ║
Exclusive │ ~ 0.0 B ±0 │ 0 ║
Shared │ ~418.8 MiB │ 202 ║
║
Full path: //home/alex/.wine.nix/WarCraft3/drive_c/Program ║
Files/WarCraft III/war3.mpq ║
Average query duration: - ║
║
║
║
║
║
║
║
When I restart btdu as `sudo btdu --expert --prefer="/home/alex/.wine.nix/WarCraft3/drive_c/Program Files/WarCraft III" /`, the shared data paths information is shown:
═══ Selected: war3.mpq ═════════════════════════════════════╣
│ Size │ Samples ║
────────────┼───────────────────────┼─────────── ║
Represented │ ~429.2 MiB ±60.53 MiB │ 193 ║
Distributed │ ~143.1 MiB │ 64.333 ║
Exclusive │ ~ 0.0 B ±0 │ 0 ║
Shared │ ~429.2 MiB │ 193 ║
║
Full path: //home/alex/.wine.nix/WarCraft3/drive_c/Program ║
Files/WarCraft III/war3.mpq ║
Average query duration: 0.0027065 seconds ║
║
Shares data with: ║
║
Path │ % │ Shared │ Samples║
──────────────────────────────┼──────┼────────────┼────────║
/.snapshots/20…t III/war3.mpq │ 100% │ ~429.2 MiB │ 193║
/home/alex/.wi…t III/war3.mpq │ 100% │ ~429.2 MiB │ 193║
/nix/store/11v….wine/war3.mpq │ 100% │ ~429.2 MiB │ 193║
║
Latest offsets (represented samples): ║
║
n │ Physical │ Logical ║
────────┼──────────┼────────────── ║
# 192 │ - │ 1373432306151 ║
# 191 │ - │ 1374320935407 ║
# 190 │ - │ 1373345790332 ║
│ │ • • • ║
And yay, I got the desired information!
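For readers puzzling over the four size rows in the panels above, here is a minimal sketch of one way to read them. It is an interpretation consistent with the figures shown (193 shared samples next to a distributed count of 64.333, i.e. a three-way split), not btdu's actual code. The function `size_rows`, the sample layout and the paths are all hypothetical.

```python
from fractions import Fraction

def size_rows(path, samples, representative):
    """Tally the four rows for `path` (assumed semantics, not btdu's code).

    `samples` is a list of (paths_sharing_the_sampled_data, hit_count) pairs;
    `representative` picks the single path that a sample is attributed to.
    """
    rows = {"represented": 0, "distributed": Fraction(0), "exclusive": 0, "shared": 0}
    for paths, hits in samples:
        if path not in paths:
            continue
        rows["shared"] += hits                             # every sample touching the path
        rows["distributed"] += Fraction(hits, len(paths))  # split evenly among sharers
        if len(paths) == 1:
            rows["exclusive"] += hits                      # data used by this path only
        if representative(paths) == path:
            rows["represented"] += hits                    # path chosen to represent the data
    return rows

# Hypothetical data: one file cloned into three locations, hit by 193 samples.
samples = [(frozenset({"/a/file", "/b/file", "/c/file"}), 193)]
rows = size_rows("/b/file", samples, representative=min)
distributed_shown = round(float(rows["distributed"]), 3)
```

Under these assumptions `/b/file` has 0 represented samples (the stand-in rule `min` picks `/a/file`), 193 shared samples and a distributed count of 193/3, which matches the pattern in the panels.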
I would just like to eliminate the btdu restart-with-prefer-option step. To be more precise, I'd like to move that step into the TUI as a hotkey or something similar.
More generally, a feature that displays all filesystem paths sharing data with the currently selected file would be great. I expect that remembering this information for every file/folder during the btdu run (sampling) could consume too much RAM, so doing it on demand for the requested file/folder is probably the best way.
Thank you for that context! It was very useful.
And shared data paths information is not available:
This is the key moment - there's no reason why that should not have worked.
I had another look at this feature today and realized it's possible to implement it in a much more efficient and flexible way. Please try the version in master, it should now display "Shares data with" on all paths, not just representative ones.
The new data structure now also retains all information necessary to reconstruct representative samples with another ruleset (although not instantly; it would need to be reprocessed). If you still need interactive prefer/ignore functionality, let me know; it should now be possible to add it.
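As a rough illustration of why retaining the full set of sharing paths per sample makes this possible, here is a minimal sketch. It is not btdu's implementation: the function names, the glob-style prefer patterns, the paths and the tie-break order (preferred first, then shorter, then alphabetical) are all assumptions made for the example.

```python
from collections import Counter
from fnmatch import fnmatchcase

def pick_representative(paths, prefer=()):
    """Pick one path to represent a group of paths sharing the same data."""
    def key(path):
        preferred = any(fnmatchcase(path, pattern) for pattern in prefer)
        # Assumed ordering: preferred paths first, then shorter, then alphabetical.
        return (not preferred, len(path), path)
    return min(paths, key=key)

def represented_counts(samples, prefer=()):
    """Re-derive per-path represented sample counts from retained path sets."""
    counts = Counter()
    for paths, hits in samples.items():
        counts[pick_representative(paths, prefer)] += hits
    return counts

# Hypothetical retained data: each key is the set of paths sharing some data,
# each value is the number of samples that landed on that data.
samples = {
    frozenset({"/store/game/data.bin", "/home/user/game/data.bin"}): 193,
    frozenset({"/home/user/game/save.dat"}): 2,
}

default = represented_counts(samples)
preferred = represented_counts(samples, prefer=["/home/user/game/*"])
```

With no rules the 193 samples are attributed to the shorter `/store/...` path. With the prefer pattern they move to `/home/user/game/data.bin`, without collecting any new samples. The cost is a pass over all retained samples, which fits the "not instantly" caveat above.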
I had another look at this feature today and realized it's possible to implement it in a much more efficient and flexible way. Please try the version in master, it should now display "Shares data with" on all paths, not just representative ones.
I tried 56585e875a2e771b8eda14b06cbaa8d82e549da7 and yes, now it displays "Shares data with" on all paths! Thank you!
Btw, now I think that it's the killer-feature of btdu! Looking for other files which share the same data in a CoW filesystem is extremely useful!
I also noticed that such paths are now prefixed with /<SINGLE>/<DATA>, worsening the text collapse situation in tight left side panel in the TUI:
Path │ % │ Shared │ Samples
──────────────────────────────┼──────┼────────────┼────────
/<SINGLE>/<DAT…t III/war3.mpq │ 100% │ ~432.7 MiB │ 778
/<SINGLE>/<DAT…t III/war3.mpq │ 100% │ ~432.7 MiB │ 778
/<SINGLE>/<DAT….wine/war3.mpq │ 100% │ ~432.7 MiB │ 778
Path │ % │ Shared │ Samples
──────────────────────────────┼──────┼────────────┼────────
/<SINGLE>/<DAT…/worldedit.exe │ 100% │ ~3.893 MiB │ 7
/<SINGLE>/<DAT…/worldedit.exe │ 100% │ ~3.893 MiB │ 7
/<SINGLE>/<DAT…/worldedit.exe │ 100% │ ~3.893 MiB │ 7
Of course, I can use Enter to view this information file by file. But it seems that the width of the left side panel should have some means of manual or automatic adjustment. On a 2K monitor, the huge right side panel has 70% of its space unused.
And one question arises - is the list of "shares data with" paths always complete or can be populated further during the run (sampling or what's the correct term)?
If you still need interactive prefer/ignore functionality, let me know; it should now be possible to add it.
Well, your fix for the shared paths information solved my current issue, which I had been working around by abusing the --prefer option (as I now understand). But I can easily imagine a situation where easy switching between --prefer locations is important. So, I would keep the feature request, but it's not urgent. Currently, a nice-to-have.
Generally, I think the priority and logic of represented samples paths should be illustrated in the TUI in some way. Even if it's documented, a reminder in TUI will improve usage. At least, displaying the currently preferred path... Even currently when I'm writing this I cannot remember the exact logic. I just know my specific case when I set --prefer, but I cannot say for sure what happens to files outside of the preferred path. But this is quite a crucial part of the program, maybe the essence, because it directly affects what files are displayed or not, which is not obvious for a newcomer at all.
And one question arises - is the list of "shares data with" paths always complete or can be populated further during the run (sampling or what's the correct term)?
If the file is a perfect clone of all other files it shares data with (i.e. amount of data shared is 100%), then the list will be complete from the first sample that discovers this file. Otherwise, the list may grow as more samples with partial sharing are discovered.
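A toy simulation of this behaviour, assuming each sample simply reports the set of paths that reference the sampled data. The function name and the paths are made up for illustration, and this is not how btdu stores its data.

```python
def sharing_list_growth(target, samples):
    """Return the 'shares data with' list for `target` after each sample that hits it."""
    seen = set()
    history = []
    for paths in samples:
        if target in paths:
            seen |= paths - {target}      # accumulate every other path seen so far
            history.append(sorted(seen))
    return history

# Perfect clone: every part of the file is shared with the same set of paths.
clone = sharing_list_growth(
    "/a/file",
    [{"/a/file", "/b/file", "/snap/file"}] * 3,
)

# Partial sharing: different parts of the file are shared with different paths.
partial = sharing_list_growth(
    "/a/file",
    [{"/a/file", "/b/file"}, {"/a/file"}, {"/a/file", "/snap/file"}],
)
```

In the `clone` case the list is already complete after the first hit and never changes. In the `partial` case `/snap/file` only appears once a sample happens to land on the part of the file it shares.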
If the file is a perfect clone of all other files it shares data with (i.e. amount of data shared is 100%), then the list will be complete from the first sample that discovers this file. Otherwise, the list may grow as more samples with partial sharing are discovered.
Thank you for the information! It's very important to know.
Ideally, it could be noted somewhere in the interface (again, for newcomers). I do understand that it will clutter the interface, so maybe some optional newcomer interface mode can be responsible for this.
@AleXoundOS Implemented in master, please give it a go - bound to ⇧ Shift+P and ⇧ Shift+I respectively.
However, it's quite tedious to copy the path I'm interested in
8d455a00449195854382afd194b289cfbdeec0f2 also adds a key to copy the selected path to the clipboard, hopefully making such workflows for any remaining use cases easier.
text collapse situation in tight left side panel
8954ac41c1582ac50cef6c52ada8b7acb330a1ed improves the text collapsing algorithm to collapse repeated segments and highlight unique ones:
But it seems that the width of the left side panel should have some means of manual or automatic adjustment. On a 2K monitor, the huge right side panel has 70% of its space unused.
e22d2e28a6bb346a1d2b34fee077a01332253514 adjusts the layout algorithm to make the info panel scale proportionally when width is over 240 characters. Screenshots:
Let me know if you feel there's anything else to be done here; closing for now.
@AleXoundOS Implemented in master, please give it a go - bound to ⇧ Shift+P and ⇧ Shift+I respectively.
Thank you very much! It works!