Hybooru icon indicating copy to clipboard operation
Hybooru copied to clipboard

Rebuilding Database on large Hydrus db

Open bennyz327 opened this issue 1 year ago • 8 comments

I have a Hydrus instance with a database file that's about 90GB in size. When I use this database to start Hybooru, it gets stuck at:

Tags [####################] Done in 6:24

I waited for an hour, and when I checked the output from both HyBooru and PostgreSQL, there was no output at all.

When checking the system processes, I noticed that the CPU and memory don't seem to be in use. How can I determine whether this is an error in building the database or if I just need to keep waiting?

bennyz327 avatar Aug 13 '24 03:08 bennyz327

Post full output along with the command you use to launch hybooru.

I checked the output from both HyBooru and PostgreSQL, there was no output at all.

Well, there was at least the line you included. PostgreSQL should also at least start with "starting PostgreSQL" and other logs.

funmaker avatar Aug 14 '24 15:08 funmaker

Today give some try, here is the logs

benny@DESKTOP-E7RP7SV ~/hybooru [SIGINT]> nvim ./compose.yml
benny@DESKTOP-E7RP7SV ~/hybooru> docker compose up
[+] Running 2/0
 ✔ Container hybooru_database  Created                                                                                                            0.0s
 ✔ Container hybooru           Recreated                                                                                                          0.0s
Attaching to hybooru, hybooru_database
hybooru_database  |
hybooru_database  | PostgreSQL Database directory appears to contain a database; Skipping initialization
hybooru_database  |
hybooru_database  | 2024-11-21 16:26:22.724 UTC [1] LOG:  starting PostgreSQL 17.1 (Debian 17.1-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
hybooru_database  | 2024-11-21 16:26:22.724 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
hybooru_database  | 2024-11-21 16:26:22.724 UTC [1] LOG:  listening on IPv6 address "::", port 5432
hybooru_database  | 2024-11-21 16:26:22.727 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
hybooru_database  | 2024-11-21 16:26:22.732 UTC [29] LOG:  database system was shut down at 2024-11-21 16:24:54 UTC
hybooru_database  | 2024-11-21 16:26:22.736 UTC [1] LOG:  database system is ready to accept connections
hybooru           |
hybooru           | > [email protected] start
hybooru           | > node server.js
hybooru           |
hybooru           |
hybooru           | HyBooru started on port 80
hybooru           | Environment: production.
hybooru           | Press Ctrl-C to terminate.
hybooru           |
hybooru           | Database update detected, rebuilding...
hybooru           |
hybooru           | Rebuilding Database!
hybooru           |
hybooru           | Posts                           [####################] Done in 1:26
hybooru           | Urls                            [####################] Done in 24.56s
hybooru           | Notes                           [####################] Done in 6.09s
hybooru_database  | 2024-11-21 16:31:22.736 UTC [27] LOG:  checkpoint starting: time
hybooru_database  | 2024-11-21 16:35:52.105 UTC [27] LOG:  checkpoint complete: wrote 7149 buffers (43.6%); 0 WAL file(s) added, 0 removed, 5 recycled; write=269.331 s, sync=0.027 s, total=269.370 s; sync files=86, longest=0.008 s, average=0.001 s; distance=83635 kB, estimate=83635 kB; lsn=0/6AF6AC8,
redo lsn=0/6AF66B8
hybooru_database  | 2024-11-21 16:38:31.620 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 16:40:19.714 UTC [27] LOG:  checkpoint complete: wrote 11283 buffers (68.9%); 0 WAL file(s) added, 0 removed, 33 recycled; write=108.019 s, sync=0.020 s, total=108.094 s; sync files=4, longest=0.014 s, average=0.005 s; distance=529930 kB, estimate=529930 kB; lsn=0/44BF8260, redo lsn=0/27078F40
hybooru_database  | 2024-11-21 16:40:33.992 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 16:42:26.826 UTC [27] LOG:  checkpoint complete: wrote 11546 buffers (70.5%); 0 WAL file(s) added, 0 removed, 33 recycled; write=112.738 s, sync=0.035 s, total=112.834 s; sync files=4, longest=0.024 s, average=0.009 s; distance=540325 kB, estimate=540325 kB; lsn=0/65B96598, redo lsn=0/480223D0
hybooru_database  | 2024-11-21 16:42:38.793 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 16:44:37.304 UTC [27] LOG:  checkpoint complete: wrote 11321 buffers (69.1%); 0 WAL file(s) added, 0 removed, 33 recycled; write=118.448 s, sync=0.009 s, total=118.511 s; sync files=5, longest=0.004 s, average=0.002 s; distance=540687 kB, estimate=540687 kB; lsn=0/86B88FE0, redo lsn=0/69026020
hybooru_database  | 2024-11-21 16:44:51.380 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 16:47:03.216 UTC [27] LOG:  checkpoint complete: wrote 10993 buffers (67.1%); 0 WAL file(s) added, 0 removed, 33 recycled; write=131.750 s, sync=0.025 s, total=131.837 s; sync files=4, longest=0.018 s, average=0.007 s; distance=541174 kB, estimate=541174 kB; lsn=0/A7C37F38, redo lsn=0/8A0A39E0
hybooru_database  | 2024-11-21 16:47:20.525 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 16:49:21.913 UTC [27] LOG:  checkpoint complete: wrote 8999 buffers (54.9%); 0 WAL file(s) added, 0 removed, 33 recycled; write=121.326 s, sync=0.009 s, total=121.389 s; sync files=5, longest=0.005 s, average=0.002 s; distance=540254 kB, estimate=541082 kB; lsn=0/C8C8C630, redo lsn=0/AB03B268
hybooru_database  | 2024-11-21 16:49:34.726 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 16:51:24.828 UTC [27] LOG:  checkpoint complete: wrote 9391 buffers (57.3%); 0 WAL file(s) added, 0 removed, 33 recycled; write=110.009 s, sync=0.032 s, total=110.102 s; sync files=4, longest=0.025 s, average=0.008 s; distance=540621 kB, estimate=541036 kB; lsn=0/E9BD5488, redo lsn=0/CC02E740
hybooru_database  | 2024-11-21 16:51:34.244 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 16:53:54.784 UTC [27] LOG:  checkpoint complete: wrote 10215 buffers (62.3%); 0 WAL file(s) added, 0 removed, 33 recycled; write=140.456 s, sync=0.031 s, total=140.540 s; sync files=4, longest=0.027 s, average=0.008 s; distance=540602 kB, estimate=540992 kB; lsn=1/ABEE300, redo lsn=0/ED01D270
hybooru_database  | 2024-11-21 16:54:08.275 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 16:56:21.080 UTC [27] LOG:  checkpoint complete: wrote 8388 buffers (51.2%); 0 WAL file(s) added, 0 removed, 33 recycled; write=132.739 s, sync=0.011 s, total=132.806 s; sync files=5, longest=0.007 s, average=0.003 s; distance=540872 kB, estimate=540980 kB; lsn=1/2BC7F620, redo lsn=1/E04F3A0
hybooru_database  | 2024-11-21 16:56:39.122 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 16:58:41.331 UTC [27] LOG:  checkpoint complete: wrote 7570 buffers (46.2%); 0 WAL file(s) added, 0 removed, 33 recycled; write=122.118 s, sync=0.032 s, total=122.210 s; sync files=4, longest=0.025 s, average=0.008 s; distance=540718 kB, estimate=540954 kB; lsn=1/4CC41530, redo lsn=1/2F05AF70
hybooru_database  | 2024-11-21 16:58:53.245 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 17:00:46.247 UTC [27] LOG:  checkpoint complete: wrote 9717 buffers (59.3%); 0 WAL file(s) added, 0 removed, 33 recycled; write=112.915 s, sync=0.030 s, total=113.002 s; sync files=5, longest=0.025 s, average=0.006 s; distance=540472 kB, estimate=540906 kB; lsn=1/6DCDB998, redo lsn=1/50029188
hybooru_database  | 2024-11-21 17:00:55.491 UTC [27] LOG:  checkpoint starting: wal
hybooru_database  | 2024-11-21 17:05:21.900 UTC [27] LOG:  checkpoint complete: wrote 4866 buffers (29.7%); 0 WAL file(s) added, 0 removed, 33 recycled; write=266.338 s, sync=0.017 s, total=266.410 s; sync files=4, longest=0.012 s, average=0.005 s; distance=540897 kB, estimate=540905 kB; lsn=1/8EBA97F0, redo lsn=1/710615F0
hybooru_database  | 2024-11-21 17:05:55.930 UTC [27] LOG:  checkpoint starting: time
hybooru           | Tags                            [####################] Done in 40:06
hybooru_database  | 2024-11-21 17:10:25.066 UTC [27] LOG:  checkpoint complete: wrote 10049 buffers (61.3%); 0 WAL file(s) added, 0 removed, 32 recycled; write=269.069 s, sync=0.004 s, total=269.136 s; sync files=4, longest=0.002 s, average=0.001 s; distance=530156 kB, estimate=539830 kB; lsn=1/A0F080D8, redo lsn=1/9161C998

And now is 55:55 So there is nothing output log from 10:25 to 55:55 (about 45min) Is database building still running?

bennyz327 avatar Nov 21 '24 18:11 bennyz327

Try to run hybooru without docker, in a regular terminal. You will get more information about the current progress.

funmaker avatar Nov 22 '24 15:11 funmaker

Sorry for delay, give some try on windows PowerShell

with

  • Windows 11 Pro
  • node v18.20.7
  • postgres 17.4 (postgres run in docker)

still waiting for "Rebuilding Database", but there can see more information, hope It will works at this time

> npm start

> [email protected] start
> node server.js


HyBooru started on port http://127.0.0.1:3939
Environment: production.
Press Ctrl-C to terminate.

Database update detected, rebuilding...

Rebuilding Database!

Posts (81747/81747)             [####################] Done in 2.22s
Urls (298276/298276)            [####################] Done in 1.17s
Notes (68796/68796)             [####################] Done in 460ms
Tags (44464472/44464472)        [####################] Done in 2:02
Mappings (414515200/2535734677) [###-----------------]

bennyz327 avatar Mar 07 '25 10:03 bennyz327

Haha, It just stuck on Mappings (at the time my RAM(64GB) usage is up to 95%) Is hybooru was not design for large hydrus instance

bennyz327 avatar Mar 07 '25 11:03 bennyz327

Importing that many Mappings can take hours, Maybe 500kk/h depending on your hardware. When running hybooru from regular terminal you should see real time progress bar as in the snippet you have provided. Try leaving this import process for a day and see if it ever finishes. The reason you don't see any CPU or RAM usage is probably because the importing is limited by your disk speed. It is also possible that something might have broke at some point. Unfortunately I do not have a database this big to test this. But if you want you can upload your *.db files somewhere(images not needed) so I can test it our on my own machine. If you need a place to upload them you can use this upload-only nextcloud shared folder: https://cloud.funmaker.moe/nextcloud/s/CLWXW3QqJRZH8w7

funmaker avatar Apr 02 '25 14:04 funmaker

My upload speed at home is very slow. If possible, please use this backup from a month ago. It’s already in my cloud storage. Please let me know once you’ve finished downloading it, as I need to disable the sharing link. Thank you.

bennyz327 avatar Apr 02 '25 15:04 bennyz327

Downloaded, thanks. I will look into it in my free time.

funmaker avatar Apr 02 '25 16:04 funmaker

Ok, so I finally went around trying this out. I tried to replicate your Windows set up and it indeed seemingly hang after days of importing. I tried to import your database on a Linux server. It took 82h in total. I set tags.resolveRelations to false to skip resolving tag siblings/parents. It took 2h 30min. I removed PTR tags by setting tags.services to [9, 10]. It took 40min. So here are your options. I might use your database to try to optimize the import process for larger database.

funmaker avatar Jul 03 '25 16:07 funmaker

Skip tags from PTR is a good idea, I will try this one, thanks!

bennyz327 avatar Jul 03 '25 16:07 bennyz327

Update: I build db by setting tags.services to [9, 10]. It took 30min on my machine (win11). But the all files that show on web view is "Page not found" "404 not found", and preview is also not show (but tags/notes looks good)

How can I debug this issue?

bennyz327 avatar Jul 03 '25 18:07 bennyz327

Do you have client_files folder inside the hydrus database folder? That's where hybooru searches for images and thumbnails. Alternatively you can set posts.filesPathOverride to point to the client_files folder.

funmaker avatar Jul 03 '25 21:07 funmaker

Set filesPathOverride and all works well, thanks!

bennyz327 avatar Jul 04 '25 12:07 bennyz327