"Killed" (processing from 3MP input sample mp4)

Open natevw opened this issue 2 years ago • 5 comments

When processing a small rectangle out of a large video, trainbot seems to detect a train but then ends up "Killed." (exit code 137) inside my VM. The VM has 3.8Gi of RAM plus 11Gi of added swap, which only lets it run slightly longer after it starts processing a train. This is with a local sample file which it processes through ffmpeg. The source file is 2304 × 1296 pixels but the crop region is only e.g. 192 x 133.

If I pre-crop the sample video, e.g. ffmpeg -i sample-rot.mp4 -filter:v "crop=198:133:1403:284" sample-crop.mp4, then I am able to completely process the feed with trainbot. So I wonder if it is somehow trying to keep in memory not just the crop region but all the original (whole) frames too?
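For a rough sense of scale, here is a back-of-envelope comparison of one uncompressed full frame against one cropped frame. This is only a sketch: the 4 bytes per pixel figure is an assumption (e.g. RGBA), and trainbot's actual in-memory format may differ.

```shell
#!/bin/sh
# Assumption: 4 bytes per pixel (e.g. RGBA); trainbot's real internal format may differ.
BYTES_PER_PX=4

# Frame sizes taken from the numbers above: 2304 x 1296 source, 198 x 133 crop.
full_bytes=$((2304 * 1296 * BYTES_PER_PX))
crop_bytes=$((198 * 133 * BYTES_PER_PX))

echo "full frame:    $full_bytes bytes"
echo "cropped frame: $crop_bytes bytes"
# Integer division, so the ratio is rounded down.
echo "ratio:         roughly $((full_bytes / crop_bytes))x"
```

Under that assumption, a buffered full frame costs about two orders of magnitude more memory than a buffered crop.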

natevw avatar Nov 23 '23 04:11 natevw

We crop the image and then pass around only the cropped image. The original frame is thrown away. However, I have a slight suspicion about where this could be going wrong. Please check commit 03bcf15b035c385913321b2b87c3f40eaae5a28c to see if that fixes the problem.

jo-m avatar Nov 23 '23 07:11 jo-m

@jo-m Had a chance to circle back to this this evening. Using the trainbot-arm64 from the binaries archive on https://github.com/jo-m/trainbot/actions/runs/7159296241 (so the most recent https://github.com/jo-m/trainbot/commit/8afc520a5921451cf4947941b668e64b3960d6ae rather than the original, but afaict the fix should still be in that) —

unfortunately I still get Killed. on a large file when used directly.

natevw avatar Dec 13 '23 05:12 natevw

Hm, any chance you could send me an example video? Can be a link to GDrive via email or similar.

jo-m avatar Dec 13 '23 19:12 jo-m

hey hello! finally circling back to this, let's see if GitHub is happy enough with hosting the file directly?

https://github.com/jo-m/trainbot/assets/265902/8e1a64f2-7083-44a6-ab80-61395497c777

Looks like it repros with:

#! /bin/sh

export DATA_DIR=~/trains
export INPUT=~/trains/sample-rot.mp4
export RECT_X=1403
export RECT_Y=284
export RECT_W=198
export RECT_H=133
export PX_PER_M=19  # e.g. 430 px, 23 m  (dad got 20: 450px / 22.45m)
export MIN_SPEED_KPH=5
export MAX_SPEED_KPH=200
export CAMERA_FORMAT_FOURCC=YUYV

rm -rf db.sqlite3 blobs && ./trainbot

at least with the build of trainbot I (procured? built? been a while 😅) back at the time:

debian@nvw-trains-tmp:~/trains$ ll trainbot
-rwxr-xr-x 1 debian debian 28883056 Dec 13 04:56 trainbot
debian@nvw-trains-tmp:~/trains$ file trainbot
trainbot: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=4f66c09ada206ed0207289afccc38bd7cd96e1e0, for GNU/Linux 3.7.0, with debug_info, not stripped

natevw avatar May 01 '24 02:05 natevw

Still repros with the latest trainbot-arm64 from the "binaries" artifact of this latest CI run a few days ago: https://github.com/jo-m/trainbot/actions/runs/8837381153

natevw avatar May 01 '24 02:05 natevw

I tested this on my own machine, with the x86_64 binary from this CI run.

./trainbot \
  --log-pretty \
  --log-level debug \
  --input ./rot.mp4 \
  --rect-x 1403 \
  --rect-y 284 \
  --rect-w 198 \
  --rect-h 133 \
  --px-per-m 19 \
  --min-speed-kph 5 \
  --max-speed-kph 200 \
  --max-frame-count-per-seq 10000

The issue is simple - it eats up too much memory: it ate all my 16GiB of RAM and then 16GiB of swap, and then was killed by the OOM-killer.

So, the options would be

  • reduce frame rate, especially if you have slow but long trains
  • scale down video
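Both options can be applied to a sample file with ffmpeg before handing it to trainbot. The following is a minimal sketch, not a tested recipe: the output file name, the target of 10 fps, and the scale-down factor of 2 are all assumptions chosen for illustration. Note that scaling the video also means the crop rectangle and PX_PER_M have to be scaled by the same factor.

```shell
#!/bin/sh
# Hypothetical file names and values; adjust to your setup.
IN=sample-rot.mp4
OUT=sample-small.mp4
FPS=10   # assumed target frame rate, not a trainbot recommendation
DIV=2    # shrink width and height by this factor

# Re-encode at a lower frame rate and smaller size.
# scale=iw/DIV:-2 keeps the aspect ratio and an even output height.
# Skipped if ffmpeg or the input file is not available.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$IN" ]; then
  ffmpeg -i "$IN" -filter:v "fps=$FPS,scale=iw/$DIV:-2" "$OUT"
fi

# The crop rectangle and pixel scale from the repro script must shrink too.
# Shell arithmetic rounds down, so fine-tune the results by hand if needed.
export INPUT="$OUT"
export RECT_X=$((1403 / DIV))
export RECT_Y=$((284 / DIV))
export RECT_W=$((198 / DIV))
export RECT_H=$((133 / DIV))
export PX_PER_M=$((19 / DIV))

echo "rect: ${RECT_W}x${RECT_H} at ${RECT_X},${RECT_Y}, px/m: $PX_PER_M"
```

With these exports in place, trainbot can be started the same way as in the repro script.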

jo-m avatar May 19 '24 19:05 jo-m

The issue is simple - it eats up too much memory: it ate all my 16GiB of RAM and then 16GiB of swap, and then was killed by the OOM-killer.

Ah, okay, so iiuc it's no longer a memory "leak" (i.e. unintentional) but just hitting a large amount of memory that is expected to be used? That would make sense, since my own testing is on VMs without much memory allocated. I probably shouldn't even bother on the Raspberry Pi itself then, which is already past its limit just providing the full-size feed 😂

Thanks for looking into this one too, and glad you could get this one closed out even if just something I need to avoid 👍

natevw avatar May 20 '24 16:05 natevw