
Suggestion of multiprocess mechanism in MultiThreadedAugmenter

Open SeanCho1996 opened this issue 3 years ago • 0 comments

Hi, I noticed that the finish procedure in MultiThreadedAugmenter uses the terminate() method of Process, which ends each child process by sending it SIGTERM: https://github.com/MIC-DKFZ/batchgenerators/blob/01f225d843992eec5467c109875accd6ea955155/batchgenerators/dataloading/multi_threaded_augmenter.py#L273-L275

In my project, the main process installs a SIGTERM handler, which is meant to stop the process via SIGTERM at the end of training, as shown below:

def _sigterm_handler(_signo, _stack_frame):
    # Log the signal, stop the worker, then exit the process.
    logger.warning("Terminal signal received: %s, %s", _signo, _stack_frame)
    stop_worker()
    exit(0)

However, this causes a problem in combination with MultiThreadedAugmenter's use of terminate(): when a child process is forked, it inherits the state of the main process, including my SIGTERM handler. As a result, the SIGTERM that MultiThreadedAugmenter sends to end the child is caught by that inherited handler, which directly ends my training process.

A temporary solution I came up with is to reset the SIGTERM handler in the child process to the default, signal.SIG_DFL, so that a SIGTERM sent to the child no longer triggers the handler inherited from the main process. This means adding one line at the beginning of the producer() function:

def producer(queue, data_loader, transform, thread_id, seed, abort_event, wait_time: float = 0.02):
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    ...
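To illustrate the effect outside of batchgenerators, here is a minimal sketch. It assumes a POSIX system where the "fork" start method is available; the names _parent_handler, worker and run_and_terminate, as well as the exit code 42, are hypothetical and only stand in for my real handler and for producer(). It only shows whether the forked child runs the inherited handler when terminate() is called; what the handler then does is specific to the project.

```python
import multiprocessing as mp
import signal
import sys
import time


def _parent_handler(signo, frame):
    # Hypothetical stand-in for the main process's SIGTERM handler.
    # The distinctive exit code makes it visible when this handler ran.
    sys.exit(42)


def worker(reset_handler, ready):
    # Hypothetical stand-in for producer(): optionally restore the default
    # SIGTERM behaviour before doing any work.
    if reset_handler:
        signal.signal(signal.SIGTERM, signal.SIG_DFL)
    ready.set()
    while True:
        time.sleep(0.05)


def run_and_terminate(reset_handler):
    # Install the handler in the parent, fork a child, terminate it,
    # and return the child's exit code.
    ctx = mp.get_context("fork")  # assumption: POSIX with fork available
    previous = signal.signal(signal.SIGTERM, _parent_handler)
    try:
        ready = ctx.Event()
        child = ctx.Process(target=worker, args=(reset_handler, ready))
        child.start()
        ready.wait(timeout=5)
        child.terminate()  # sends SIGTERM to the child
        child.join(timeout=5)
        return child.exitcode
    finally:
        # Put the parent's original SIGTERM handler back.
        if previous is not None:
            signal.signal(signal.SIGTERM, previous)
```

Calling run_and_terminate(False) should return the handler's exit code, because the child executed the handler inherited from the parent. Calling run_and_terminate(True) should return the negative signal number, which is how multiprocessing reports a child that was killed by the default SIGTERM action.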

Would it be possible to add something similar to the source code, so that signals sent to the child processes do not affect the main process?

Thank you

SeanCho1996, Jul 08 '22 09:07