Cannot fork after attaching in ruby >= 2.6
Example code:
require 'lxc'
old_sync = $stdout.sync
$stdout.sync = true
ct = LXC::Container.new('container')
puts "#{Process.pid} Attaching to container"
exitcode = ct.attach({wait: true}) do
  puts "#{Process.pid} Inside container. Forking"
  fork do
    puts "#{Process.pid} Forked :)"
  end
end
This used to work fine in ruby 2.5:
# ruby --version
ruby 2.5.8p224 (2020-03-31 revision 67882) [x86_64-linux]
# ruby test.rb
138201 Attaching to container
26532 Inside container. Forking
26533 Forked :)
However, it seems to trigger an internal Ruby error (raised at https://github.com/ruby/ruby/blob/510df47f5f7f83918d3aa00316c8a5b959d80d7c/thread_pthread.c#L1695) in Ruby 2.6 and 2.7:
# ruby --version
ruby 2.6.6p146 (2020-03-31 revision 67876) [x86_64-linux]
# ruby test.rb 2>&1 | head
138686 Attaching to container
26536 Inside container. Forking
test.rb:10: [BUG] timer_posix was not dead: 0
ruby 2.6.6p146 (2020-03-31 revision 67876) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0005 p:---- s:0021 e:000020 CFUNC :fork
c:0004 p:0035 s:0017 e:000016 BLOCK test.rb:10 [FINISH]
c:0003 p:---- s:0014 e:000013 CFUNC :attach
# ruby --version
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
# ruby test.rb 2>&1 | head
138889 Attaching to container
26538 Inside container. Forking
test.rb:10: [BUG] timer_posix was not dead: 0
ruby 2.7.1p83 (2020-03-31 revision a0c7c23c9c) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0005 p:---- s:0021 e:000020 CFUNC :fork
c:0004 p:0033 s:0017 e:000016 BLOCK test.rb:10 [FINISH]
c:0003 p:---- s:0014 e:000013 CFUNC :attach
I've tested this on CentOS 7.7 and Debian buster (and sid).
Nice catch. I have not checked the code yet, but my blind guess is that lxc_spawn forks the Ruby VM incorrectly. The error reports an invalid previous state of the timer thread, which means it was not shut down properly before the fork. The Ruby VM's Process.fork does more than the clone() syscall: it also cleans up scheduler and thread data. In 2.4 it shuts down the timer before the fork.
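As a control experiment, here is a minimal sketch of nested forks where both levels go through Ruby's own Process.fork, with no LXC involved. The helper name nested_fork_lines is hypothetical and exists only for this illustration. If this works on an affected Ruby version, it would support the guess that the problem lies in how the attach path forks rather than in nested forking as such.

```ruby
# Control experiment: fork inside a fork, both via Ruby's Process.fork.
# No LXC calls are made; nested_fork_lines is a hypothetical helper name.
def nested_fork_lines
  reader, writer = IO.pipe

  child = fork do
    reader.close
    grandchild = fork do
      # Second-level fork, analogous to the fork inside the attach block.
      writer.puts "grandchild #{Process.pid}"
    end
    Process.wait(grandchild)
    writer.puts "child #{Process.pid}"
  end

  # The parent only reads; close its copy of the write end so read sees EOF.
  writer.close
  Process.wait(child)
  lines = reader.read.lines.map(&:chomp)
  reader.close
  lines
end

lines = nested_fork_lines
puts lines
```

The child waits for the grandchild before writing its own line, so the grandchild's line is expected first.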
Just replacing fork+clone3(CLONE_PARENT) with a simple call to rb_fork_ruby makes everything suddenly start working. However, it requires a patch to both lxc and ruby-lxc. The downside is that the parent process either loses the child, or it has to use the CHILD_REAPER flag instead of the second call to clone. Anyway, the prototype works.
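One way to read the "loses the child" downside, sketched in plain Ruby without any LXC calls: if the second-level process is no longer created as a direct child of the caller, the caller cannot wait on it. This is only an illustration of that general process-tree rule, not of the actual patch, and the helper name wait_on_grandchild is hypothetical.

```ruby
# Illustration only: a process cannot reap its grandchild with wait.
# wait_on_grandchild is a hypothetical helper; nothing here touches LXC.
def wait_on_grandchild
  reader, writer = IO.pipe

  child = fork do
    reader.close
    grandchild = fork { sleep 0.2 }
    # Report the grandchild's pid to the top-level process.
    writer.puts grandchild
    writer.close
    Process.wait(grandchild)
  end

  writer.close
  grandchild_pid = Integer(reader.gets)
  reader.close

  begin
    Process.wait(grandchild_pid)
    :reaped
  rescue Errno::ECHILD
    # The grandchild's parent is the intermediate child, not this process.
    :not_our_child
  ensure
    Process.wait(child)
  end
end

result = wait_on_grandchild
puts result
```

The original fork+clone3(CLONE_PARENT) sequence presumably exists to avoid exactly this situation, which would explain why dropping it costs either the wait or an extra reaper setup.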