
FiberError/NoMemoryError related to the csv gem occurs when upgrading from Ruby 3.1 to 3.3

Open nguyenvd27 opened this issue 1 year ago • 8 comments

When upgrading from Ruby 3.1.4 to 3.3.5, FiberError/NoMemoryError errors related to csv occurred in our system. In addition, CPU load while loading CSV files also increased compared to before.

  • FiberError: can't alloc machine stack to fiber (1 x 659456 bytes): Cannot allocate memory error
  • NoMemoryError: failed to allocate memory

After investigating, we found the cause: the csv gem. When upgrading the Ruby version, the csv version is also updated (from 3.2.5 to 3.2.8).

  • If we keep Ruby 3.1.4 and only update the csv gem to 3.2.8, the FiberError/NoMemoryError still occurs.
  • If we upgrade to Ruby 3.3.5 but keep the csv gem at 3.2.5 (the version bundled with Ruby 3.1.4), the FiberError/NoMemoryError does not occur and CPU load does not increase.
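The workaround described above can be sketched as a Bundler Gemfile entry (a sketch, not from the original report; the exact constraint style is an assumption):

```ruby
# Gemfile (sketch): pin csv to the last version that did not
# exhibit the FiberError/NoMemoryError in this environment,
# overriding the default gem bundled with Ruby 3.3.5.
gem "csv", "3.2.5"
```

Bundler prefers the Gemfile constraint over the default gem shipped with the Ruby installation, so this keeps csv 3.2.5 active even after the Ruby upgrade.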

It seems that some change between csv 3.2.5 and 3.2.8 caused the error. This commit of the csv gem looks very suspicious.

nguyenvd27 avatar Dec 23 '24 01:12 nguyenvd27

Could you provide a script that reproduces this?

kou avatar Dec 23 '24 02:12 kou

@kou The code in my application looks like this:

catch(:max_lines) do
  File.open(file.path, "rt").each_line.with_index do |line, i|
    # handle line with gsub...
    csv_parse = CSV.parse(line.gsub(/\r\n?/, "\n"))
    # ...
    csv_parse.each do |row|
      # ...
      @file_content << row
    end
  end
end

With csv 3.2.8, FiberError/NoMemoryError occurs if we upload a large CSV file multiple times. With csv 3.2.5, there is no problem uploading a large CSV file multiple times.

nguyenvd27 avatar Dec 23 '24 06:12 nguyenvd27

Thanks. Could you also provide sample data that reproduces this problem? (You don't need to use real data; I just want to reproduce this locally.)

kou avatar Dec 23 '24 07:12 kou

BTW, CSV.parse(...) do |row| will be better than csv_parse = CSV.parse(...); csv_parse.each do |row|.
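The suggestion above can be sketched as follows (the sample CSV text is an assumption, not the reporter's data):

```ruby
require "csv"

csv_text = "a,b\nc,d\n"

# Array-allocating form: CSV.parse returns the full Array of rows,
# which is then iterated in a second pass.
rows = CSV.parse(csv_text)
rows.each do |row|
  p row
end

# Block form: each row is yielded as it is parsed, so no
# intermediate Array of all rows needs to be built and kept alive.
CSV.parse(csv_text) do |row|
  p row
end
```

Both forms yield the same rows; the block form simply avoids materializing the whole result before iterating, which matters when the same parse runs many times over large input.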

kou avatar Dec 23 '24 07:12 kou

We only use CSV files with normal data, around 10,000 lines. (Sample data was attached as two screenshots.)

> I just want to reproduce this on local

Because our local machines have so much memory, we cannot reproduce it locally; we can only reproduce it in the test environment (staging, RAM: 9.6 GB).

nguyenvd27 avatar Dec 23 '24 07:12 nguyenvd27

Hmm. I can't debug this without reproducible data...

kou avatar Dec 23 '24 07:12 kou

Can you provide a compose.yaml (for Docker Compose) that limits the available memory?
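A memory-capped Compose service might look like the following sketch (the image tag, script path, and 512m limit are assumptions; `repro.rb` is a hypothetical reproduction script, not something from this thread):

```yaml
# compose.yaml (sketch): run a reproduction script under a hard
# memory cap so the allocation failure can be triggered locally.
services:
  repro:
    image: ruby:3.3.5
    command: ruby /app/repro.rb   # hypothetical reproduction script
    volumes:
      - .:/app
    deploy:
      resources:
        limits:
          memory: 512m            # cap container memory at 512 MiB
```

With `docker compose up`, the `deploy.resources.limits.memory` setting caps the container's memory, so allocations beyond the limit fail the way they would on a memory-constrained host.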

kou avatar Dec 23 '24 07:12 kou

Ohhh, sorry. We are using Jenkins and don't have Docker Compose for this.

nguyenvd27 avatar Dec 23 '24 08:12 nguyenvd27