Memory segmentation fault
Hi,
I experienced a segmentation fault with a simulation model larger than 28.4 GB. The root cause is that multiplying the int variables used for memory allocation overflows the int type. For example, in matrix_algebra.cpp:
234 for (int row = 0; row < n; row++ )
235 {
236 int row_offset = row*ndim;
Here "row_offset" overflows the int type. I therefore changed such variables from int to long long int, as listed below.
matrix_algebra.cpp
236 long long int row_offset = row * ndim;
288 void lu_decompose(nec_output_file& s_output, long long int n, complex_array& a, int_array& ip, long long int ndim)
305 long long int i_offset = i * ndim;
306 long long int j_offset = 0;
321 long long int r_offset = r * ndim;
336 long long int j_offset = j * ndim;
437 void solve(int n, complex_array& a, int_array& ip, complex_array& b,
438 long long int ndim)
497 void lu_decompose(nec_output_file& s_output, long long int n, complex_array& a_in, int_array& ip, long long int ndim)
633 void factrs(nec_output_file& s_output, long long int np, long long int nrow, complex_array& a, int_array& ip)
647 long long int mode_offset = mode * np;
667 void solves(complex_array& a, int_array& ip, complex_array& b, long long int neq,
668 long long int nrh, long long int np, long long int n, long long int mp, long long int m, long long int nop,
669 complex_array& symmetry_array)
689 long long int column_offset = ic * neq;
730 long long int ia = i + k * npeq;
c_geometry.cpp
3078 long long int jco1 = n_plus_2m;
3079 long long int jco2 = jco1 + m;
nec_context.cpp
1018 long long int iresrv = 0;
2332 void nec_context::cmset (long long int nrow, complex_array& in_cm, nec_float rkhx)
2480 void nec_context::compute_matrix_ss (int j1, int j2, int im1, int im2,
2481 complex_array& in_cm, long long int nrow, int itrp)
2583 void nec_context::cmsw (int j1, int j2, int i1, int i2,
2584 complex_array& in_cm, complex_array& cw, long long int ncw, long long int nrow, int itrp)
2696 void nec_context::cmws (int j, int i1, int i2, complex_array& in_cm,
2697 long long int nr, complex_array& cw, long long int nw, int itrp)
2699 long long int ipr, ipatch, ik, js=0, jx;
2799 void nec_context::cmww (int j, int i1, int i2, complex_array& in_cm,
2800 long long int nr, complex_array& cw, long long int nw, int itrp)
matrix_algebra.h
25 void lu_decompose (nec_output_file& s_output, long long int n, complex_array& a, int_array& ip, long long int ndim);
26 void factrs (nec_output_file& s_output, long long int np, long long int nrow, complex_array& a, int_array& ip );
27 void solve ( int n, complex_array& a, int_array& ip, complex_array& b, long long int ndim );
29 void solves (complex_array& a, int_array& ip, complex_array& b, long long int neq,
30 long long int nrh, long long int np, long long int n, long long int mp, long long int m, long long int nop,
31 complex_array& symmetry_array);
nec_context.h
818 void cmset (long long int nrow, complex_array& in_cm, nec_float rkhx);
819 void compute_matrix_ss (int j1, int j2, int im1, int im2,
820 complex_array& in_cm, long long int nrow, int itrp);
821 void cmsw (int j1, int j2, int i1, int i2, complex_array& in_cm,
822 complex_array& cw, long long int ncw, long long int nrow, int itrp);
823 void cmws (int j, int i1, int i2, complex_array& in_cm, long long int nr,
824 complex_array& cw, long long int nw, int itrp);
825 void cmww (int j, int i1, int i2, complex_array& in_cm, long long int nr,
826 complex_array& cw, long long int nw, int itrp);
After these modifications, NEC2++ can handle simulation models larger than 140 GB. Thank you for this great simulation tool.
Best regards.
Yoshi Takeyasu
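The overflow described in the post above can be illustrated with a minimal sketch. The helper names row_offset and exceeds_int are hypothetical and not part of nec2++, and the sketch assumes a platform where int is 32 bits wide. Because both operands are already 64-bit, the multiplication itself is carried out in 64-bit arithmetic.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of nec2++): flat offset of the first element
// of `row` in a matrix stored with `ndim` elements per row.
// Both parameters are 64-bit, so the product is computed in 64 bits.
inline std::int64_t row_offset(std::int64_t row, std::int64_t ndim)
{
    return row * ndim;
}

// True when a value no longer fits in int (assumed to be 32 bits here).
inline bool exceeds_int(std::int64_t value)
{
    return value > std::numeric_limits<int>::max();
}
```

For example, with ndim = 50000 the offset of row 49999 is 2,499,950,000, which is above the 32-bit int maximum of 2,147,483,647.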
Hi Yoshi
This is an excellent solution. I have used int64_t rather than long long int, to make it more readable. Changes committed.
Kind Regards
Tim
I have made further internal changes and added -Wconversion to the warnings. However, the lapack routines use 32-bit integers in their interfaces, so this is going to be a problem.
I will consider going to the Eigen C++ matrix library.
The workaround at the moment is to configure --without-lapack, as the Gaussian elimination is 64-bit clean. Once the move to Eigen is complete, this problem will go away.
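One way to make the boundary between 64-bit sizes and a 32-bit interface explicit is a checked conversion. The following is only a sketch: checked_narrow is a hypothetical helper, not a function from nec2++ or lapack, and it simply refuses to truncate a value that does not fit in int.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helper (not from nec2++ or lapack): converts a 64-bit size to
// int and throws instead of silently truncating an out-of-range value.
inline int checked_narrow(std::int64_t value)
{
    if (value < std::numeric_limits<int>::min() ||
        value > std::numeric_limits<int>::max())
    {
        throw std::overflow_error("value does not fit in int");
    }
    return static_cast<int>(value);
}
```

A call such as checked_narrow(ndim) before handing a dimension to a 32-bit routine would turn a silent wrap-around into an immediate, diagnosable error.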
I compiled Ver. 1.7.4 without lapack, but still got a segmentation fault with a 37.3 GB model. I have not looked at the code yet. Thanks for your debugging.
Any chance you could compile with debugging on and run it under gdb, to see if there is an overflow?
./configure --with-bounds --without-lapack
gdb ./src/nec2++
Then type the following (where xxx.nec is your input file):
run -i xxx.nec -o xxx.out
It will be pretty slow, but hopefully the stacktrace on seg-fault will help.
Will try.
I got an error as below: NEC++ Runtime Error: safe_array: array index: 1 exceeds -555984202
[Inferior 1 (process 8547) exited with code 01]
The output file is terminated as below: -------- ANTENNA ENVIRONMENT -------- FREE SPACE NEC++ Runtime Error: safe_array: array index: 1 exceeds -555984202
It seems safe_array caught a negative length, "_len = -555984202", for "complex_array& cm" in nec_context.cpp.
nec_context.cpp 1040 int64_t iresrv = (m_geometry->n_plus_2m) * (m_geometry->np+3*m_geometry->mp);
These (m_geometry->n_plus_2m), (m_geometry->np) and (m_geometry->mp) are int, so the result of the operation is also int, even though iresrv is defined as int64_t. I am using gcc ver. 4.7.2.
BTW, why did you change 2* to 3* ?
Thanks for the report. I changed 2 to 3 because the array was too small once patches were included (from memory). It looks like I should keep pushing the int64_t types further. I will patch, and double check.
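The point about line 1040 can be sketched as follows: when every operand is int, the product is evaluated in int, and only the result is converted to int64_t. Widening an operand before multiplying avoids this. The struct geometry_counts and the function reserve_size are hypothetical stand-ins for the members named in the thread, not the actual patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for the int members named in the thread.
struct geometry_counts
{
    int n_plus_2m;
    int np;
    int mp;
};

// Each int is widened before it takes part in the arithmetic, so the whole
// expression is evaluated in 64 bits. Without the casts, int * int would be
// computed as int, and only that result would be converted to int64_t.
inline std::int64_t reserve_size(const geometry_counts& g)
{
    return static_cast<std::int64_t>(g.n_plus_2m) *
           (static_cast<std::int64_t>(g.np) +
            3 * static_cast<std::int64_t>(g.mp));
}
```

Casting only the first factor would already widen the outer multiplication. The inner sum is widened as well here so that no intermediate step is computed in int.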