spreadsheet-parseexcel icon indicating copy to clipboard operation
spreadsheet-parseexcel copied to clipboard

col2int / int2col range issues

Open tlhackque opened this issue 11 months ago • 6 comments

Spreadsheet::ParseExcel::Utility has confused me. But I think it's confused.

LibreOffice FAQ says:

  • the maximum number of columns is 16384 (from column A to column XFD), in older versions 7.3 and below it was 1,024 (from column A to column AMJ);
  • the maximum number of rows is 1,048,576 (220);

OK, let's see what Spreadsheet::ParseExcel::Utility does with col2int and int2col (I encountered this in a real application):

DB<5> x col2int( 'XFD')
0  1407

Should be 16383 (16384 - 1).

 DB<4> x int2col(16383)
0  'XFD'

On the other hand:

 DB<6> use Spreadsheet::WriteExcel::Utility;
  DB<12> x xl_cell_to_rowcol('XFD$1')
0  0
1  16383
2  1
3  0
  DB<13> x xl_rowcol_to_cell(0,16383)
0  'XFD1'

So Spreadsheet::WriteExcel::Utility has it right and could be used to replace these functions (if you ignore the row). e.g.

use Spreadsheet::WriteExcel::Utility;

sub col2int {
  my $col = shift;
  my( $row, $col, $absrow, $abscol ) = xl_cell_to_rowcol( "${col}1" );
  return $col;
}

sub int2col {
  my $cofs = shift;
  my $str = xl_rowcol_to_cell( 0, $cofs );
  return substr( $str, 0, -1 );
}
1;

Of course, you can just lift the code...but this does give the right answers.

  DB<1>  x col2int( 'XFD' )
0  16383
  DB<2> x int2col(16383)
0  'XFD'

tlhackque avatar May 18 '25 17:05 tlhackque

Thanks. That looks like a bug. I'll look into fixing it.

jmcnamara avatar May 18 '25 18:05 jmcnamara

I've pushed a fix to main:

$ perl -de1 -Mlib=lib

Loading DB routines from perl5db.pl version 1.60
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(-e:1):	1
  DB<1> use Spreadsheet::ParseExcel::Utility qw( int2col col2int );

  DB<2> x col2int( 'XFD')
0  16383

You can check it out to see if you see any other issues. I have some other small changes to make after which I will roll this out in a release in the next few days.

jmcnamara avatar May 18 '25 19:05 jmcnamara

Thanks for the quick investigations and fix. A couple of notes that might be useful:

Line 1238 uses my $curr +=. It should work, but I don't like math on an undef. I think = would be a better choice; you're not really adding..

Also, back in Spreadsheet::WriteExcel::Utility:

  DB<32> x xl_cell_to_rowcol( "xfd1")
0  '-1'
1  '-1'
2  0
3  0
   DB<33> x xl_cell_to_rowcol( "XFD1")
0  0
1  16383
2  0
3  0

It appears that while ParseExcel::Utility accepts either upper or lower case input, xl_cell_to_rowcol requires uppercase.

The writeexcel functions don't validate input or provide an error; the parseexcel functions appear to. But looking at the code, it seems accidental that -1 is returned. Perhaps a real effort to validate should be made and documented... Returning an impossible subscript seems wrong.

I'm using this to validate column input - which may be helpful if you decide to validate input. (The extension for row numbers should be obvious...)

        unless( $sc =~ /^(?:[A-Z]{1,2}|[A-W][A-Z]{2}|X[A-E][A-Z]|XF[A-D])$/i ) { # Tests ordered most likely first A-XFD (1-16384)
            die( blah, blah, blah);
        }

The comment in int2col that it doesn't work for values over ZZ should be removed.

I didn't look at your tests, but int2col(col2int($x)) should =/eq $x, as should col2int(int2col($x))``. The boundary cases are either side of a carry - Z->AZ, ZZ->AAA, and the end range XFA-D...

Hope this is useful. Thanks again.

tlhackque avatar May 18 '25 20:05 tlhackque

I think = would be a better choice; you're not really adding..

Good catch, thanks. I've made that change.

The writeexcel functions don't validate input or provide an error

I don't think that it is worth the effort to change/fix that.

The comment in int2col that it doesn't work for values over ZZ should be removed.

That comment is by the original author, and while it technically isn't true it was from a time when Excel only had 255 columns (which is probably why col2int() was incorrect for 3 digit column names. So, I'll probably just leave it as it is.

jmcnamara avatar May 18 '25 22:05 jmcnamara

Up to you, of course. But errors here can result in weird behaviors much later.

I'd prefer that they die rather than return -1 or junk (by mistake). Certainly the input and returned row & column should be between the min and max returned by row_range() and col_range() when readng, and the architectural limits when writing.

I suppose I can wrap the functions to provide checks - and upcase to fix the issue with xl_cell_to_rowcol.

Comments do matter. Especially when they falsely assert that something needs to be fixed.

The comment may have been true in the past, but I spent some wasted time trying to figure out how to fix the non-issue today. I could have spent it more profitably.

Anyhow, it's your code, so your call.

tlhackque avatar May 19 '25 00:05 tlhackque

I suppose I can wrap the functions to provide checks

Yes. That is probably best. Or just create your own version of the functions based on the existing code, or not, to meet your own criteria.

jmcnamara avatar May 19 '25 04:05 jmcnamara