Text-CSV_XS Wrong bugtracker link

In https://rt.cpan.org/Ticket/Display.html?id=166621 you queried why I used RT instead of your "designated bug tracker", by which I am guessing you meant here? I'm only guessing because the link you provided (https://rt.cpan.org/Ticket/Display.html?id=166621) actually refers to a line which says "bugtracker: http://rt.cpan.org/NoAuth/Bugs.html?Dist=Text-CSV_XS", i.e. the exact place on RT where I reported my bug.

I never looked inside META.yml when reporting the bug. I simply found the module on metacpan.org and clicked on the "Issues" link in the left-hand side panel, which took me to RT. I assume that is driven by the contents of META.yml, which appears to be wrong if RT is not your preferred tracker.

BTW, thank you for addressing my original bug report. The problem was that I hadn't appreciated the difference between "header" and "headers". I see now that the behaviour is as documented, but it's easily misread. I did see the text "If detect_bom is given, the method "header" will be invoked ..." but I misread "header" as "headers", which I was already using anyway, so I thought I had that covered.

I still find it confusing that "headers" defaults to "auto" (not "lc"), but "header"'s "munge_column_names" option defaults to "lc": By specifying "headers => auto" I was expecting NOT to get lowercased column names, but this appears to be (silently) overridden by the default behaviour of "header".

Jun 05 '25 10:06 steve-m-hay

Thanks for this valuable feedback!

I won't change the current defaults, but I am likely to update/clarify the documentation.

lc was taken as the default to match what I use with DBI/DBD's (NAME_lc). As I eat my own dogfood on a daily basis, many features are born out of daily needs.

Jun 05 '25 14:06 Tux

I have to find out why the bugtracker link is wrong. Likely I have it defined in too many places and it picks the wrong one.

Jun 05 '25 14:06 Tux

Fixed and pushed. Will be correct in the next release.

Jun 05 '25 19:06 Tux

@steve-m-hay does 2ed390c74c1990eb50d9e11ea5b08b3c5778dec4 address your issue?

Jun 05 '25 19:06 Tux

Unfortunately, I'm not sure that it does. Maybe I should have opened a new ticket in GitHub for this after all...!

You say that the "headers" attribute can be used to overrule the default value ("lc") which gets set by virtue of including the "detect_bom" attribute, but this is not entirely correct; it certainly doesn't achieve what I want. I was already specifying the "headers" attribute, passing it the value "auto". I expected the effect of that would be simply to use the first line of the source as the column names without lowercasing them or uppercasing them (since I chose "auto" rather than "lc" or "uc"), but I still got lowercase anyway, so this "headers => 'auto'" setting clearly doesn't overrule the default "lc" value. It's true that passing "headers => 'uc'" DOES overrule the default "lc" value (column names are now uppercased), but there is no value to give to "headers" that achieves the desired effect of not lowercasing or uppercasing the column names. It's actually the "munge_column_names" attribute that is needed (set to "none") to undo the lowercasing that silently kicked in.
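
A minimal sketch of that workaround, not a confirmed fix. It assumes, as described above, that setting "munge" (the short form of "munge_column_names", as used later in this comment) to "none" is what keeps the column names in their original case when "detect_bom" is also given. The variable names are only illustrative:

```perl
use strict;
use warnings;
use Data::Dumper qw( Dumper );
use Text::CSV_XS qw( csv );

# Sample data with a leading BOM, encoded to UTF-8 bytes
my $csv = "\x{FEFF}Foo,Bar\none,two\n";
utf8::encode ($csv);

# headers => "auto" asks for an array of hashes keyed on the first row;
# munge => "none" is the assumed way to stop the keys being lowercased
my $rows = csv (
    in         => \$csv,
    detect_bom => 1,
    headers    => "auto",
    munge      => "none",
    );
print Dumper ($rows);
```

If the assumption holds, the hash keys should come back as "Foo" and "Bar" rather than "foo" and "bar".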

Thinking more about this odd side-effect of including "detect_bom" also raises the question of how to overrule other defaults imposed by "header" getting invoked. We have addressed above how to overrule the default lowercasing of the column names in the array of hashes, but suppose I didn't want an array of hashes at all? If I had not been specifying "headers => 'auto'" anyway then quite apart from the sudden lowercasing of the first row, I would have been even more surprised to find my array of arrays suddenly becoming an array of hashes:

use Data::Dumper qw( Dumper );
use Text::CSV_XS qw( csv );

my $csv = "\x{FEFF}Foo,Bar\none,two\n";
utf8::encode ($csv);
print Dumper (csv (in => \$csv ));
print Dumper (csv (in => \$csv, detect_bom => 1));

outputs:

$VAR1 = [
          [
            "\x{feff}Foo",
            'Bar'
          ],
          [
            'one',
            'two'
          ]
        ];
$VAR1 = [
          {
            'foo' => 'one',
            'bar' => 'two'
          }
        ];

How would I overrule that? It appears that specifying "headers => 'skip'" (with "munge => 'none'") restores the desired behaviour, although that's entirely unexpected for me:

print Dumper (csv (in => \$csv, detect_bom => 1, headers => "skip", munge => "none"));

yields

$VAR1 = [
          [
            'Foo',
            'Bar'
          ],
          [
            'one',
            'two'
          ]
        ];

This does not sound at all like what "skip" is supposed to do: The documentation says "When skip is used, the header will not be included in the output" but the header is still present! Also, the documentation for "skip" says that it is "invalid/ignored in combinations with detect_bom", but it clearly isn't being ignored!

It gets even odder the more I poke at this. If I JUST specify "headers => 'skip'" then I get the expected behaviour of the header being skipped, but if I include "munge => 'none'" as well then the second row (the data) gets skipped instead! (I realize it's an irrelevant attribute when headers are being skipped anyway, but I needed it above to get the result I wanted!):

print Dumper (csv (in => \$csv, headers => "skip" ));
print Dumper (csv (in => \$csv, headers => "skip", munge => "none"));

yields:

$VAR1 = [
          [
            'one',
            'two'
          ]
        ];
$VAR1 = [
          [
            'Foo',
            'Bar'
          ]
        ];

I will stop now since it's very late. Apologies if I'm still missing anything in the documentation (I'm new to this module as of a day or two ago) and/or for seemingly opening a can of worms here!

Let me know if you would rather have separate issues to track the various problems raised above.

Jun 05 '25 23:06 steve-m-hay

^^^ This is still on my radar. Be patient though.

Jul 27 '25 10:07 Tux