FastCSV Empty fields at the end are omitted

When parsing a CSV, all empty fields at the end of a row are omitted, e.g. for 123;asd;;;; (the field separator in my case is '\t') the csvRow will only have 2 elements. The number of fields should be constant, especially since in those cases it's hard to know whether the CSV is corrupt or whether the empty data has been cut off.

Feb 23 '24 21:02 kasim-ba

Please provide the obligatory test case to prove that.

Feb 23 '24 21:02 osiegmar

Hello, recreating the behavior is quite simple. I created the following JUnit test, which shows that the first row misses the last three elements.

	@Test
	void testCsvReaderOmitsEmptyElements() {
		var data = """
one	two	three			
one two three four five	six
one						
one two								
				""";
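		try (CsvReader reader = CsvReader.builder().fieldSeparator('\t').build(data)) {
			reader.forEach(row -> assertThat(row.getFieldCount()).isEqualTo(6));
		} catch (IOException e) {
			// Do not swallow I/O failures silently; let the test fail instead.
			throw new java.io.UncheckedIOException(e);
		}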
	}

Edit: I also created a second test that shows that the problem is connected to the tab character, since the following test works perfectly fine:

	@Test
	void testCsvReaderFieldCountCorrect() {
		var data = """
one;two;three;;;
one;two;three;four;five;six
one;;;;;
one;two;;;;
				""";
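		try (CsvReader reader = CsvReader.builder().fieldSeparator(';').build(data)) {
			reader.forEach(row -> assertThat(row.getFieldCount()).isEqualTo(6));
		} catch (IOException e) {
			// Do not swallow I/O failures silently; let the test fail instead.
			throw new java.io.UncheckedIOException(e);
		}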
	}

Feb 26 '24 07:02 kasim-ba

Your IDE is probably simply trimming whitespace at the end of lines. Try with explicit \t.

Feb 26 '24 08:02 osiegmar
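
A minimal sketch of why the explicit \t suggestion above helps, and why the semicolon variant of the test was unaffected. It does not use FastCSV at all: the class name TrailingTabDemo and the helper fieldCount are made up for illustration, and fields are counted with a plain split. It relies on String.stripIndent() (Java 15+), which applies the same whitespace stripping that Java text blocks apply to raw content, including trailing whitespace on every line. Escape sequences such as \t inside a text block are interpreted only after that stripping, so they survive. Editor-side trimming on save would have the same effect on raw tabs.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of FastCSV.
class TrailingTabDemo {

    // Counts fields with a naive split. The -1 limit keeps trailing empty fields.
    static int fieldCount(String line, char separator) {
        return line.split(Pattern.quote(String.valueOf(separator)), -1).length;
    }

    public static void main(String[] args) {
        // Built from escapes, so the three trailing tabs are really there.
        String intact = "one\ttwo\tthree\t\t\t";

        // stripIndent() removes incidental whitespace at the start and end of each line,
        // which is what happens to raw trailing tabs written inside a text block.
        String stripped = intact.stripIndent();

        // Trailing semicolons are not whitespace, so nothing is removed here.
        String semicolons = "one;two;three;;;".stripIndent();

        System.out.println(fieldCount(intact, '\t'));     // 6
        System.out.println(fieldCount(stripped, '\t'));   // 3
        System.out.println(fieldCount(semicolons, ';'));  // 6
    }
}
```

The drop from six fields to three matches the report that the first row is missing its last three elements, which suggests the tabs were already gone before the parser saw the data.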

Yes, that seems to be the case, at least to some extent. I'm currently working with STS 4.21, and after explicitly adding \t the test passed. Putting the data into a CSV file doesn't help; the same problem occurs. What did work is a single-line string. Now I'm wondering whether it's an IDE problem or whether the compiler is doing this. Do you use a different IDE (e.g. IntelliJ)?

Feb 26 '24 09:02 kasim-ba

IDEs also remove trailing whitespace from files, if configured. It's definitely nothing the compiler does.

Maybe this helps: https://stackoverflow.com/a/2618521

Feb 26 '24 11:02 osiegmar

Well, adding those settings did not really help with my problem, but there is definitely some problem with Eclipse. The other thing I don't understand is why this problem persists with a file: using a CSV file gets me the same error.

Feb 26 '24 14:02 kasim-ba

If the problem persists with a file, the file has its lines trimmed. I cannot reproduce it; this works just fine:

import static org.assertj.core.api.Assertions.assertThat;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.siegmar.fastcsv.reader.CsvReader;
import testutil.CsvRecordAssert;

public class TabTest {

    @TempDir
    private Path tempDir;

    @Test
    void test() throws IOException {
        var data = """
        one\ttwo\tthree\t\t\t
        one\ttwo\tthree\tfour\tfive\tsix
        one\t\t\t\t\t
        one\ttwo\t\t\t\t
        """;

        var tempFile = tempDir.resolve("foo.csv");

        Files.writeString(tempFile, data);

        var stream = CsvReader.builder().fieldSeparator('\t').ofCsvRecord(tempFile).stream();

        assertThat(stream).satisfiesExactly(
            rec -> CsvRecordAssert.assertThat(rec).fields()
                .hasSize(6).containsExactly("one", "two", "three", "", "", ""),
            rec -> CsvRecordAssert.assertThat(rec).fields()
                .hasSize(6).containsExactly("one", "two", "three", "four", "five", "six"),
            rec -> CsvRecordAssert.assertThat(rec).fields()
                .hasSize(6).containsExactly("one", "", "", "", "", ""),
            rec -> CsvRecordAssert.assertThat(rec).fields()
                .hasSize(6).containsExactly("one", "two", "", "", "", "")
        );
    }

}

Feb 26 '24 17:02 osiegmar
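
One way to check whether a CSV file on disk still contains its trailing tabs, independent of FastCSV and of what an editor displays, is to read it back and count them. The following is only a sketch: the class TrailingTabCheck and its helpers are hypothetical names, the field count is a naive split that ignores CSV quoting, and UTF-8 is assumed as the file encoding. If no path is passed, it writes a small sample file with explicit \t escapes and inspects that instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical diagnostic class; not part of FastCSV.
class TrailingTabCheck {

    // Number of consecutive tab characters at the end of a line.
    static int trailingTabs(String line) {
        int count = 0;
        for (int i = line.length() - 1; i >= 0 && line.charAt(i) == '\t'; i--) {
            count++;
        }
        return count;
    }

    // Naive field count: splits on tabs and keeps trailing empty fields (quoting is ignored).
    static int naiveFieldCount(String line) {
        return line.split("\t", -1).length;
    }

    // Prints one summary line per line of the file.
    static void report(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            System.out.printf("line %d: %d fields, %d trailing tabs%n",
                    i + 1, naiveFieldCount(line), trailingTabs(line));
        }
    }

    public static void main(String[] args) throws IOException {
        Path file;
        if (args.length > 0) {
            // Inspect the file given on the command line.
            file = Path.of(args[0]);
        } else {
            // No argument: create a sample whose trailing tabs are guaranteed by escapes.
            file = Files.createTempFile("tabs", ".csv");
            Files.writeString(file, "one\ttwo\tthree\t\t\t\none\t\t\t\t\t\n");
        }
        report(file);
    }
}
```

If the report shows fewer trailing tabs than expected for a file saved from the IDE, the whitespace was most likely removed when the file was written, not by the parser.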

Strange Eclipse. Anyway, thanks for your support.

Feb 27 '24 07:02 kasim-ba