Empty fields at the end are omitted
When parsing a CSV, all empty fields at the end of a row are omitted, e.g. for 123;asd;;;; (the field separator in my case is '\t') the csvRow will only have 2 elements. The number of fields should be constant, especially since in those cases it's hard to know whether the CSV is corrupt or the empty data has been cut off.
Please provide the obligatory test case to prove that.
Hello, recreating the behavior is quite simple. I created the following JUnit test, which shows that the first row misses the last three elements.
@Test
void testCsvReaderOmitsEmptyElements() {
    var data = """
        one two three
        one two three four five six
        one
        one two
        """;
    try (CsvReader reader = CsvReader.builder().fieldSeparator('\t').build(data)) {
        reader.forEach(row -> assertThat(row.getFieldCount()).isEqualTo(6));
    } catch (IOException e) {
        // Fail the test instead of silently swallowing read errors
        throw new UncheckedIOException(e);
    }
}
Edit: I also created a second test that shows that the problem is connected to the tab character, since the following test works perfectly fine:
@Test
void testCsvReaderFieldCountCorrect() {
    var data = """
        one;two;three;;;
        one;two;three;four;five;six
        one;;;;;
        one;two;;;;
        """;
    try (CsvReader reader = CsvReader.builder().fieldSeparator(';').build(data)) {
        reader.forEach(row -> assertThat(row.getFieldCount()).isEqualTo(6));
    } catch (IOException e) {
        // Fail the test instead of silently swallowing read errors
        throw new UncheckedIOException(e);
    }
}
Your IDE is probably simply trimming trailing whitespace at the end of lines. Try with an explicit \t.
Yes, that seems to be the case, at least to some extent. I'm currently working with STS 4.21, and after explicitly adding \t the test passed.
Putting the data into a CSV file doesn't help. The same problems occur.
What did work was a single-line string.
Now I'm wondering whether it's an IDE problem or whether the compiler is doing this.
Do you use a different IDE (e.g. IntelliJ)?
IDEs also remove trailing whitespace from files, if configured. For a file on disk, that is definitely nothing the compiler does.
Maybe this helps: https://stackoverflow.com/a/2618521
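A side note on the compiler question, as far as text blocks are concerned: under the Java rules for text blocks, trailing whitespace on every line is stripped at compile time, while escape sequences such as \t are interpreted only afterwards. That would explain why an explicit \t works inside a text block regardless of IDE settings. The sketch below is not FastCSV-specific; the class and method names are made up for illustration, and it uses String.stripIndent() (Java 15+), which applies the same whitespace stripping at runtime, as a stand-in for a text block containing literal trailing tabs.

```java
public class TextBlockTrailingTabs {

    // Counts tab-separated fields on the first line.
    // The limit of -1 keeps trailing empty fields.
    static int fieldsInFirstLine(String data) {
        String firstLine = data.lines().findFirst().orElse("");
        return firstLine.split("\t", -1).length;
    }

    public static void main(String[] args) {
        // Stand-in for a text block with literal trailing tabs:
        // stripIndent() removes trailing whitespace the same way text blocks do.
        String literalTabs = "one\ttwo\tthree\t\t\t\n".stripIndent();

        // Text block with escaped tabs: escapes are processed after stripping,
        // so the trailing tabs survive.
        String escapedTabs = """
            one\ttwo\tthree\t\t\t
            """;

        System.out.println(fieldsInFirstLine(literalTabs)); // prints 3
        System.out.println(fieldsInFirstLine(escapedTabs)); // prints 6
    }
}
```

This only covers text blocks in source code. It does not explain trailing tabs going missing from a separate CSV file, which would still point to an editor setting.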
Well, adding those settings did not really help with my problem, but there is definitely some problem with Eclipse. The other thing I don't understand is why the problem persists with a file. Using a CSV file gets me the same error.
If the problem persists with a file, the file has its lines trimmed. I cannot reproduce - this just works fine:
import static org.assertj.core.api.Assertions.assertThat;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.io.TempDir;

import de.siegmar.fastcsv.reader.CsvReader;
import testutil.CsvRecordAssert;

public class TabTest {

    @TempDir
    private Path tempDir;

    @Test
    void test() throws IOException {
        var data = """
            one\ttwo\tthree\t\t\t
            one\ttwo\tthree\tfour\tfive\tsix
            one\t\t\t\t\t
            one\ttwo\t\t\t\t
            """;

        var tempFile = tempDir.resolve("foo.csv");
        Files.writeString(tempFile, data);

        var stream = CsvReader.builder().fieldSeparator('\t').ofCsvRecord(tempFile).stream();
        assertThat(stream).satisfiesExactly(
            rec -> CsvRecordAssert.assertThat(rec).fields()
                .hasSize(6).containsExactly("one", "two", "three", "", "", ""),
            rec -> CsvRecordAssert.assertThat(rec).fields()
                .hasSize(6).containsExactly("one", "two", "three", "four", "five", "six"),
            rec -> CsvRecordAssert.assertThat(rec).fields()
                .hasSize(6).containsExactly("one", "", "", "", "", "")
            ,
            rec -> CsvRecordAssert.assertThat(rec).fields()
                .hasSize(6).containsExactly("one", "two", "", "", "", "")
        );
    }
}
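To check whether a file on disk still has its trailing tabs before it ever reaches the CSV parser, a small stand-alone check along these lines might help. It is only a sketch using the JDK, not part of FastCSV; the class name, method names and sample data are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class TrailingTabCheck {

    // Number of tab-separated fields per line.
    // The limit of -1 keeps trailing empty fields.
    static List<Integer> fieldCountsOfLines(List<String> lines) {
        return lines.stream()
            .map(line -> line.split("\t", -1).length)
            .collect(Collectors.toList());
    }

    // Same check for a file on disk.
    static List<Integer> fieldCountsOfFile(Path file) throws IOException {
        return fieldCountsOfLines(Files.readAllLines(file));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample: the first line keeps its trailing tabs,
        // the second line has had them trimmed.
        Path file = Files.createTempFile("tabs", ".csv");
        Files.writeString(file, "one\ttwo\tthree\t\t\t\none\ttwo\tthree\n");

        System.out.println(fieldCountsOfFile(file)); // prints [6, 3]

        Files.delete(file);
    }
}
```

If the counts differ from line to line for a file that should have a fixed number of columns, the trailing tabs were most likely removed when the file was saved.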
Strange Eclipse. Anyway, thanks for your support.