
End of input stream

Open coltnz opened this issue 3 years ago • 5 comments

When doing an execute on an insert prepared statement, I get what feels like a compatibility error:

java.sql.BatchUpdateException: Reached end of input stream after reading 23 of 32 bytes, server ClickHouseNode(addr=http:localhost:8123, db=ct_voz)@-1315648888
	at com.clickhouse.jdbc.SqlExceptionUtils.batchUpdateError(SqlExceptionUtils.java:90)
	at com.clickhouse.jdbc.internal.SqlBasedPreparedStatement.executeAny(SqlBasedPreparedStatement.java:194)
	at com.clickhouse.jdbc.internal.SqlBasedPreparedStatement.execute(SqlBasedPreparedStatement.java:395)

I'm using 0.3.2-patch7 and ClickHouse server version 21.8.15 revision 54449. The insert is a parquet one and a bit tricky to reduce right now, but the error is in processing the response, not the request.

If I go into the debugger, I can catch the original exception:

Caused by: com.clickhouse.client.ClickHouseException: Reached end of input stream after reading 23 of 32 bytes, server ClickHouseNode(addr=http:localhost:8123, db=ct_voz)@-1315648888
	at com.clickhouse.client.ClickHouseException.of(ClickHouseException.java:113)
	at com.clickhouse.client.http.ClickHouseHttpClient.execute(ClickHouseHttpClient.java:116)
	at com.clickhouse.client.ClickHouseRequest.execute(ClickHouseRequest.java:1385)
	... 58 more
Caused by: java.io.IOException: Reached end of input stream after reading 23 of 32 bytes
	at com.clickhouse.client.ClickHouseInputStream.readBytes(ClickHouseInputStream.java:683)
	at com.clickhouse.client.data.ClickHouseLZ4InputStream.read(ClickHouseLZ4InputStream.java:207)
	at com.clickhouse.client.ClickHouseInputStream.readString(ClickHouseInputStream.java:724)
	at com.clickhouse.client.ClickHouseInputStream.readUnicodeString(ClickHouseInputStream.java:761)
	at com.clickhouse.client.data.ClickHouseRowBinaryProcessor.readColumns(ClickHouseRowBinaryProcessor.java:573)
	at com.clickhouse.client.ClickHouseDataProcessor.<init>(ClickHouseDataProcessor.java:96)
	at com.clickhouse.client.data.ClickHouseRowBinaryProcessor.<init>(ClickHouseRowBinaryProcessor.java:587)
	at com.clickhouse.client.ClickHouseDataStreamFactory.getProcessor(ClickHouseDataStreamFactory.java:47)
	at com.clickhouse.client.data.ClickHouseStreamResponse.<init>(ClickHouseStreamResponse.java:77)
	at com.clickhouse.client.data.ClickHouseStreamResponse.of(ClickHouseStreamResponse.java:54)
	at com.clickhouse.client.http.ClickHouseHttpClient.postRequest(ClickHouseHttpClient.java:90)
	at com.clickhouse.client.http.ClickHouseHttpClient.execute(ClickHouseHttpClient.java:114)

coltnz avatar Apr 07 '22 11:04 coltnz

Hi @coltnz, did you use the unwrap method to get a ClickHouseRequest and then change the format to Parquet for the insertion? If that's the case, you'll have to deal with the response stream (ClickHouseResponse.getInputStream) manually. The driver uses RowBinary by default and has limited support for text formats like TSV.

Judging from the error, it seems the driver tried to deserialize the response as RowBinary when it was in fact Parquet (as specified in the request).

zhicwu avatar Apr 07 '22 11:04 zhicwu

No, I didn't do anything fancy, just a parquet insert. My original issue was a bad date format, which I uncovered by repeating the query with the native client. Similarly, when I ran out of disk space I got a read-bytes error, while the underlying issue showed up in the native client.

If I look at ClickHouseRowBinaryProcessor, I wonder if it isn't missing some server-error checking in getColumns?

I see in https://github.com/ClickHouse/clickhouse-jdbc/blob/master/clickhouse-client/src/main/java/com/clickhouse/client/data/ClickHouseTabSeparatedProcessor.java#L132:

        String header = headerFragment.asString(true);
        if (header.startsWith("Code: ") && !header.contains("\t")) {
            input.close();
            throw new IllegalArgumentException("ClickHouse error: " + header);
        }

Whereas the RowBinary processor goes straight into reading column names from the input:

        int size = 0;
        try {
            size = input.readVarInt();
        } catch (EOFException e) {
            // no result returned
            return Collections.emptyList();
        }

        String[] names = new String[ClickHouseChecker.between(size, "size", 0, Integer.MAX_VALUE)];

I can actually see the server error in names[0] in the debugger.
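
For context, a RowBinary response starts with a VarInt column count. A minimal LEB128-style decoder (illustrative only, not the driver's actual readVarInt) shows why a text error response gets silently misread: the first byte 'C' of "Code: " (0x43) decodes as a 67-column header, and the following error text is then consumed as column names.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Illustrative LEB128-style VarInt decoder, similar in spirit to
// ClickHouseInputStream.readVarInt (not the driver's actual code).
public final class VarIntDemo {
    static long readVarInt(InputStream in) throws IOException {
        long value = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            int b = in.read();
            if (b < 0) {
                throw new EOFException("end of stream while reading VarInt");
            }
            value |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) { // high bit clear: last byte of the VarInt
                return value;
            }
        }
        throw new IOException("VarInt too long");
    }

    public static void main(String[] args) throws IOException {
        // A text error response misparsed as a RowBinary header:
        // 'C' (0x43) is read as the column count.
        InputStream error = new ByteArrayInputStream(
                "Code: 62. DB::Exception: ...".getBytes("US-ASCII"));
        System.out.println(readVarInt(error)); // prints 67
    }
}
```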

Presumably, that would require a peek ahead on the input, as the TSV code does with StreamSplitter.java.

coltnz avatar Apr 11 '22 00:04 coltnz

@coltnz, could you share a code snippet and the query for reproducing the issue? I'm getting closer to releasing 0.3.2-patch8, so maybe I can fix the issue if it still exists in the new version.

zhicwu avatar Apr 11 '22 00:04 zhicwu

I can actually see the server error in names[0] in the debugger.

What's the exact error? It sounds like the server responded with 200 first, and then something happened that resulted in a response containing only an error message. A similar situation was discussed in #650; it's a limitation of the current HTTP interface - switching to RowBinary format actually mitigated the impact, despite the confusing exception :) I think the gRPC and TCP interfaces are probably more reliable.

zhicwu avatar Apr 17 '22 02:04 zhicwu

Unfortunately, I failed to build the project to make a clean test case (gRPC issues), and my original case was tough to isolate, being an S3 parquet read.

The case in #650 seems somewhat similar, though my error occurred up front in the header. As I noted above, this is handled in the TSV case by:

    if (header.startsWith("Code: ") && !header.contains("\t")) {
        input.close();
        throw new IllegalArgumentException("ClickHouse error: " + header);
    }

and I'm sure something similar could be achieved for the RowBinary interface with a peek-ahead buffer.
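
For illustration, such a peek-ahead check might look something like this (a hypothetical sketch using java.io.PushbackInputStream, not the driver's actual StreamSplitter):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: peek at the first bytes of the response before
// RowBinary parsing starts, and fail fast if the server wrote a text
// error message ("Code: ...") instead of binary data.
public final class ErrorPeek {
    private static final byte[] MARKER = "Code: ".getBytes(StandardCharsets.US_ASCII);

    // Returns a stream with the peeked bytes pushed back when no error
    // marker is found, so normal RowBinary parsing can proceed untouched.
    public static InputStream checkForServerError(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, MARKER.length);
        byte[] head = new byte[MARKER.length];
        int n = 0;
        while (n < head.length) {
            int r = pb.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream shorter than the marker
            }
            n += r;
        }
        if (n == MARKER.length && Arrays.equals(head, MARKER)) {
            // Drain the rest so the full server message can be reported.
            String message = new String(head, StandardCharsets.UTF_8)
                    + new String(pb.readAllBytes(), StandardCharsets.UTF_8);
            throw new IOException("ClickHouse error: " + message);
        }
        pb.unread(head, 0, n); // not an error: push the bytes back
        return pb;
    }
}
```

The same idea as the TSV check quoted above, just applied before the column count is read rather than after splitting on newlines.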

coltnz avatar Apr 18 '22 09:04 coltnz