ArrowIPCStreamIterator.SchemaBytes() returns zero bytes after a cold start

Open datbth opened this issue 2 months ago • 1 comments

Context

Given a Databricks SQL Warehouse that is currently stopped, and this code:

	// Create the context first: Connect below already needs it.
	ctx := context.Background()

	driverCntor, driverErr := dbsql.NewConnector(options...)
	if driverErr != nil {
		return driverErr
	}

	conn, connErr := driverCntor.Connect(ctx)
	if connErr != nil {
		return connErr
	}

	rows, queryErr := conn.(driver.QueryerContext).QueryContext(ctx, "SELECT 1", []driver.NamedValue{})
	if queryErr != nil {
		return queryErr
	}
	defer rows.Close()

	ipcStreamIt, ipcStreamErr := rows.(dbsqlrows.Rows).GetArrowIPCStreams(ctx)
	if ipcStreamErr != nil {
		return ipcStreamErr
	}
	defer ipcStreamIt.Close()

	schemaBytes, schemaErr := ipcStreamIt.SchemaBytes()
	if schemaErr != nil {
		return schemaErr
	}

	if len(schemaBytes) == 0 {
		return fmt.Errorf("no schema bytes")
	}

Expected result

No error

Actual result

Error: no schema bytes

Notes

This only happens when the Databricks SQL Warehouse is stopped.
When the Databricks SQL Warehouse is active, schemaBytes contains correct data.

Dec 24 '25 03:12 datbth

This works when the Databricks SQL Warehouse is either stopped or active:

	streamReader, streamReaderErr := ipcStreamIt.Next()
	if streamReaderErr != nil {
		return streamReaderErr
	}
	arrowReader, arrowReaderErr := ipc.NewReader(streamReader)
	if arrowReaderErr != nil {
		return arrowReaderErr
	}
	defer arrowReader.Release()

	arrowSchema = arrowReader.Schema()

However:

It complicates usage when the schema must be known before iterating over the Arrow RecordBatches.
It does not work when the result has 0 rows (cases 2 and 4 below).

Summary

Case	Warehouse	Result	ArrowIPCStreamIterator.SchemaBytes()	ArrowIPCStreamIterator.Next()
1	Stopped (cold start)	>= 1 row	❌ 0 bytes	✅ not EOF -> can use ipc.NewReader().Schema()
2	Stopped (cold start)	0 rows	❌ 0 bytes	❌ EOF
3	Active	>= 1 row	✅ valid bytes -> can use ipc.NewReader().Schema()	✅ not EOF -> can use ipc.NewReader().Schema()
4	Active	0 rows	✅ valid bytes -> can use ipc.NewReader().Schema()	❌ EOF

I could combine ArrowIPCStreamIterator.SchemaBytes() and ArrowIPCStreamIterator.Next() to work around cases 1 and 4, but I have not found any workaround for case 2.

Dec 24 '25 04:12 datbth