
Using Arrow Flight Protocol for Querying: Missing documentation?

Open thinkORo opened this issue 1 year ago • 2 comments

Hi there,

With #769 the Arrow Flight Protocol was implemented, and it was published with release 1.2.

But I cannot find any (even minimal) documentation on how to connect to the new Arrow Flight endpoint.

What did I do?

  1. I set the environment variable P_FLIGHT_PORT to 8002
  2. I tried to connect to mymachine:8002 via:
  • Arrow Flight JDBC driver
  • adbc_driver_flightsql
  • pyarrow.flight

But in all tests I get more or less the same error:

adbc_driver_flightsql NotSupportedError: NOT_IMPLEMENTED: [FlightSQL] handshake is disabled in favour of direct authentication and authorization (Unimplemented; AuthenticateBasicToken)

pyarrow.flight ArrowNotImplementedError: Flight returned unimplemented error, with message: handshake is disabled in favour of direct authentication and authorization

Could you please provide an example or documentation on how to connect to the Arrow Flight endpoint, either via JDBC or via Python?

thinkORo avatar Jun 16 '24 07:06 thinkORo

Thanks @thinkORo, we're working on the documentation for this. We'll also release sample applications in JS and Python that use this API. They will be available soon.

Also, in the next release the Parseable console will use this API for querying, improving the UX.

nitisht avatar Jun 16 '24 07:06 nitisht

@nitisht Thanks for your quick reply. I'm looking forward to checking some examples or reading the documentation.

thinkORo avatar Jun 16 '24 11:06 thinkORo

@thinkORo below is the blog post where you can find the details for Arrow Flight in Parseable: https://www.parseable.com/blog/optimize-data-transfer-with-parseable

nikhilsinhaparseable avatar Jul 25 '24 06:07 nikhilsinhaparseable

@nikhilsinhaparseable Unfortunately that doesn't work for me. Is there any specific user setting required?

By executing "reader = client.do_get(flight.Ticket(ticket_data), options=call_options)" from your documentation I get the following error:

FlightUnauthorizedError: Flight returned unauthorized error, with message: User Does not have permission to access this. gRPC client debug context: UNKNOWN:Error received from peer ipv4:0.0.0.0:8002 {grpc_message:"User Does not have permission to access this", grpc_status:7, created_time:"2024-07-29T16:39:06.974898987+02:00"}. Client context: IOError: Server never sent a data message. Detail: Internal

I used my usual username and password and encoded them with base64.b64encode. Could you please provide some more information on whether any user setup (permissions) is required?

How do you convert your username and password to use it in this example?

thinkORo avatar Jul 29 '24 14:07 thinkORo

@thinkORo the reason is that the release containing this fix is not out yet; you can expect it by next week. Regarding the Authorization header, you need to send it as Basic <Base64 encoded string of username:password>
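
As a minimal sketch of that header format in Python: the helper name `basic_auth_header` and the `admin`/`admin` credentials are placeholders chosen for illustration, not part of Parseable's API.

```python
import base64


def basic_auth_header(username: str, password: str) -> tuple[bytes, bytes]:
    """Build an (name, value) pair for a Basic Authorization header."""
    # Join as "username:password" first, then Base64-encode the whole string.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return (b"authorization", b"Basic " + token)


# Placeholder credentials; replace with your own Parseable user.
header = basic_auth_header("admin", "admin")
print(header)
```

With pyarrow, a pair like this can presumably be passed as `flight.FlightCallOptions(headers=[header])`, which would then serve as the `call_options` in the `do_get` call quoted above.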

nikhilsinhaparseable avatar Jul 29 '24 14:07 nikhilsinhaparseable

Okay, I'll check that later. And yes, regarding the Authorization header: that's what I did as well. Thanks for your quick reply.

thinkORo avatar Jul 29 '24 14:07 thinkORo

@thinkORo meanwhile you can try the edge tag from our Docker images, e.g. parseable/parseable:edge. This image always points to the latest commit on the main branch.

nitisht avatar Jul 30 '24 07:07 nitisht

Hi @nikhilsinhaparseable, sorry, I didn't find the time to check the edge image earlier. But now I have, and it is working as expected. The request does not "feel" as if it is executed immediately (that could be down to the underlying S3 bucket in my case), but I receive the requested data and can work with it. Super. Many thanks for pointing out the edge image. I really appreciate it.

thinkORo avatar Aug 03 '24 13:08 thinkORo