Add logging infra to PineconeClient and Index.
The idea is that each operation produces a pair of log entries: the start is logged at Trace level (and can be ignored in most cases), and then we log either success at Debug level or failure at Error level. Only basic information, such as the operation type and the index name, is logged - we want to avoid printing sensitive data in the logs for security reasons.
Also wired up the logger factory to GrpcChannel and HttpClient for more detailed transport-level logging.
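The paired-entry pattern can be sketched roughly like this (names are illustrative, not the actual Pinecone.NET code): Trace on start, Debug on success, Error on failure, with only the operation type and index name captured.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;

// Hypothetical sketch of the start/success/failure logging pattern described above.
public sealed class LoggedOperations
{
    private readonly ILogger _logger;

    public LoggedOperations(ILoggerFactory? factory = null) =>
        _logger = factory?.CreateLogger<LoggedOperations>() ?? NullLogger.Instance;

    public async Task<T> RunAsync<T>(string operation, string indexName, Func<Task<T>> action)
    {
        // Start entries are Trace so they are filtered out at typical log levels.
        _logger.LogTrace("{Operation} started for index {Index}", operation, indexName);
        try
        {
            var result = await action();
            _logger.LogDebug("{Operation} completed for index {Index}", operation, indexName);
            return result;
        }
        catch (Exception e)
        {
            // No payload or metadata is logged - only the operation and index name.
            _logger.LogError(e, "{Operation} failed for index {Index}", operation, indexName);
            throw;
        }
    }
}
```

Exceptions are rethrown after logging, so callers still observe the original failure.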
Fixes #118
Apparently I'm down with COVID or similar and will get to this in a couple days. Thanks!
Edit: it turned out to be way worse, but I'm in working condition now and will review this soon.
@neon-sunset get well soon! I'm planning to take a look at ReadOnlyMemory<float> improvement as well and can do that in parallel to this work, so no rush.
@maumar I have investigated the issue with credential leakage as discussed and could not reproduce it with the gRPC stack - it is generally well-behaved (if, as you noted, chatty) and does not output sensitive details about the payload or metadata.
Regarding explicit logging within the client and transports/index - I'm slightly on the fence about it. Looked at without context, the logging coming purely from gRPC and HttpClient themselves seems insufficient. At the same time, I have quite a bit of experience doing incident response and L3 support, where these extra "Operation Started"/"Operation Completed/Failed" pairs have only ever added visual noise when reading through logs and increased the log storage bill. Most of the time you only care about the innermost failure message that contains the exception details, and since we bubble exceptions up to the caller, they will eventually be observed and logged by the users who care about this.
For now, I decided to try out a sample application where only the "interesting" parts are logged, which is just one thing at the moment - parallel batch operations.
While stress-testing these I discovered that Pinecone has quite stringent payload size limits, so I want to flesh this feature out a bit more before doing a release - there is a 1,000-ID limit on Delete requests, so Delete must support parallel batching too. This is not just a nice-to-have but an actual issue users will run into - and likely already have - when working with large numbers of vectors without batching.
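A minimal sketch of what parallel batching for Delete could look like, assuming the 1,000-ID per-request limit above; `deleteBatchAsync` is a hypothetical placeholder for the actual transport call:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Sketch: split the IDs into <= 1000-element batches and delete them concurrently.
public static class DeleteBatching
{
    public const int MaxIdsPerRequest = 1000;

    public static Task DeleteAllAsync(
        IEnumerable<string> ids,
        Func<string[], Task> deleteBatchAsync)
    {
        // Enumerable.Chunk (.NET 6+) yields arrays of at most MaxIdsPerRequest elements.
        var batches = ids.Chunk(MaxIdsPerRequest);
        return Task.WhenAll(batches.Select(deleteBatchAsync));
    }
}
```

A real implementation would likely also cap the degree of parallelism rather than firing all batches at once.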
Regarding the official Fern-generated Pinecone SDK (https://github.com/pinecone-io/pinecone-dotnet-client, https://www.nuget.org/packages/Pinecone.Client) - it was indeed released. However, for now it has a few issues which I believe make it worth continuing to develop Pinecone.NET:
- blocking operations where users do not expect them: https://github.com/pinecone-io/pinecone-dotnet-client/blob/main/src/Pinecone/PineconeClient.cs#L28
- array allocations (up to 12 KiB per single array at 3072 dimensions): https://github.com/pinecone-io/pinecone-dotnet-client/blob/main/src/Pinecone/Types/Vector.cs#L67 - avoiding these requires a more opinionated, if riskier, approach
- only a gRPC transport is available. Being able to use REST instead is still relevant to environments that restrict which HTTP protocols may be used, and also to applications that want to minimize AOT binary size, since REST reuses the code already referenced by PineconeClient's base operations, reducing the size impact by 2-2.5 MiB. I have also observed that gRPC has lower throughput than REST from my location to the US PoP when doing parallel fetches/upserts of 1-5K 1536-dimensional vectors - and despite using gRPC, Pinecone.Client is currently not AOT-compatible anyway
- building the URL on each request by accumulating components with string concatenation: https://github.com/pinecone-io/pinecone-dotnet-client/blob/664894fa582acde97924ced857a129dd8036099d/src/Pinecone/Core/RawClient.cs#L118-L148
- there may be others, but I did not review the code further
It is, however, more customizable w.r.t. timeouts, retry configuration, and other gRPC settings, but this can be addressed if the current defaults pose issues for end users.
I think this can be merged in its current form. I have also added redaction options to the logging message handler, so the API key is now filtered out regardless of the logging level. In any case, thank you for pointing this out earlier and for contributing this PR.
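For illustration only (this is not the handler added in the PR), the redaction idea boils down to replacing the `Api-Key` header value before any request details reach the log output:

```csharp
using System;
using System.Net.Http;
using System.Text;

// Hypothetical sketch: format request headers for logging with the API key masked,
// so the key never appears in logs at any log level.
public static class HeaderRedaction
{
    public static string Describe(HttpRequestMessage request)
    {
        var sb = new StringBuilder();
        sb.AppendLine($"{request.Method} {request.RequestUri}");
        foreach (var header in request.Headers)
        {
            // Mask only the sensitive header; pass other values through unchanged.
            var value = header.Key.Equals("Api-Key", StringComparison.OrdinalIgnoreCase)
                ? "***"
                : string.Join(", ", header.Value);
            sb.AppendLine($"{header.Key}: {value}");
        }
        return sb.ToString();
    }
}
```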
Will work on the remaining items that I think should make it into 3.0.
Also want to investigate the gRPC performance issues some more - it seems that Pinecone's reverse proxy is poorly configured and allows very few concurrent streams over a single H2 connection. I have partially mitigated this by adding support for client-side load balancing, but it's not enough. I also tried a toy GrpcChannel pool, but it did not measurably improve performance. It might still be a grpc-dotnet issue, but given how performance-sensitive grpc-dotnet is, I'm less inclined to think it is at fault, and addressing this by writing custom load-balancer and subchannel logic seems like overkill for a library like this.
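One partial mitigation along these lines, sketched under the assumption of a default grpc-dotnet setup (the endpoint URL is illustrative, not the library's actual channel configuration), is letting `SocketsHttpHandler` open additional HTTP/2 connections once the server's concurrent-stream limit is hit:

```csharp
using System;
using System.Net.Http;
using Grpc.Net.Client;

// Sketch: allow extra H2 connections instead of queueing requests when the
// server's MAX_CONCURRENT_STREAMS limit is reached on one connection.
var channel = GrpcChannel.ForAddress("https://example-index.svc.pinecone.io", new GrpcChannelOptions
{
    HttpHandler = new SocketsHttpHandler
    {
        EnableMultipleHttp2Connections = true,
        PooledConnectionIdleTimeout = TimeSpan.FromMinutes(5)
    }
});
```

This helps when a single connection saturates, but it does not fix the underlying low per-connection stream limit on the server side.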
In comparison, RestTransport ends up being massively faster for big upsert/fetch/delete operations, because it sidesteps the issue by still using HTTP/1.1 - at least it does not upgrade under a more or less default HttpClient configuration. Interestingly, as of .NET 9 RC.1, using RestTransport with NativeAOT also consumes less memory at load spikes and at idle, and of course there are binary size savings too, since it reuses the same System.Text.Json code for both the control and data planes. It might be a good idea to use the 3.0 breaking-change opportunity to make it the default. It might also be worth asking Pinecone about their HTTP/2 issue, as it reproduces with the REST client as well if you set the request version to 2 and the version policy to "version requested or higher".
@maumar Version 3.0.0, which includes this change, has been released: https://github.com/neon-sunset/Pinecone.NET/releases
This release also switches to ReadOnlyMemory<float> to stop forcing Semantic Kernel to allocate large arrays whenever constructing vectors for Pinecone. The mapping in SK will need to be updated to account for this, however. The new default transport is now REST; if you would like to know why, I have added notes explaining the motivation to the README: https://github.com/neon-sunset/Pinecone.NET?tab=readme-ov-file#rest-vs-grpc-transport
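The allocation win is simple to see: a caller that already holds a `float[]` (or a slice of a pooled buffer) can wrap it in a `ReadOnlyMemory<float>` without copying, instead of materializing a fresh array per vector.

```csharp
using System;

float[] embedding = new float[1536];

// Implicit wrap over the existing array - no copy, no new allocation.
ReadOnlyMemory<float> whole = embedding;

// Slicing is also just a view over the same buffer.
ReadOnlyMemory<float> half = embedding.AsMemory(0, 768);

Console.WriteLine($"{whole.Length}, {half.Length}");
```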
Thank you!
Update: I have looked at how connectors are implemented in Semantic Kernel and am drinking vodka.