Matthew Shipton
@Fokko - I would be interested in your take on my interpretation of spark-connect's suitability in #821. I have no experience with spark connect, but if the objective is to...
I proposed #821 and agree with the recommendation to split them into two separate sets of requirements, one for spark connect as a method to support SQL and one for...
Looks like this is a duplicate of https://github.com/Azure/azure-sdk-for-python/issues/27980
I'm not sure it is a duplicate. #27981 seems to refer to the handling of nested symlinks which the author would like to be uploaded, whereas this is a problem...
As another illustration of this issue, I receive a warning that my upload size is more than 100 MB when it is actually only 810 KB.
Looks like this is now resolved! Thanks all.
> @tswast Out of curiosity are there any performance concerns here? Exact count distinct is already expensive, but just curious if the overhead of string encoding would show up here...
> Thanks for really digging in here, the analysis is much appreciated. I'm inclined to merge this as is after review and address performance concerns as they arise. > >...
Fine by me. I've removed the redundant array initialization in favour of a simple concat and left it at that, which itself saves a bit of time in the profiling...