[QUERY] With the new Document Intelligence SDK API, are the files streamed or fully loaded in memory?
Query/Question
I have migrated to the new azure-ai-documentintelligence 1.0.0 SDK. I'm using the DocumentIntelligence(Async)Client method beginAnalyzeDocument, specifically the following flavor:
PollerFlux<AnalyzeOperationDetails, AnalyzeResult> beginAnalyzeDocument(String modelId,
AnalyzeDocumentOptions analyzeDocumentOptions)
and I'm setting up the AnalyzeDocumentOptions as follows:
AnalyzeDocumentOptions docOpts = new AnalyzeDocumentOptions(BinaryData.fromStream(inputStream, contentLength));
The question is whether the SDK fully loads the document in memory as a large byte array before sending, or streams it to the Document Intelligence service. In a multi-user environment, fully loading documents into memory can be a concern, and we'd like to get ahead of this before rolling to production. Additionally, if I use this flavor
public PollerFlux<BinaryData, BinaryData> beginAnalyzeDocument(String modelId, BinaryData analyzeRequest,
RequestOptions requestOptions)
is the document streamed or preloaded?
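For reference, a minimal self-contained sketch of how we wire this up (endpoint, key, model ID, and file path are placeholders, not our real values):

import com.azure.ai.documentintelligence.DocumentIntelligenceAsyncClient;
import com.azure.ai.documentintelligence.DocumentIntelligenceClientBuilder;
import com.azure.ai.documentintelligence.models.AnalyzeDocumentOptions;
import com.azure.ai.documentintelligence.models.AnalyzeResult;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.BinaryData;

import java.io.BufferedInputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class AnalyzeSketch {
    public static void main(String[] args) throws Exception {
        DocumentIntelligenceAsyncClient client = new DocumentIntelligenceClientBuilder()
                .endpoint("https://<resource>.cognitiveservices.azure.com/")
                .credential(new AzureKeyCredential("<key>"))
                .buildAsyncClient();

        Path document = Paths.get("/path/to/document.pdf");
        long contentLength = Files.size(document);
        try (InputStream inputStream = new BufferedInputStream(Files.newInputStream(document))) {
            AnalyzeDocumentOptions docOpts =
                    new AnalyzeDocumentOptions(BinaryData.fromStream(inputStream, contentLength));

            // Long-running operation: poll to completion, then fetch the final result.
            AnalyzeResult result = client.beginAnalyzeDocument("prebuilt-read", docOpts)
                    .last()
                    .flatMap(response -> response.getFinalResult())
                    .block();

            System.out.println(result.getContent());
        }
    }
}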
Why is this not a Bug or a Feature Request? N/A
Setup (please complete the following information if applicable):
- OS: Windows/Linux
- IDE: N/A
- Library/Libraries: com.azure:azure-ai-documentintelligence:1.0.0
Thanks for trying out the new SDK and also for reaching out to us with some initial questions, @kpentaris. @samvaity will follow up with you shortly!
Hey @kpentaris, if you are using the BinaryData.fromStream(inputStream) API and the provided InputStream supports mark/reset (i.e. markSupported() returns true), the SDK should be able to read data as needed and won't eagerly load the whole document into memory.
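For illustration, a small sketch of that idea (the helper is made up, not an SDK API); BufferedInputStream is one way to add mark/reset support when the original stream lacks it:

import com.azure.core.util.BinaryData;
import java.io.BufferedInputStream;
import java.io.InputStream;

final class BinaryDataHelper {
    // Wrap the stream only when it lacks mark/reset support, so BinaryData
    // can re-read data as needed instead of buffering the whole document.
    static BinaryData fromMarkableStream(InputStream in, long contentLength) {
        InputStream markable = in.markSupported() ? in : new BufferedInputStream(in);
        return BinaryData.fromStream(markable, contentLength);
    }
}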
Also, if the document behind the InputStream is a file on disk, a better alternative would be the BinaryData.fromFile() API.
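For example (the path is a placeholder):

import com.azure.core.util.BinaryData;
import java.nio.file.Paths;

// fromFile defers reading; the content is streamed from disk when the request is sent.
BinaryData data = BinaryData.fromFile(Paths.get("/path/to/document.pdf"));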
Note: Before rolling to production, I would suggest profiling and tracking the JVM under load testing to see whether memory usage spikes and drops correlate with when requests are sent. If you do see that, please let us know.
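As a rough sanity check alongside a profiler, something like this can sample the heap around requests (a sketch, not a replacement for proper profiling):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Sample heap usage before/after sending requests to correlate spikes and drops.
MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
System.out.printf("heap used=%d MiB, committed=%d MiB%n",
        heap.getUsed() >> 20, heap.getCommitted() >> 20);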
Hello @samvaity, it's been a while 😂. We have finished migrating our codebase to the new Document Intelligence SDK and did some testing. Among other things, one issue we found appears to be a memory leak in the "communication layer" of the code (i.e. Netty).
Due to our codebase, we create multiple DocumentIntelligenceClient instances (sync and async). It seems that each of those has its own Netty client, which keeps references to HttpRequest instances, which in turn hold references to the BinaryData we generate for our image in order to send it to Azure. I did some quick investigation, and the memory analysis doesn't show any other references keeping this object alive apart from the Document Intelligence client.
There are valid reasons for having multiple clients (e.g. a multi-tenant system with multiple accounts, each requiring its own distinct client instance). There's also no option to close() those clients and their underlying HTTP clients.
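For context, the mitigation we assume should help is sharing one HttpClient across all builders, so that each client doesn't create its own Netty resources. A sketch (endpoint and key are placeholders):

import com.azure.ai.documentintelligence.DocumentIntelligenceClient;
import com.azure.ai.documentintelligence.DocumentIntelligenceClientBuilder;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.http.HttpClient;

// One HttpClient shared by all tenant clients, instead of one Netty client each.
HttpClient sharedHttpClient = HttpClient.createDefault();

DocumentIntelligenceClient tenantClient = new DocumentIntelligenceClientBuilder()
        .endpoint("https://<tenant-resource>.cognitiveservices.azure.com/")
        .credential(new AzureKeyCredential("<tenant-key>"))
        .httpClient(sharedHttpClient)
        .buildClient();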
Below is an image of the memory tree we observed, taken after extraction finished successfully:
@kpentaris, from what I can see, this might just be a snapshot in time taken before GC comes through and reclaims the memory, which doesn't by itself confirm a leak. A good way to confirm would be to force System.gc() twice and check whether the memory is still retained. If it is, then we could be looking at an actual leak. Also, could you provide more details on which specific objects are staying in memory and under what conditions?
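For the System.gc() check, something along these lines (a rough sketch) before taking the snapshot:

// Force two GC cycles, give the collector a moment, then check retained heap.
System.gc();
System.gc();
Thread.sleep(1000);
long retained = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
System.out.println("retained heap after GC: " + (retained >> 20) + " MiB");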
Hello @samvaity. The profiling tool forces a GC before taking a memory snapshot, so that isn't it. We have also confirmed the leak with an OutOfMemoryError, which shouldn't happen if the memory were reclaimable.
Hey @kpentaris, does this issue still persist?
Yes @samvaity, we tried with the new 1.0.1 SDK and there's no difference.