python-bsonjs

Help using aggregate_raw_results

Open martyzz1 opened this issue 1 year ago • 2 comments

I've been going around in circles trying to understand whether this library can be used to speed up decoding an aggregation query, or whether, after recent pymongo updates, it's needed at all.

from bson import decode_all

documents = []
cursor = collection.aggregate_raw_batches(pipeline=aggregation_query)
for batch in cursor:
    documents.extend(decode_all(batch))

How would I use bsonjs.dumps instead?

martyzz1 avatar Feb 13 '24 17:02 martyzz1

This library is only useful for converting raw BSON data (e.g. a RawBSONDocument) to MongoDB Extended JSON. If you need the documents decoded into Python dicts, then this library will not help.
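To make "raw BSON in, Extended JSON out" concrete, here is a stdlib-only sketch that hand-encodes a trivial one-field document, {"x": 1}, into raw BSON bytes and hand-decodes it back to a JSON string. This illustrates the kind of transformation bsonjs.dumps performs (natively, and for all BSON types); the hand-rolled helpers below handle only a single int32 field and are purely illustrative.

```python
import json
import struct

def encode_int32_doc(key: str, value: int) -> bytes:
    """Hand-encode a single-field {key: int32} document as raw BSON.

    Layout: int32 total length, element type 0x10 (int32),
    C-string key, int32 value, trailing NUL terminator.
    """
    element = b"\x10" + key.encode() + b"\x00" + struct.pack("<i", value)
    length = 4 + len(element) + 1  # length prefix + element + trailing NUL
    return struct.pack("<i", length) + element + b"\x00"

def int32_doc_to_json(raw: bytes) -> str:
    """Hand-decode the same shape of document back to a JSON string."""
    (length,) = struct.unpack_from("<i", raw, 0)
    assert length == len(raw) and raw[4] == 0x10  # single int32 element
    key_end = raw.index(b"\x00", 5)
    key = raw[5:key_end].decode()
    (value,) = struct.unpack_from("<i", raw, key_end + 1)
    return json.dumps({key: value})

raw = encode_int32_doc("x", 1)
# For this trivial case, bsonjs.dumps(raw) would produce the same JSON:
print(int32_doc_to_json(raw))  # {"x": 1}
```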

Also, aggregate_raw_batches is only useful when the app needs a stream of raw BSON data. If you're going to call decode_all on every batch anyway, it will be more efficient to use a regular aggregate:

documents = list(collection.aggregate(pipeline))
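For completeness: each batch yielded by aggregate_raw_batches is a sequence of concatenated BSON documents, and every BSON document starts with a little-endian int32 giving its own total length. A stdlib-only sketch of how such a batch can be split into per-document byte slices (each of which could then be passed to bsonjs.dumps) by walking those length prefixes — this hand-rolls the framing that bson.decode_all handles for you:

```python
import struct
from typing import Iterator

def split_bson_batch(batch: bytes) -> Iterator[bytes]:
    """Yield each document's raw bytes from a concatenated BSON batch.

    Walks the batch using each document's int32 length prefix,
    without decoding any document contents.
    """
    offset = 0
    while offset < len(batch):
        (doc_len,) = struct.unpack_from("<i", batch, offset)
        yield batch[offset:offset + doc_len]
        offset += doc_len

# Two hand-encoded documents, {"x": 1} and {"y": 2}, concatenated as
# they might arrive in one raw batch:
doc1 = b"\x0c\x00\x00\x00\x10x\x00\x01\x00\x00\x00\x00"
doc2 = b"\x0c\x00\x00\x00\x10y\x00\x02\x00\x00\x00\x00"
docs = list(split_bson_batch(doc1 + doc2))
print(len(docs))  # 2
# In the real flow, each slice could be passed to bsonjs.dumps(slice)
# to get one Extended JSON string per document.
```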

ShaneHarvey avatar Feb 13 '24 18:02 ShaneHarvey

For help speeding up your application I suggest posting here: https://www.mongodb.com/community/forums/tag/python

It would help to include more information: the size of the result set (how big is documents?), how long the query takes versus the decoding, what happens to documents afterwards, and whether it would be faster to process the documents individually.

ShaneHarvey avatar Feb 13 '24 19:02 ShaneHarvey