SNOW-651322: Snowpark with Pandas as a Lambda layer - Library size issues
What is the current behavior?
I need to add "Snowpark with Pandas" as an AWS Lambda layer; however, the size of the Python library exceeds 250MB, so I cannot use it.
What is the desired behavior?
Please provide a trimmed-down version of the library so that the size of the Python library stays below 250MB.
How would this improve snowflake-snowpark-python?
Please provide a trimmed-down version of the library so that the size of the Python library stays below 250MB.
References, Other Background
Hi @kamipatel thank you for the ticket. This is something we are aware of and working on getting fixed. We do not have an ETA right now but will update you when we do.
Thank you for letting me know. I will wait for this fix, as this is something of a blocker: without it I have to set up Docker, which is a pain. Thanks!
@sfc-gh-achandrasekaran Sorry to bother you. This is a big bottleneck for a potential implementation using Lambda. Could you provide unofficial guidance on how to reduce the file size? All I need is to upload data using Snowpark. Thanks!
@kamipatel what are you trying to do? The connector size issues are still under active discussion so please know that we will be working on it soon. In the meantime, if you tell us what you are trying to do, we may be able to provide a workaround for you.
Hi @sfc-gh-achandrasekaran, I need to create a Lambda layer, which has a total size limit of 250MB. Snowpark with the pandas library exceeds the size limit of an AWS Lambda layer. I am looking to see whether there is a way to get a stripped-down build of the Snowpark library that is, let's say, 100MB or less. Thanks!
@sfc-gh-achandrasekaran Do you happen to have any suggestions regarding my last comment? Thanks!
Hey @kamipatel, is it possible for you to uninstall unnecessary Python packages in your env?
You can check the installed packages via pip freeze or pip list.
On my local machine, the connector package is taking around 47.6MB, and pandas is taking 51.8MB.
Also, could you list the packages installed in your Python env along with their sizes? They are usually under <PythonEnv>/site-packages.
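In case it helps with collecting that list, here is a minimal sketch that prints the largest top-level entries in site-packages. The helper names `dir_size` and `package_sizes` are made up for this illustration, and the script assumes the active interpreter's `purelib` path is the environment you plan to package; point it at another directory if that is not the case.

```python
import os
import sysconfig
from pathlib import Path


def dir_size(path: Path) -> int:
    """Size in bytes of a file, or of all regular files under a directory."""
    if path.is_file():
        return path.stat().st_size
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = Path(root) / name
            # Skip symlinks so linked files are not counted twice.
            if not file_path.is_symlink():
                total += file_path.stat().st_size
    return total


def package_sizes(site_packages: Path) -> list[tuple[str, int]]:
    """Return (entry name, bytes) for each top-level entry, largest first."""
    sizes = [(entry.name, dir_size(entry)) for entry in site_packages.iterdir()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumption: the running interpreter belongs to the env being measured.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    for name, size in package_sizes(site_packages)[:15]:
        print(f"{size / 1024 / 1024:8.1f} MB  {name}")
```

Running it inside the env used to build the layer should show which packages dominate the total.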
Hey @kamipatel, the root cause of the large package size is the dependency library snowflake-connector-python.
We have released a new preview version of the connector, with its size reduced by using nanoarrow, which you can read about in this blog post: https://medium.com/snowflake/supercharging-the-snowflake-python-connector-with-nanoarrow-8388cb57eeba
You can install the alpha connector using pip install "snowflake-connector-python[pandas]==3.1.0a2"
Do let us know your feedback. Note that this is still in preview, so we don't recommend using it in production.
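After installing, a quick check along these lines may help confirm which connector version is active and whether the unpacked layer fits under the limit. This is only a sketch: the helper names are made up for illustration, the `python` directory is an assumed layer layout, and treating "250MB" as 250 * 1024 * 1024 bytes is an assumption you may want to adjust.

```python
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path
from typing import Optional

# Assumption: the 250MB limit is interpreted as 250 * 1024 * 1024 bytes.
LAYER_LIMIT_BYTES = 250 * 1024 * 1024


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def unpacked_size(layer_dir: Path) -> int:
    """Total size in bytes of all regular files under layer_dir."""
    return sum(
        p.stat().st_size
        for p in layer_dir.rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def fits_layer_limit(layer_dir: Path, limit: int = LAYER_LIMIT_BYTES) -> bool:
    """True if the unpacked contents of layer_dir are within the limit."""
    return unpacked_size(layer_dir) <= limit


if __name__ == "__main__":
    print("connector version:", installed_version("snowflake-connector-python"))
    layer_dir = Path("python")  # hypothetical directory holding the layer contents
    if layer_dir.is_dir():
        size_mb = unpacked_size(layer_dir) / 1024 / 1024
        print(f"layer size: {size_mb:.1f} MB, fits: {fits_layer_limit(layer_dir)}")
```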
Can we close this issue since the python connector downsize has been in GA for a while?
Yes please close. Thanks a lot!
On Mon, May 20, 2024 at 1:48 PM Yijun Xie @.***> wrote:
Can we close this issue since the python connector downsize has been in GA for a while?