Dynamic blob name for output blob
Re-raising this over here: https://github.com/Azure/Azure-Functions/issues/1471
Typically, folks who process files in bulk (files in, files out) might want to use a GUID as the filename; these are usually part of workflows where the GUID is tracked end to end. The blob binding should allow taking in a set of input files and producing dynamically named output files in Python. I believe this is possible with C#, and it is only the non-.NET languages that have this limitation.
Adding my findings. Tried the following and got the error:
{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "authLevel": "anonymous",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": ["get", "post"]
    },
    {
      "type": "http",
      "direction": "out",
      "name": "$return"
    },
    {
      "type": "blob",
      "direction": "out",
      "name": "outputblob",
      "path": "outcontainer/{name}.csv",
      "connection": "AzureWebJobsStorage"
    }
  ]
}
import logging

import azure.functions as func


def main(req: func.HttpRequest,
         name: str,
         outputblob: func.Out[bytes]) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    return func.HttpResponse(
        "This HTTP triggered function executed successfully. "
        "Pass a name in the query string or in the request body for a personalized response.",
        status_code=200)
[5/7/2020 7:39:57 PM] Worker failed to function id a2b6a869-bb65-48ee-a21a-957bdb0e8ebc.
[5/7/2020 7:39:57 PM] Result: Failure
[5/7/2020 7:39:57 PM] Exception: FunctionLoadError: cannot load the HttpTriggerTest function: the following parameters are declared in Python but not in function.json: {'name'}
[5/7/2020 7:39:57 PM] Stack: File "/usr/local/Cellar/azure-functions-core-tools@3/3.0.2245/workers/python/3.7/OSX/X64/azure_functions_worker/dispatcher.py", line 245, in _handle__function_load_request
[5/7/2020 7:39:57 PM] function_id, func, func_request.metadata)
[5/7/2020 7:39:57 PM] File "/usr/local/Cellar/azure-functions-core-tools@3/3.0.2245/workers/python/3.7/OSX/X64/azure_functions_worker/functions.py", line 104, in add_function
[5/7/2020 7:39:57 PM] f'the following parameters are declared in Python but '
[5/7/2020 7:39:57 PM] .
A couple of issues here. First is being able to control the output filename. I don't think the language worker protocol supports this today. Even in C#, you need to use late binding to achieve this (which is sort of not using bindings at all). @priyaananthasankar Is there anything blocking you from using the Blob SDK? If you're doing this at scale, you probably want to use the SDK anyway so you can operate on streams instead of loading everything in memory and pushing the files through gRPC and the host.
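A minimal sketch of the Blob SDK route suggested above, assuming the `azure-storage-blob` v12 package is installed. The helper names `make_blob_name` and `upload_with_dynamic_name` are hypothetical, and the container name and connection string are placeholders supplied by the caller.

```python
import uuid


def make_blob_name(extension: str = "csv") -> str:
    """Return a fresh GUID-based blob name such as '<guid>.csv'."""
    return f"{uuid.uuid4()}.{extension}"


def upload_with_dynamic_name(conn_str: str, container: str, data: bytes) -> str:
    """Upload data under a GUID name and return that name.

    Assumes azure-storage-blob v12; this bypasses the output binding entirely.
    """
    # Imported lazily so this module still loads where the SDK is not installed.
    from azure.storage.blob import BlobServiceClient

    blob_name = make_blob_name()
    service = BlobServiceClient.from_connection_string(conn_str)
    blob = service.get_blob_client(container=container, blob=blob_name)
    # upload_blob also accepts a stream, which avoids holding large files in memory.
    blob.upload_blob(data, overwrite=True)
    return blob_name
```

Inside a function, the connection string would typically be read from an app setting (for example `os.environ["AzureWebJobsStorage"]`, assuming that setting holds a storage connection string), and the returned name can be logged or passed downstream so the GUID stays traceable end to end.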
@vrdmr The fact that a parameter bound to binding data doesn't work probably deserves a separate issue. Currently this isn't supported, and you're expected to deserialize the payload yourself to access the value. I think this is possible to implement in the Python worker, though it would disable some of the parameter validation at startup. I think we'd end up allowing extra parameters that are not in function.json, and an error would be raised if they are missing during invocation (or we could just pass None for the missing params).
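As a sketch of the "deserialize the payload yourself" workaround: instead of declaring `name: str` as a function parameter, read it from the request. The helper `extract_name` is hypothetical; it only relies on the `params` mapping and `get_json()` method that `func.HttpRequest` exposes, so it is typed loosely here to stay self-contained.

```python
from typing import Any, Optional


def extract_name(req: Any) -> Optional[str]:
    """Return 'name' from the query string, else from a JSON body, else None.

    `req` is expected to behave like azure.functions.HttpRequest.
    """
    name = req.params.get("name")
    if name:
        return name
    try:
        body = req.get_json()
    except ValueError:
        # get_json() raises ValueError when the body is not valid JSON.
        return None
    if isinstance(body, dict):
        return body.get("name")
    return None
```

With this, `main` would take only `req` and `outputblob`, which satisfies the worker's check that every Python parameter appears in function.json. Note that this only recovers the value in code; it does not by itself make `{name}` in the blob `path` resolve.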
@anthonychu: We could use the Blob SDK, but since this is supported for .NET and not for non-.NET languages, it becomes one of the many stories, albeit a minor one, of how .NET has true first-party support in Functions, which moves folks towards .NET. Secondly, if folks have to wire up their code with the Blob SDK because Python functions lack the metadata to express this, it doesn't give a good impression of Python functions in general. If it is a reasonable change, it would be great to have, but it definitely doesn't block us.
This would be part of a workstream to enable rich bindings. It's a big change in the host, the worker, and maybe the extensions. I think we have plans to make some progress on rich bindings, but @fabiocav would have more to say about this.
However, this might give you some parity with .NET, where you are still working with the SDK (the binding handles creating the client). The binding would hand you an object from the Blob SDK (a blob client or container client), so you would still be working with parts of the SDK rather than with a pure binding.
The core ask for being able to dynamically change an attribute in an output binding might not be addressed by rich bindings.
I tend to use in-between functions for this. For example, I use the Cosmos change feed a lot, but I cannot use Cosmos trigger metadata in output bindings. What you can do is chain multiple functions. In the first function, which is change-feed triggered, I read the data from the change feed and put it in an event or queue message that includes the desired blob name. Then I use a second function with an Event Grid or queue trigger, where you can access the metadata and use it as a parameter in the output binding. In this second function you can therefore use the dynamically created value as the blob name. It becomes a multi-stage execution, but you can then use blob names that are constructed dynamically during Python execution.
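A rough sketch of that two-stage pattern, under stated assumptions: stage one writes a JSON queue message, and stage two is queue-triggered with an output blob path of `outcontainer/{blobName}.csv` in its function.json, relying on binding expressions being able to reference properties of a JSON trigger payload. The property names `blobName` and `content` and both helper functions are hypothetical.

```python
import json
import uuid


def build_stage_one_message(content: str) -> str:
    """Stage 1: return a JSON queue message carrying the blob name chosen at run time."""
    return json.dumps({"blobName": str(uuid.uuid4()), "content": content})


def stage_two_blob_bytes(message_body: str) -> bytes:
    """Stage 2: return the bytes to write to the output blob for this message."""
    return json.loads(message_body)["content"].encode("utf-8")


# Assumed wiring inside the two Azure Functions (not executed here):
#
#   # Stage 1, with a queue output binding named "msg":
#   def main(..., msg: func.Out[str]) -> None:
#       msg.set(build_stage_one_message(data))
#
#   # Stage 2, queue-triggered, with the blob output binding path
#   # "outcontainer/{blobName}.csv" declared in function.json:
#   def main(msg: func.QueueMessage, outputblob: func.Out[bytes]) -> None:
#       outputblob.set(stage_two_blob_bytes(msg.get_body().decode("utf-8")))
```

The cost is an extra hop through the queue and the payload travelling inside the message, so this fits small items better than large files.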