azure-functions-python-worker

Python 3.7 HTTP function intermittently fails with 400 Bad Request, receiving the same valid request body as empty.

Open • Xingyixzhang opened this issue 4 years ago • 2 comments

Issue

Python 3.7 HTTP function intermittently fails with 400 Bad Request "HTTP request does not contain valid JSON data", even though every request carries the same valid JSON body.

Investigative information

  • Timestamp: 6/28/21 15:10 - 15:11 PDT (ran 20 tests, 5 of which failed)
  • Function App name: py37intermRBmissingRepro
  • Function name(s) (as appropriate): HttpTrigger1

Repro Demo

Check out the complete source code here.

----- Hardcoded Valid JSON Request Body for all tests -----

body = {
        "name": "xxxxxxxxxxxxxxxxxx",
        "description": "xxxxxxxxxxxxxx \"AccountManager\" (773) in xCloud account \"prod_account\" (xxxx-xxxx-xxxxxx)",
        "ownerIDs": [
            "8408xxxx-xxxx-xxxxxx",
            "095bxxxx-xxxx-xxxxxx"
        ],
        "memberIDs": [
            "7a96xxxx-xxxx-xxxxxx"
        ],
        "comcast_xcloudroles": {
            "accountId": "e733xxxx-xxxx-xxxxxx",
            "roleId": 1001,
            "roleName": "AccountManager",
            "roleType": "SelfAssigned",
            "environment": "PROD"
        }
    }

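----- Normally about 10 out of 50 requests fail with 400 Bad Request -----

import requests

# gen() is defined in the linked source code and supplies the request body shown above
# edit this to point to your endpoint
url = 'https://py37intermRBmissingRepro.azurewebsites.net/api/HttpTrigger1'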

total_requests = 50
number_of_400 = 0

for number in range(total_requests):
    response = requests.post(url, data=gen())
    status = response.status_code
    if status == 400:
        number_of_400 += 1

    # print(f'response: {response.status_code}')
    # print(f'body: {response.text}')

print(f"Total Requests Count: {total_requests}")
print(f"Failed Requests Count: {number_of_400}")

[Screenshot: Failed Requests Count Demo]

----- Comparing the responses of failed (400) and successful (200) requests -----

url = 'https://py37intermRBmissingRepro.azurewebsites.net/api/HttpTrigger1'

total_requests = 10
number_of_400 = 0

for number in range(total_requests):
    response = requests.post(url, data=gen())
    status = response.status_code
    if status == 400:
        number_of_400 += 1

    print(f'response: {response.status_code}')
    print(f'body: {response.text}')

print(f"Total Requests Count: {total_requests}")
print(f"Failed Requests Count: {number_of_400}")

[Screenshot: Failed vs Successful Request Body print]

----- Looking at Transfer-Encoding for both successful and failed responses/requests -----

url = 'https://py37intermRBmissingRepro.azurewebsites.net/api/HttpTrigger1'

total_requests = 10
number_of_400 = 0

for number in range(total_requests):
    # request = requests.get(url)
    response = requests.post(url, data=gen())
    status = response.status_code
    if status == 400:
        number_of_400 += 1

    print(f'Response Code: {response.status_code}')
    # .get() avoids a KeyError when a response carries no Transfer-Encoding header
    print(f"Transfer Encoding: {response.headers.get('transfer-encoding')}\n")

    # print(f'headers: {request.headers.keys()}')
    # print(f"Transfer Encoding: {request.headers['transfer-encoding']}\n")
    # print(f'body: {response.text}')

print(f"Total Requests Count: {total_requests}")
print(f"Failed Requests Count: {number_of_400}")

[Screenshot: Transfer Encoding]


Expected behavior

All executions should succeed.

Actual behavior

About 80% of executions succeed; the remaining 20% fail with 400 Bad Request.

Known workarounds

No workaround found yet.
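
One thing that may be worth trying on the client side, assuming the chunked request body mentioned under Additional Notes is the trigger: buffer the body before sending, so the request carries a Content-Length instead of Transfer-Encoding: chunked. This is an untested sketch, not a confirmed workaround. `example_chunks` is a hypothetical stand-in for the `gen()` helper in the linked source, and the URL is a placeholder.

```python
import json
import urllib.request
from typing import Iterable


def buffer_body(chunks: Iterable[bytes]) -> bytes:
    """Join generated chunks into one bytes object with a known length."""
    return b"".join(chunks)


def build_post(url: str, chunks: Iterable[bytes]) -> urllib.request.Request:
    """Build a POST whose body is fully buffered, so Content-Length can be sent."""
    body = buffer_body(chunks)
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Content-Length": str(len(body)),
        },
    )


def example_chunks():
    # Hypothetical stand-in for gen(): yields the JSON body in two pieces.
    payload = json.dumps({"name": "xxxxxxxxxxxxxxxxxx", "roleId": 1001}).encode("utf-8")
    yield payload[:10]
    yield payload[10:]


# Placeholder URL; nothing is sent until urllib.request.urlopen(req) is called.
req = build_post("https://example.invalid/api/HttpTrigger1", example_chunks())
print(req.get_header("Content-length"), len(req.data))
```

With requests, the equivalent would be `requests.post(url, data=b"".join(gen()))`, assuming `gen()` yields bytes. requests sets Content-Length for a bytes body and only switches to chunked encoding when it is handed a generator.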

Contents of the requirements.txt file:

# DO NOT include azure-functions-worker in this file
# The Python Worker is managed by Azure Functions platform
# Manually managing azure-functions-worker may cause unexpected issues

azure-functions

Related information

----- Tested an equivalent app implemented in PowerShell 7 with 1000 requests; ALL were SUCCESSFUL -----

$body = @{
    'name' = 'xxxxxxxxxxxxxxxxxx'
    'description' = 'xxxxxxxxxxxxxx'
    "ownerID" = "8408xxxx-xxxx-xxxxxx"
    "memberID" = "7a96xxxx-xxxx-xxxxxx"
    "accountId" = "e733xxxx-xxxx-xxxxxx"
    "roleId" = 1001
    "roleName" = "AccountManager"
    "roleType" = "SelfAssigned"
    "environment" = "PROD"
}

$url = 'https://ps7intermrbmissingrepro.azurewebsites.net/api/httptrigger1'

$total_requests = 1000
$number_of_400 = 0

for ($i = 0; $i -lt $total_requests; $i++){
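    # -SkipHttpErrorCheck (PowerShell 7+) returns 4xx/5xx responses instead of throwing, so 400s can be counted
    $response = Invoke-WebRequest -Uri $url -Body $body -Method 'POST' -SkipHttpErrorCheck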

    $status_code = $response.StatusCode
    # $transfer_encoding = $response | Select-Object -Property TransferEncoding

    if ($status_code -eq 400){
        $number_of_400++
    }

    # Write-Host "Response Code: $status_code"
    # Write-Host "Transfer Encoding: $transfer_encoding"
}

Write-Host "Total Requests Count: $total_requests"
Write-Host "Failed Requests Count: $number_of_400"

[Screenshot: PowerShell App Tests are all Successful]

----- Tested an equivalent app implemented in PowerShell 7 with 10 requests, displaying the status code -----

[Screenshot: PowerShell App Tests return 200's]

----- Further tests with Python request methods showed that the same request body intermittently arrived empty at the Python function endpoint -----

Successful execution: [Screenshot: 200 Success with Request Body]

Failed with 400, empty body: [Screenshot: Failed with empty body]
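
Why an empty body surfaces as this particular 400 can be sketched with the standard library alone. The snippet below is a stand-in for a handler's JSON parsing step, not the worker's actual code. `parse_json_body` and `status_for` are hypothetical names, and the error text is copied from the message reported in this issue.

```python
import json


def parse_json_body(body: bytes):
    """Stand-in for a handler's JSON parsing step (not the worker's real code)."""
    try:
        return json.loads(body.decode("utf-8"))
    except ValueError as exc:
        # json.JSONDecodeError and UnicodeDecodeError both subclass ValueError.
        raise ValueError("HTTP request does not contain valid JSON data") from exc


def status_for(body: bytes) -> int:
    """Map a request body to the status code a simple handler would return."""
    try:
        parse_json_body(body)
    except ValueError:
        return 400
    return 200


valid = json.dumps({"name": "xxxxxxxxxxxxxxxxxx", "roleId": 1001}).encode("utf-8")
print(status_for(valid))  # 200
print(status_for(b""))    # 400: an empty body is not valid JSON
```

The payload is identical on every request, so the JSON itself is fine; the failing requests are the ones whose body reaches the parsing step as zero bytes.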

Additional Notes

  • Based on network trace / TCP dump analysis, this issue appears to be related to chunked transfer encoding.
  • 30+ tests with the same request body from the Portal Test/Run feature all appeared to succeed for both the Python and PowerShell functions.
  • Reportedly, for the same code, this issue only started happening in early April 2021.
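
For readers unfamiliar with chunked transfer encoding, here is a minimal sketch of the HTTP/1.1 wire framing involved. It only illustrates the format and is not taken from the worker or from requests; `encode_chunked` and `decode_chunked` are hypothetical helper names. A chunked body carries no Content-Length header, so its size is only known after the chunk framing has been read through to the terminating zero-length chunk.

```python
from typing import Iterable, Iterator


def encode_chunked(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each non-empty chunk as <hex length>CRLF<data>CRLF, then terminate."""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would signal end-of-body prematurely
            yield f"{len(chunk):x}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    yield b"0\r\n\r\n"  # last-chunk marker, no trailers


def decode_chunked(wire: bytes) -> bytes:
    """Reassemble a chunked body (chunk extensions are discarded, no trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = wire.index(b"\r\n", pos)
        size = int(wire[pos:line_end].split(b";")[0], 16)
        if size == 0:
            return bytes(body)
        start = line_end + 2
        body += wire[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data


parts = [b'{"name": ', b'"x"}']
wire = b"".join(encode_chunked(parts))
print(wire)
print(decode_chunked(wire))
```

In requests, passing a generator as `data=` is what switches a POST to this encoding, while a bytes or str body is sent with a Content-Length.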

[Screenshot: Trace Comparison]

Xingyixzhang • Jun 28 '21 22:06

Any update on this?

Manfrin92 • Sep 02 '22 11:09

Hi, following up on this.

A Python validator, part of the tech in my current org, keeps having the same issue.

wahidstephen • Sep 26 '22 15:09