
Timeout for large payload

Open dthib opened this issue 1 year ago • 4 comments

Bug report

  • [X] I confirm this is a bug with Supabase, not with my own application.
  • [X] I confirm I have searched the Docs, GitHub Discussions, and Discord.

Describe the bug

When the payload is large, it does not reach the endpoint and appears to be stuck in pg_net until a timeout is reached. This happens with the latest pg_net version (0.14.0) installed by default, but not on a production system running a 0.7.x pg_net version.

To Reproduce

With the latest Supabase version. I searched for a while, but we are now sure it's not related to our API, because we were able to reproduce it with https://webhook.site.

  1. Go to webhook.site and create an endpoint.
  2. Execute the query below, replacing $PAYLOAD with the attached JSON payload: select * from net.http_post('https://webhook.site/11a266de-f1b6-48d9-9477-3aeb8c8098cc', '$PAYLOAD'::jsonb)

testpayload.txt

  3. Please note that the same query with a smaller payload is fine. Note too that sending the payload directly to https://webhook.site using Postman works.

Smaller payload testpayload_works.txt
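For illustration, a payload of comparable size to the attached testpayload.txt (~200 KB) can be generated programmatically. This is a hypothetical sketch; the field names are invented and have nothing to do with the attached file's actual contents:

```python
import json

# Build a JSON array of roughly `target_bytes` bytes, comparable in size
# to the attached testpayload.txt (field names here are made up).
def make_payload(target_bytes: int = 200_000) -> str:
    rows = []
    size = 2  # account for the surrounding "[]"
    i = 0
    while size < target_bytes:
        row = {"id": i, "name": f"item-{i}", "description": "x" * 100}
        size += len(json.dumps(row)) + 1  # +1 for the separating comma
        rows.append(row)
        i += 1
    return json.dumps(rows)

payload = make_payload()
```

The resulting string can then be substituted for $PAYLOAD in the query above.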

Expected behavior

Request being sent to the server.


System information

  • OS: macOS
  • Version of Supabase Postgres Docker image: 15.8.1.020
  • pg_net version installed: 0.14.0

Additional context

I wanted to try other pg_net versions, but I can't drop and reinstall the extension because the owner is supabase_admin.

dthib avatar Jan 14 '25 11:01 dthib

Any help? Have you been able to reproduce the bug? Thanks

dthib avatar Jan 25 '25 22:01 dthib

Sorry, still haven't reproduced.

Our test suite is in Python. @dthib, if you'd like to speed things up and help, perhaps you could contribute a test?

Also see https://github.com/supabase/pg_net/blob/master/docs/contributing.md#development.
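In outline, such a test would queue a large POST body and assert the request completes instead of timing out. The sketch below shows only the general pattern against a throwaway local HTTP server; pg_net's actual suite has its own fixtures and helpers (which will differ from the names used here), and the real test would issue net.http_post through a database connection rather than urllib:

```python
import http.server
import threading
import urllib.request

# Minimal local endpoint that accepts a POST and echoes back the body size.
class EchoSizeHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        body = self.rfile.read(length)
        reply = str(len(body)).encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):  # keep test output quiet
        pass

def post_large_body(size: int = 200_000):
    """POST a `size`-byte body and return (status, echoed byte count)."""
    server = http.server.HTTPServer(("127.0.0.1", 0), EchoSizeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = f"http://127.0.0.1:{server.server_port}/post"
        with urllib.request.urlopen(url, data=b"x" * size, timeout=10) as resp:
            return resp.status, int(resp.read())
    finally:
        server.shutdown()
```

A pg_net regression test would make the same kind of assertion (a ~200 KB body gets a 200 response within the timeout), just with the request going through the extension's queue.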

steve-chavez avatar Jan 31 '25 15:01 steve-chavez

I can reproduce locally ✅, using this branch https://github.com/steve-chavez/pg_net/tree/174 and:

$ nix-shell

$ net-with-nginx net-with-pg-17 psql
\set bigjson `cat test/data/big_payload.json`

select net.http_post(
            url:='http://localhost:8080/post',
            body:= :'bigjson'
        );

 http_post 
-----------
         1
(1 row)

# 127.0.0.1 - - [06/Mar/2025:18:31:03 -0500] "POST /post HTTP/1.1" 400 0 "-" "pg_net/0.14.0"

select * from net._http_response;
-[ RECORD 1 ]+----------------------------------------------------------------------------------------------------------------------------------------------------------------
id           | 1
status_code  | 
content_type | 
headers      | 
content      | 
timed_out    | 
error_msg    | Timeout of 5000 ms reached. Total time: 5000.687000 ms (DNS time: 0.077000 ms, TCP/SSL handshake time: 0.398000 ms, HTTP Request/Response time: 4999.976000 ms)
created      | 2025-03-06 18:30:58.929522-05

As seen above the request times out.
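The error_msg breaks the 5000 ms budget down by phase, and parsing it shows the time is spent almost entirely in the HTTP request/response phase, not in DNS or the TCP/SSL handshake. A quick sanity check in Python:

```python
import re

# The error_msg field from net._http_response above, reproduced verbatim.
error_msg = (
    "Timeout of 5000 ms reached. Total time: 5000.687000 ms "
    "(DNS time: 0.077000 ms, TCP/SSL handshake time: 0.398000 ms, "
    "HTTP Request/Response time: 4999.976000 ms)"
)

# Pull out each "<phase> time: <ms> ms" pair.
phases = {k.strip(): float(v)
          for k, v in re.findall(r"([A-Za-z/ ]+ time): ([\d.]+) ms", error_msg)}

# Nearly the whole 5000 ms budget is HTTP request/response time;
# DNS and the handshake together account for under half a millisecond.
```

So the connection is established quickly; it's the request body transfer/response that never completes in time.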

testpayload.txt

The JSON file's size:

$du -sh test/data/big_payload.json 
208K    test/data/big_payload.json

Smaller payload testpayload_works.txt

The smaller JSON does work:

$ du -sh test/data/smaller_payload.json 
36K     test/data/smaller_payload.json
\set smallerjson `cat test/data/smaller_payload.json`

select net.http_post(
            url:='http://localhost:8080/post',
            body:= :'smallerjson'
        );
 http_post 
-----------
         1
(1 row)

# 127.0.0.1 - - [06/Mar/2025:18:38:29 -0500] "POST /post HTTP/1.1" 200 11 "-" "pg_net/0.14.0"

# select * from net._http_response;
-[ RECORD 1 ]+--------------------------------------------------------------------------------------------------------------------------------------------------------------------
id           | 1
status_code  | 200
content_type | application/json
headers      | {"Date": "Thu, 06 Mar 2025 23:38:29 GMT", "Server": "nginx/1.26.0", "Connection": "keep-alive", "Content-Type": "application/json", "Transfer-Encoding": "chunked"}
content      |                                                                                                                                                                    +
             | 
timed_out    | f
error_msg    | 
created      | 2025-03-06 18:38:29.129701-05

Although for some reason the content is empty when it should have contained the echoed body.

steve-chavez avatar Mar 06 '25 23:03 steve-chavez

So when doing net.http_post, we use convert_to internally:

https://github.com/supabase/pg_net/blob/6051d8aa31c66bbfc3d9c10d814af8175487f3cb/sql/pg_net.sql#L201-L208

This convert_to goes through the following path:

  • https://github.com/postgres/postgres/blob/aa5a200de62148586c588f2a7ba6ca6bdb4a51e9/src/include/catalog/pg_proc.dat#L3853-L3856

  • https://github.com/postgres/postgres/blob/aa5a200de62148586c588f2a7ba6ca6bdb4a51e9/src/backend/utils/mb/mbutils.c#L500-L518

  • https://github.com/postgres/postgres/blob/aa5a200de62148586c588f2a7ba6ca6bdb4a51e9/src/backend/utils/mb/mbutils.c#L552-L605

There's a palloc in the last function, so it looks like we're making a copy of the JSON body, which seems unnecessary. The extra processing there also slows down the whole process, which in the end causes the timeout.
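As a rough analogy in Python (not pg_net's C code): converting a text value to a byte string allocates a fresh buffer and copies every byte, so the work grows linearly with the payload size, and it is paid again for every queued request. The 36 KB and 208 KB sizes below mirror the two payloads in this issue:

```python
import time

small = "x" * (36 * 1024)    # roughly smaller_payload.json
big = "x" * (208 * 1024)     # roughly big_payload.json

def time_copies(s: str, n: int = 2000) -> float:
    """Time n text-to-bytes conversions; each one copies the whole body."""
    start = time.perf_counter()
    for _ in range(n):
        s.encode("utf-8")    # fresh allocation + full copy per call
    return time.perf_counter() - start

t_small = time_copies(small)
t_big = time_copies(big)
# The big payload costs several times more per conversion than the small one.
```

The copy itself is not the only cost in the C path, but it illustrates why avoiding the conversion entirely (as proposed below) scales better with payload size.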


To avoid the copy, I think we're going to have to store the input as text instead of bytea in net.http_request_queue; maybe we need to revisit https://github.com/supabase/pg_net/pull/27. It's possible we just need to add an extra column.

Additionally, converting the net.http_<method> functions to C might help too (related to https://github.com/supabase/pg_net/issues/164).

steve-chavez avatar Mar 07 '25 00:03 steve-chavez