spring-cloud-gateway

RetryGatewayFilter causing out of direct memory errors

Open onepiers opened this issue 5 years ago • 13 comments

Describe the bug: When using the gateway in front of a backend application that accepts large multipart file uploads (tested with 250 MB), we ran out of direct memory. The Retry filter is configured to retry only GET requests, but its caching mechanism applies to the whole route and therefore also caches the multipart body in buffers. After the request completes, the buffers are not released.

For the moment, we have worked around this by using a dedicated GET route placed before the route for all other HTTP methods.

{"@timestamp":"2020-11-06T14:46:49.835Z","@version":"1","message":"[7de5326e-4919504]  500 Server Error for HTTP POST \"/web/uploads/temp\"","logger_name":"org.springframework.boot.autoconfigure.web.reactive.error.AbstractErrorWebExceptionHandler","thread_name":"reactor-http-epoll-1","level":"ERROR","level_value":40000,"stack_trace":"reactor.netty.ReactorNetty$InternalNettyException: java.lang.OutOfMemoryError: Cannot reserve 16777216 bytes of direct buffer memory (allocated: 1057308680, limit: 1073741824)
 \tSuppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException: 
 Error has been observed at the following site(s):
 \t|_ checkpoint ⇢ org.springframework.cloud.gateway.filter.WeightCalculatorWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.web.server.authorization.AuthorizationWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.web.server.authorization.ExceptionTranslationWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.web.server.savedrequest.ServerRequestCacheWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.web.server.context.SecurityContextServerWebExchangeWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.config.web.server.ServerHttpSecurity$OAuth2ResourceServerSpec$BearerTokenAuthenticationWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.web.server.authentication.AuthenticationWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.web.server.context.ReactorContextWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.web.server.header.HttpHeaderWriterWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.config.web.server.ServerHttpSecurity$ServerWebExchangeReactorContextWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.security.web.server.WebFilterChainProxy [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ org.springframework.boot.actuate.metrics.web.reactive.server.MetricsWebFilter [DefaultWebFilterChain]
 \t|_ checkpoint ⇢ HTTP POST \"/web/uploads/temp\" [ExceptionHandlingWebHandler]
 Stack trace:
 Caused by: java.lang.OutOfMemoryError: Cannot reserve 16777216 bytes of direct buffer memory (allocated: 1057308680, limit: 1073741824)
 \tat java.base/java.nio.Bits.reserveMemory(Unknown Source)
 \tat java.base/java.nio.DirectByteBuffer.<init>(Unknown Source)
 \tat java.base/java.nio.ByteBuffer.allocateDirect(Unknown Source)
 \tat io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:645)
 \tat io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:621)
 \tat io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:204)
 \tat io.netty.buffer.PoolArena.tcacheAllocateNormal(PoolArena.java:188)
 \tat io.netty.buffer.PoolArena.allocate(PoolArena.java:138)
 \tat io.netty.buffer.PoolArena.allocate(PoolArena.java:128)
 \tat io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:378)
 \tat io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
 \tat io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
 \tat io.netty.channel.unix.PreferredDirectByteBufAllocator.ioBuffer(PreferredDirectByteBufAllocator.java:53)
 \tat io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
 \tat io.netty.channel.epoll.EpollRecvByteAllocatorHandle.allocate(EpollRecvByteAllocatorHandle.java:75)
 \tat io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:780)
 \tat io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:475)
 \tat io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:378)
 \tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
 \tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
 \tat io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 \tat java.base/java.lang.Thread.run(Unknown Source)
 "}

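The numbers in the error message above can be checked by hand. The arithmetic below uses only the three values printed in the log. Reading the 16 MiB request as a single pooled Netty chunk is an interpretation based on the `PoolArena.newChunk` frame in the stack trace, not something the log states outright.

```latex
\begin{align*}
\text{limit} &= 1073741824 = 2^{30}\ \text{bytes (1 GiB)} \\
\text{requested} &= 16777216 = 2^{24}\ \text{bytes (16 MiB)} \\
\text{allocated} + \text{requested} &= 1057308680 + 16777216 = 1074085896 > 1073741824 \\
\text{limit} - \text{allocated} &= 1073741824 - 1057308680 = 16433144 < 16777216
\end{align*}
```

In other words, the direct memory pool was 344072 bytes short of fitting one more 16 MiB allocation. The allocated figure also equals 63 × 16777216 + 344072, which would be consistent with 63 such allocations already being held, though that is only a guess.
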
Sample: Reproducible with a plain Spring Cloud Gateway project (Spring Boot 2.3.5 / Hoxton.SR8) connected to a Spring application that accepts multipart file uploads. The gateway's application config is the following.

server:
  port:
    8001

spring:
  application:
    name: gateway
  cloud:
    gateway:
      routes:
        - id: test
          uri: lb://test
          predicates:
            - Path=/web/**,/api/**,/webdav/**
          filters:
            - PreserveHostHeader
            - name: Retry
              args:
                retries: 3
                statuses: BAD_GATEWAY
                exceptions:
                  - java.net.ConnectException
          order: 1
      loadbalancer:
        use404: true
      x-forwarded:
        enabled: false
        proto-enabled: false

    loadbalancer:
      ribbon:
        enabled: false
eureka:
  client:
    registry-fetch-interval-seconds: 5
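
The workaround described earlier (a dedicated GET route ahead of the other HTTP methods) might look roughly like the sketch below. It reuses the `lb://test` backend and paths from the config above. The route ids `test-get-with-retry` and `test-other-methods` are hypothetical, and the `Method=GET` predicate is an assumption about how the GET route was separated. `methods: GET` is spelled out only for clarity, since the Spring Cloud Gateway documentation lists GET as the Retry filter's default.

```yaml
spring:
  cloud:
    gateway:
      routes:
        # Hypothetical id: GET-only route that carries the Retry filter
        - id: test-get-with-retry
          uri: lb://test
          predicates:
            - Path=/web/**,/api/**,/webdav/**
            - Method=GET
          filters:
            - PreserveHostHeader
            - name: Retry
              args:
                retries: 3
                statuses: BAD_GATEWAY
                methods: GET
                exceptions:
                  - java.net.ConnectException
          order: 1
        # Hypothetical id: every other method (including uploads), without Retry
        - id: test-other-methods
          uri: lb://test
          predicates:
            - Path=/web/**,/api/**,/webdav/**
          filters:
            - PreserveHostHeader
          order: 2
```

With this layout, POST uploads only ever match the second route, so they never pass through the Retry filter. Whether that fully avoids the buffering depends on the gateway version in use.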

onepiers, Nov 09 '20 14:11

There have been some improvements in reactor-netty and spring cloud. Is it possible to try with the latest versions?

spencergibb, Mar 31 '21 20:03

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 7 days this issue will be closed.

spring-cloud-issues, Apr 07 '21 20:04

I also encountered the problem when the request body was larger than 5 MB.

FangXiaoMing2021, Apr 10 '21 08:04

What versions are you using?

spencergibb, Apr 10 '21 23:04

Spring Boot version: 2.3.5.RELEASE
Spring Cloud version: Hoxton.SR10
Spring Cloud Gateway version: 2.2.7.RELEASE

FangXiaoMing2021, Apr 12 '21 03:04

Test demo:

https://github.com/Fangfeikun/test-gateway.git
https://github.com/Fangfeikun/test-web.git

The request stopped and the memory was not released.

FangXiaoMing2021, Apr 12 '21 03:04

Has anyone solved this problem? I also encountered it. Version info: Spring Boot 2.3.12.RELEASE, Spring Cloud Hoxton.SR11, Spring Cloud Gateway 2.2.8.RELEASE.

When I run the program locally in IntelliJ IDEA, everything is fine, but when I run it on the server, the problem shows up.

Stack: Lark20210715-161318 Lark20210715-161323

zql365747776, Jul 15 '21 08:07

Is there any solution to this problem? I also encountered it.

miaozhihao001, Feb 11 '22 06:02

@spencergibb Has anyone solved this problem?

zql365747776, Apr 24 '22 13:04

I also encountered this problem. What should I do to solve it? Spring Boot version 2.2.10.RELEASE, Spring Cloud Gateway version 2.2.6.RELEASE. @spencergibb

chenzhentong, Apr 29 '22 09:04

I faced a similar issue. When can we expect a fix, please?

satyadeepsingh, Mar 16 '24 20:03

I'm facing this issue even with the current versions: Spring Boot 3.2.4 and spring-cloud-starter-gateway 4.1.2.

Any idea how to deal with this?

gong4soft, Apr 10 '24 13:04

Reproducible with 'org.springframework.boot' version '3.1.8' and 'springCloudVersion' "2022.0.4".

A service started with JVM options like -XX:MaxRAM=1200m -XX:MaxRAMPercentage=[40,50,60] and one of [no direct memory option, -XX:MaxDirectMemorySize=64m, -XX:MaxDirectMemorySize=256m, -XX:MaxDirectMemorySize=512m]

fails when uploading a 50 MB file through the gateway (routing to a downstream service) with "java.lang.OutOfMemoryError: Cannot reserve X bytes of direct buffer memory (allocated: Y, limit: Z)". It always tries to use more than the direct memory limit.

It is reproducible on a freshly started service that has not routed any uploads before, so it does not look like a leak. It simply tries to use several times more memory than the size of the uploaded file.

Everything works fine when I turn off this default filter:

spring.cloud:
    gateway:
      default-filters:
        - name: Retry
          args:
            retries: 3
            statuses: BAD_GATEWAY, SERVICE_UNAVAILABLE, GATEWAY_TIMEOUT
            methods: GET
            backoff:
              firstBackoff: X
              maxBackoff: Y
              factor: 2
              basedOnPreviousValue: true

yaandy, Apr 19 '24 10:04