Skip to content

Spring WebFlux and Undertow high cpu usage after perfromance test #22467

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
lukidzi opened this issue Feb 19, 2019 · 5 comments
Closed

Spring WebFlux and Undertow high cpu usage after perfromance test #22467

lukidzi opened this issue Feb 19, 2019 · 5 comments
Assignees
Labels
status: duplicate A duplicate of another issue

Comments

@lukidzi
Copy link

lukidzi commented Feb 19, 2019

Spring Boot Version

  • spring-boot-starter-web:2.1.3.RELEASE
  • spring-boot-starter-undertow:2.1.3.RELEASE
  • spring-boot-starter-webflux:2.1.3.RELEASE
  • jdk 1.8.0.191

Problem

Hi,

I was testing performance of 2 microservices(genareted from start.spring.io):

  • first microservice has one endpoint which returns String. I used spring-boot-starter-web:2.1.3.RELEASE and spring-boot-starter-undertow:2.1.3.RELEASE
  • second microservice has one endpoint which consume data from first microservice though WebClient and returns Mono. I used spring-boot-starter-webflux:2.1.3.RELEASE and spring-boot-starter-undertow:2.1.3.RELEASE

After services started I ran performance test using wrk. After test finished I ran htop to check cpu usage and it was still high. Only restart of application helps.

After test cpu usage:

screenshot 2019-02-19 at 20 44 00

Also tested 2 other cases:

  • replaced WebClient with RestTemplate
  • changed server to Netty

and high cpu problem didn't occurs.

So it must be the problem with combination of WebClient and Undertow. In profiler I saw that 2 NIO threads were busy for 993ms/1000ms after test. Problem shows up only after high load.

I tried to debug through code and I found that state is REQUESTED and it looks like infinite loop which process same event.
screen

Reproduction steps

Repository to reproduce problem with steps: https://github.com/lukidzi/springboot-undertow

@wilkinsona
Copy link
Member

wilkinsona commented Feb 25, 2019

Thanks for the sample. I've reproduced the problem with very high CPU usage in ConsumerServiceApplication after the performance test has completed. During the test, the following errors were logged:

2019-02-25 12:48:32.160 ERROR 19788 --- [   XNIO-1 I/O-9] o.s.w.s.adapter.HttpWebHandlerAdapter    : [610c222a] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:34.949 ERROR 19788 --- [  XNIO-1 I/O-10] o.s.w.s.adapter.HttpWebHandlerAdapter    : [5ff9a2b6] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:35.939 ERROR 19788 --- [   XNIO-1 I/O-3] o.s.w.s.adapter.HttpWebHandlerAdapter    : [250327a0] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:38.030 ERROR 19788 --- [   XNIO-1 I/O-8] o.s.w.s.adapter.HttpWebHandlerAdapter    : [7233f108] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:38.188 ERROR 19788 --- [  XNIO-1 I/O-12] o.s.w.s.adapter.HttpWebHandlerAdapter    : [6a0ea724] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:38.716 ERROR 19788 --- [   XNIO-1 I/O-7] o.s.w.s.adapter.HttpWebHandlerAdapter    : [26f68b14] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:38.892 ERROR 19788 --- [   XNIO-1 I/O-7] o.s.w.s.adapter.HttpWebHandlerAdapter    : [4db1bffd] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:39.229 ERROR 19788 --- [  XNIO-1 I/O-10] o.s.w.s.adapter.HttpWebHandlerAdapter    : [2ab8702d] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:39.252 ERROR 19788 --- [   XNIO-1 I/O-7] o.s.w.s.adapter.HttpWebHandlerAdapter    : [7a83df5d] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:39.255 ERROR 19788 --- [   XNIO-1 I/O-3] o.s.w.s.adapter.HttpWebHandlerAdapter    : [2925d701] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:40.676 ERROR 19788 --- [   XNIO-1 I/O-2] o.s.w.s.adapter.HttpWebHandlerAdapter    : [3704219f] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:42.328 ERROR 19788 --- [   XNIO-1 I/O-9] o.s.w.s.adapter.HttpWebHandlerAdapter    : [72c2b56] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:42.711 ERROR 19788 --- [   XNIO-1 I/O-4] o.s.w.s.adapter.HttpWebHandlerAdapter    : [d6cffd5] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:42.933 ERROR 19788 --- [   XNIO-1 I/O-3] o.s.w.s.adapter.HttpWebHandlerAdapter    : [56cbf02e] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:43.419 ERROR 19788 --- [   XNIO-1 I/O-7] o.s.w.s.adapter.HttpWebHandlerAdapter    : [3787cd8b] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:43.511 ERROR 19788 --- [  XNIO-1 I/O-12] o.s.w.s.adapter.HttpWebHandlerAdapter    : [4d21f5ef] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:43.585 ERROR 19788 --- [   XNIO-1 I/O-9] o.s.w.s.adapter.HttpWebHandlerAdapter    : [4734fe3d] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:43.657 ERROR 19788 --- [   XNIO-1 I/O-3] o.s.w.s.adapter.HttpWebHandlerAdapter    : [7e596864] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:43.926 ERROR 19788 --- [   XNIO-1 I/O-7] o.s.w.s.adapter.HttpWebHandlerAdapter    : [1992055d] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:44.016 ERROR 19788 --- [  XNIO-1 I/O-10] o.s.w.s.adapter.HttpWebHandlerAdapter    : [3f2015f9] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:44.438 ERROR 19788 --- [   XNIO-1 I/O-8] o.s.w.s.adapter.HttpWebHandlerAdapter    : [1e1d21db] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:44.670 ERROR 19788 --- [   XNIO-1 I/O-5] o.s.w.s.adapter.HttpWebHandlerAdapter    : [7b1b3c] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:44.701 ERROR 19788 --- [   XNIO-1 I/O-7] o.s.w.s.adapter.HttpWebHandlerAdapter    : [3a8239de] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:44.922 ERROR 19788 --- [   XNIO-1 I/O-7] o.s.w.s.adapter.HttpWebHandlerAdapter    : [1eea088e] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:45.234 ERROR 19788 --- [   XNIO-1 I/O-9] o.s.w.s.adapter.HttpWebHandlerAdapter    : [2eda0ca7] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:45.403 ERROR 19788 --- [   XNIO-1 I/O-3] o.s.w.s.adapter.HttpWebHandlerAdapter    : [5b9032e6] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:45.809 ERROR 19788 --- [  XNIO-1 I/O-12] o.s.w.s.adapter.HttpWebHandlerAdapter    : [5a97543] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:45.945 ERROR 19788 --- [  XNIO-1 I/O-10] o.s.w.s.adapter.HttpWebHandlerAdapter    : [763359e6] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:46.453 ERROR 19788 --- [  XNIO-1 I/O-12] o.s.w.s.adapter.HttpWebHandlerAdapter    : [56d36ff9] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:48.061 ERROR 19788 --- [  XNIO-1 I/O-10] o.s.w.s.adapter.HttpWebHandlerAdapter    : [197b2b83] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:48.323 ERROR 19788 --- [  XNIO-1 I/O-10] o.s.w.s.adapter.HttpWebHandlerAdapter    : [604b9645] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:48.960 ERROR 19788 --- [  XNIO-1 I/O-10] o.s.w.s.adapter.HttpWebHandlerAdapter    : [2e027a8c] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:51.631 ERROR 19788 --- [  XNIO-1 I/O-10] o.s.w.s.adapter.HttpWebHandlerAdapter    : [26daf88] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:48:52.696 ERROR 19788 --- [   XNIO-1 I/O-7] o.s.w.s.adapter.HttpWebHandlerAdapter    : [b0c7fad] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:49:00.164 ERROR 19788 --- [   XNIO-1 I/O-4] o.s.w.s.adapter.HttpWebHandlerAdapter    : [13a55f3c] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)
2019-02-25 12:49:05.064 ERROR 19788 --- [   XNIO-1 I/O-2] o.s.w.s.adapter.HttpWebHandlerAdapter    : [558cfb93] Error [org.xnio.channels.FixedLengthOverflowException] for HTTP GET "/consume", but ServerHttpResponse already committed (200 OK)

Sampling in JVisualVM shows that the 12 XNIO threads are using 900ms/1000ms. Looking at a thread dump, none of them have Spring (Boot or Framework) code in their stacks. Each thread's stack looks like the following:

"XNIO-1 I/O-1" #13 prio=5 os_prio=31 tid=0x00007ff1216c9800 nid=0x5707 runnable [0x000070000185d000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method)
        at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198)
        at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:117)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x00000005c06b5e08> (a sun.nio.ch.Util$3)
        - locked <0x00000005c06b5df8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000005c06b5640> (a sun.nio.ch.KQueueSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:101)
        at org.xnio.nio.WorkerThread.run(WorkerThread.java:511)

   Locked ownable synchronizers:
        - None

I can't tell if this is a bug in Spring Framework, Undertow, or XNIO, but it does not look to be anything to do with Spring Boot itself. I think we should get the Framework team to take a look to see if they can narrow down the cause.

@philwebb philwebb transferred this issue from spring-projects/spring-boot Feb 25, 2019
@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Feb 25, 2019
@rstoyanchev
Copy link
Contributor

The FixedLengthOverflowException is also reported under #22690.

@SillyMoo
Copy link

I have just come across this problem, digging a bit further I found that there is a loop where the onWritePossible handler in 'AbstractListenerWriteProcessor' is constantly called, the current state is 'REQUESTED'.

@colinmorelli
Copy link

colinmorelli commented Sep 5, 2019

Following up on @SillyMoo comments above, I'm seeing the same behavior. I have noticed that it does not happen when the computation of the response can happen synchronously

I.E. All else being the same, the following route will trigger this behavior:

POST("/test") {
    CompletableFuture.supplyAsync(Supplier { "test" }, executor).toMono().flatMap {
      ok().body(BodyInserters.fromObject(it))
    }
}

Whereas the following route will not cause this:

POST("/test") {
    CompletableFuture.completedFuture("test").toMono().flatMap {
      ok().body(BodyInserters.fromObject(it))
    }
}

(CompletableFuture is being used in this example as that's where I discovered the issue in my app)

It's also worth noting that this issue, in addition to causing high CPU usage, also causes the request that triggered the behavior to time out - at least in my testing.

If I override the Undertow bean to replace the Webflux handler with a custom Undertow handler, as such:

bean<UndertowReactiveWebServerFactory> {
  object: UndertowReactiveWebServerFactory() {
    override fun getWebServer(httpHandler: HttpHandler?): WebServer {
      return UndertowWebServer(
        Undertow.builder()
          .addHttpListener(8080, "localhost")
          .setHandler(Handlers.routing()
            .post("/test") {
              it.dispatch(executor, Runnable {
                it.responseSender.send("test")
              })
            }),
        true
      )
    }
  }
}

The issue does not happen. It does seem to specifically be an issue in Webflux's handling of the request.

@rstoyanchev
Copy link
Contributor

rstoyanchev commented Sep 11, 2019

I found that there is a loop where the onWritePossible handler in 'AbstractListenerWriteProcessor' is constantly called, the current state is 'REQUESTED'.

There was recent fix in AbstractWriteListenerProcessor in exactly this area, see dd22b8f. I confirmed that with Spring Framework 5.1.10.BUILD-SNAPSHOT the CPU returns to normal. I believe this issue has the same underlying cause as #23096.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: duplicate A duplicate of another issue
Projects
None yet
Development

No branches or pull requests

6 participants