tomcat with epollWait always return keyCount=0, but the read buffer still have data and cause many close_wait

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

tomcat with epollWait always return keyCount=0, but the read buffer still have data and cause many close_wait

Dan Zheng
1. Environment
OS: CentOS Linux release 7.8.2003 (Core)

JDK: java version "1.8.0_181"
Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)

Tomcat: tomcat-embed-core-8.5.32 with spring-boot-2.0.4-RELEASE

2. Reproduce Steps

1. request large data download api, then request 2000+ request to another lightweight api
2. the system now is Full GC, and the 2000+ request will blocked, then close all these request
3. after system throw OufOfMemoryError, the memory will be released, the cpu and memory  occupation is normal, the system is ok to visit mysql database and execute the schedule job
4. request any api, the response is always slow, and too may close_wait


3. Problem Shooting
a) I check the thread with jstack, ClientPoller in NioEndpoint, BlockPoller in NioBlockingSelector are both in

"http-nio-8043-ClientPoller-0" #83 daemon prio=5 os_prio=0 tid=0x00007faf58b50000 nid=0x4a82 runnable [0x00007faefc8d7000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x00000000e3c8b478> (a sun.nio.ch.Util$3)
        - locked <0x00000000e3c8b468> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000e3c8b330> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:798)
        at java.lang.Thread.run(Thread.java:748)

   Locked ownable synchronizers:
        - None
================================================================================
"NioBlockingSelector.BlockPoller-1" #72 daemon prio=5 os_prio=0 tid=0x00007faf591fa800 nid=0x4a77 runnable [0x00007faefd3e2000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        - locked <0x00000000e3c8bb78> (a sun.nio.ch.Util$3)
        - locked <0x00000000e3c8bb68> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00000000e3c8ba40> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:298)

   Locked ownable synchronizers:
================================================================================

i hook the process to see what are the two poller select(selectorTimeout) return, 

the keyCount = 0, but the read buffer have 191 byte to read, why epollWait always return keyCount = 0?

the expected behavior is, tomcat can read the data from buffer and the close the socket successfully

c) test with another method
I change the protocol to "Http11Nio2Protocol", and the close_wait will be recycled

I set java.nio.channels.spi.SelectorProvider with PollSelectorProvider, the close_wait will be recycled too
Reply | Threaded
Open this post in threaded view
|

Re: tomcat with epollWait always return keyCount=0, but the read buffer still have data and cause many close_wait

Rémy Maucherat
On Tue, Feb 23, 2021 at 4:07 AM Dan Zheng <[hidden email]> wrote:

> 1. Environment
> OS: CentOS Linux release 7.8.2003 (Core)
>
> JDK: java version "1.8.0_181"
> Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
>
> Tomcat: tomcat-embed-core-8.5.32 with spring-boot-2.0.4-RELEASE
>
> 2. Reproduce Steps
>
> 1. request large data download api, then request 2000+ request to
> another lightweight api
> 2. the system now is Full GC, and the 2000+ request will blocked, then
> close all these request
> 3. after system throw OufOfMemoryError, the memory will be released, the
> cpu and memory  occupation is normal, the system is ok to visit mysql
> database and execute the schedule job
> 4. request any api, the response is always slow, and too may close_wait
> [image: image.png]
>
> 3. Problem Shooting
> a) I check the thread with jstack, ClientPoller in NioEndpoint,
> BlockPoller in NioBlockingSelector are both in
>
> "http-nio-8043-ClientPoller-0" #83 daemon prio=5 os_prio=0
> tid=0x00007faf58b50000 nid=0x4a82 runnable [0x00007faefc8d7000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000000e3c8b478> (a sun.nio.ch.Util$3)
>         - locked <0x00000000e3c8b468> (a
> java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000e3c8b330> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at
> org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:798)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
>
> ================================================================================
> "NioBlockingSelector.BlockPoller-1" #72 daemon prio=5 os_prio=0
> tid=0x00007faf591fa800 nid=0x4a77 runnable [0x00007faefd3e2000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000000e3c8bb78> (a sun.nio.ch.Util$3)
>         - locked <0x00000000e3c8bb68> (a
> java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000e3c8ba40> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at
> org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:298)
>
>    Locked ownable synchronizers:
>
> ================================================================================
>
> b) arthas check https://github.com/alibaba/arthas
> i hook the process to see what are the two poller select(selectorTimeout)
> return,
>
> the keyCount = 0, but the read buffer have 191 byte to read, why epollWait
> always return keyCount = 0?
>
> the expected behavior is, tomcat can read the data from buffer and the
> close the socket successfully
>
> c) test with another method
> I change the protocol to "Http11Nio2Protocol", and the close_wait will be
> recycled
>
> I set java.nio.channels.spi.SelectorProvider
> with PollSelectorProvider, the close_wait will be recycled too
>

As you found out, this is a JVM bug and there are workarounds if you
experience it:
https://bz.apache.org/bugzilla/show_bug.cgi?id=63802

Rémy