org.apache.tomcat.jni.TestSocketServer timeout test failures (was: VOTE 9.0.27)

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

org.apache.tomcat.jni.TestSocketServer timeout test failures (was: VOTE 9.0.27)

Konstantin Kolinko
Hi!

Splitting from "VOTE 9.0.27" thread.

вт, 8 окт. 2019 г. в 21:00, Mark Thomas <[hidden email]>:

>
> On 08/10/2019 17:49, Mark Thomas wrote:
> > On 07/10/2019 17:40, Igal Sapir wrote:
> >
> > <snip/>
> >
> >> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt:   FAILED
> >> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-Socket.timeoutSet
> >> failed (<1s) [999760800] +-[400]
> >> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-junit.framework.AssertionFailedError:
> >> Socket.timeoutSet failed (<1s) [999760800] +-[400]
> >> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-   at
> >> org.apache.tomcat.jni.TestSocketServer.testBlockingReadFromClientWithTimeout(TestSocketServer.java:111)
> >>
> >> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-
> >> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-Testcase: testPort
> >> took 0.001 sec
> >
> > That is a false positive. The timing variance is significantly greater
> > than the error margin. It looks like we need to adjust that. Interesting
> > that the socket waited a little less than a second to timeout. We'd need
> > to increase the error margin by 3 orders of magnitude (400ns to 0.4ms)
> > for that test to pass.
> >
> > I'll start looking into these but more eyes are always welcome.

Running on Windows 10, AdoptOpenJDK Java 8u222, I also observe this
test failing (and saw it for similar runs for older release as well).

In my case the numbers are:

Testcase: testBlockingReadFromClientWithTimeout took 1,032 sec
    FAILED
Socket.timeoutSet failed (<1s) [998666400] +-[200]
junit.framework.AssertionFailedError: Socket.timeoutSet failed (<1s)
[998666400] +-[200]
    at org.apache.tomcat.jni.TestSocketServer.testBlockingReadFromClientWithTimeout(TestSocketServer.java:111)

Socket.timeoutSet failed (<1s) [987526600] +-[200]
Socket.timeoutSet failed (<1s) [991705200] +-[200]

Digging into this, I see that
- Socket.timeoutSet is essentially a direct wrapper around
apr_socket_timeout_set
- Socket.recv is essentially a direct wrapper around apr_socket_recv
and those APR methods a wrappers around Windows Socket API
(setsockopt for setting timeout, WSARecv for reading)

https://svn.apache.org/viewvc/apr/apr/branches/1.6.x/network_io/win32/
https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-setsockopt
https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsarecv

I think that [998666400] for practical purposes is the same as 1s (as
far as we are verifying our own API), and we cannot improve how
precise is socket timeout handling in Windows.

Thus let's increase the error margin.

BTW,
setsockopt method in Windows accepts a value in milliseconds. I do not
know what clock it uses to measure the timeout. In my example of
[987526600] the difference from expected value is 14 msecs. It seems
similar to the granularity of System.currentTimeMillis().

How about a margin of 100 msec?

Best regards,
Konstantin Kolinko

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: org.apache.tomcat.jni.TestSocketServer timeout test failures (was: VOTE 9.0.27)

markt
On 09/10/2019 00:51, Konstantin Kolinko wrote:

> Hi!
>
> Splitting from "VOTE 9.0.27" thread.
>
> вт, 8 окт. 2019 г. в 21:00, Mark Thomas <[hidden email]>:
>>
>> On 08/10/2019 17:49, Mark Thomas wrote:
>>> On 07/10/2019 17:40, Igal Sapir wrote:
>>>
>>> <snip/>
>>>
>>>> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt:   FAILED
>>>> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-Socket.timeoutSet
>>>> failed (<1s) [999760800] +-[400]
>>>> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-junit.framework.AssertionFailedError:
>>>> Socket.timeoutSet failed (<1s) [999760800] +-[400]
>>>> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-   at
>>>> org.apache.tomcat.jni.TestSocketServer.testBlockingReadFromClientWithTimeout(TestSocketServer.java:111)
>>>>
>>>> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-
>>>> TEST-org.apache.tomcat.jni.TestSocketServer.NIO2.txt-Testcase: testPort
>>>> took 0.001 sec
>>>
>>> That is a false positive. The timing variance is significantly greater
>>> than the error margin. It looks like we need to adjust that. Interesting
>>> that the socket waited a little less than a second to timeout. We'd need
>>> to increase the error margin by 3 orders of magnitude (400ns to 0.4ms)
>>> for that test to pass.
>>>
>>> I'll start looking into these but more eyes are always welcome.
>
> Running on Windows 10, AdoptOpenJDK Java 8u222, I also observe this
> test failing (and saw it for similar runs for older release as well).
>
> In my case the numbers are:
>
> Testcase: testBlockingReadFromClientWithTimeout took 1,032 sec
>     FAILED
> Socket.timeoutSet failed (<1s) [998666400] +-[200]
> junit.framework.AssertionFailedError: Socket.timeoutSet failed (<1s)
> [998666400] +-[200]
>     at org.apache.tomcat.jni.TestSocketServer.testBlockingReadFromClientWithTimeout(TestSocketServer.java:111)
>
> Socket.timeoutSet failed (<1s) [987526600] +-[200]
> Socket.timeoutSet failed (<1s) [991705200] +-[200]
>
> Digging into this, I see that
> - Socket.timeoutSet is essentially a direct wrapper around
> apr_socket_timeout_set
> - Socket.recv is essentially a direct wrapper around apr_socket_recv
> and those APR methods a wrappers around Windows Socket API
> (setsockopt for setting timeout, WSARecv for reading)
>
> https://svn.apache.org/viewvc/apr/apr/branches/1.6.x/network_io/win32/
> https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-setsockopt
> https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsarecv
>
> I think that [998666400] for practical purposes is the same as 1s (as
> far as we are verifying our own API), and we cannot improve how
> precise is socket timeout handling in Windows.
>
> Thus let's increase the error margin.
>
> BTW,
> setsockopt method in Windows accepts a value in milliseconds. I do not
> know what clock it uses to measure the timeout. In my example of
> [987526600] the difference from expected value is 14 msecs. It seems
> similar to the granularity of System.currentTimeMillis().
>
> How about a margin of 100 msec?

When I wrote the test I made the (clearly incorrect) assumption that the
resolution of System.nanoTime() was going to be the same as the
resolution of the mechanism that manages socket timeouts.

Given your research above, 100ms seems perfectly reasonable to me.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: [hidden email]
For additional commands, e-mail: [hidden email]