I'm still learning NIO and came up with a simple Ping/Pong application.
My client is written in NIO. The server is the telnet-endpoint of Memcached.
The client is constantly issuing stats\r\n
and the server responds with some statistics.
This setup is working in a non-blocking way with SocketChannel
and Selector.select()
The part that I do not fully understand is the error case where the connection between client and server get in limbo state or the server went away without closing the connection (During testing I simulate that with a iptables DROP
rule):
I end up (on OSX and Linux) with the following stacktrace on the read
operation after a successful select()
operation:
java.io.IOException: Operation timed out
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_73]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.8.0_73]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.8.0_73]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[na:1.8.0_73]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[na:1.8.0_73]
The code in use looks something like this:
Selector selector = Selector.open();
SocketChannel memcachedClient = SocketChannel.open(MEMCACHED_HOST);
memcachedClient.configureBlocking(false);
int ops = memcachedClient.validOps();
memcachedClient.register(selector, ops, null);
while (true) {
LOGGER.info("Masterloop - Start");
//selector.select();
selector.select(1 * 1000);
LOGGER.debug("Masterloop - After select");
Set<SelectionKey> selectedKeys = selector.selectedKeys();
Iterator<SelectionKey> selectedKeysIterator = selectedKeys.iterator();
LOGGER.debug("Found Selectors {}", selectedKeys.size());
while (selectedKeysIterator.hasNext()) {
SelectionKey key = selectedKeysIterator.next();
LOGGER.info("Key: w={} r={} v={} - {}",
new Object[] { key.isWritable(), key.isReadable(), key.isValid(), key.toString() });
if (key.isReadable()) {
read(key);
} else if (key.isWritable()) {
write(key);
}
selectedKeysIterator.remove();
}
}
What is strange to me are the two observed behaviors depending on the usage of selector.select()
and selector.select(timeout)
:
Selector.select()
blocks and failsWithout a timeout, the Selector.select()
call blocks for ~50 seconds after I let all packages DROP (or unplug the network cable) and comes back with a single key:
2016-07-12 11:51:55 INFO App:88 - Masterloop - Start
2016-07-12 11:52:49 DEBUG App:92 - Masterloop - After select
2016-07-12 11:52:49 DEBUG App:96 - Found Selectors 1
2016-07-12 11:52:49 INFO App:101 - Key: w=false r=true v=true - sun.nio.ch.SelectionKeyImpl@2e817b38
and I end up with the stacktrace above in my read()
method:
private static void read(SelectionKey key) throws IOException {
SocketChannel socketChannel = (SocketChannel) key.channel();
ByteBuffer readBuffer = ByteBuffer.allocate(8192);
int numRead;
try {
numRead = socketChannel.read(readBuffer); // Exception is thrown here
} catch (IOException e) {
LOGGER.warn(e.getMessage(), e);
key.cancel();
socketChannel.close();
return;
}
// do something with readBuffer
key.interestOps(SelectionKey.OP_WRITE);
}
Selector.select(timeout)
returns but throws an IOException later, tooWith an timeout (i.e. 1000 ms), the Selector.select(1000)
call returns after I let all packages DROP (or unplug the network cable) and with zero keys. That happens constantly as my while
-loop continues to run. Again after ~45 seconds the then-active loop iteration returns a single, readable key that ends again with an IOException on read()
2016-07-12 11:54:24 INFO App:113 - ================
2016-07-12 11:54:24 INFO App:88 - Masterloop - Start
2016-07-12 11:54:25 DEBUG App:92 - Masterloop - After select
2016-07-12 11:54:25 DEBUG App:96 - Found Selectors 0
2016-07-12 11:54:25 INFO App:113 - ================
2016-07-12 11:54:25 INFO App:88 - Masterloop - Start
2016-07-12 11:54:26 DEBUG App:92 - Masterloop - After select
2016-07-12 11:54:26 DEBUG App:96 - Found Selectors 0
2016-07-12 11:54:26 INFO App:113 - ================
...
2016-07-12 11:55:16 INFO App:113 - ================
2016-07-12 11:55:16 INFO App:88 - Masterloop - Start
2016-07-12 11:55:17 DEBUG App:92 - Masterloop - After select
2016-07-12 11:55:17 DEBUG App:96 - Found Selectors 0
2016-07-12 11:55:17 INFO App:113 - ================
2016-07-12 11:55:17 INFO App:88 - Masterloop - Start
2016-07-12 11:55:18 DEBUG App:92 - Masterloop - After select
2016-07-12 11:55:18 DEBUG App:96 - Found Selectors 1
2016-07-12 11:55:18 INFO App:101 - Key: w=false r=true v=true - sun.nio.ch.SelectionKeyImpl@2e817b38
2016-07-12 11:55:18 WARN App:30 - Operation timed out
java.io.IOException: Operation timed out
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_73]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.8.0_73]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.8.0_73]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[na:1.8.0_73]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[na:1.8.0_73]
I somewhat understand the behavior in general: It is impossible to judge if the server went away or the connection is still active without actively probing it.
Looking at Wireshark I see a bunch of resubmission (which makes sense as the server never ACKed my client's request) and after the 8th resubmission the IOException
happens.
But how can I influence this timeout? Is that only on the TCP level? If so which settings (kernel?) are these?
I played with some settings but they seem to work only together with KEEP_ALIVE
. Can't I get a feedback from NIO earlier (say 10 seconds) that something went wrong?
Thanks!
答案 0 :(得分:0)
read()
immediately, not after a timeout.