What is the root cause of sporadic connection problems when using sun.net.www.protocol.http.HttpURLConnection.getInputStream()?

Time: 2014-03-25 00:21:15

Tags: java http tomcat request

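My problem is very similar to this, as yet unanswered, StackOverflow question about mysterious connection problems. Sometimes (and only under certain conditions in particular environments, specifically when trying to reach one particular URL from AWS), the HTTP connection fails consistently for no apparent reason.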

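Background:

I have been able to reproduce this in two AWS EC2 server environments (though not locally), but only when hitting one particular customer's web service URL (all other URLs, which run similar services, work fine).

My Java version: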

# java -version
java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

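The machine I am trying to hit runs a RESTful web service (in Tomcat, probably fronted by Apache, on a Windows box). From the same instance my code runs on, I can curl the same endpoint my code tries to reach and get a valid response in ~48-120ms. From code, I hit my configured 10-second timeout. Running netstat on both machines shows the following on my server (the one I am making the request FROM):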

$ netstat -cowtune | grep <remote_ip>
tcp        0    389 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   ESTABLISHED 501        33146      on (0.08/2/0)
tcp        0    389 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   ESTABLISHED 501        33146      on (0.22/3/0)
tcp        0    389 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   ESTABLISHED 501        33146      on (1.50/4/0)
tcp        0    389 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   ESTABLISHED 501        33146      on (0.48/4/0)
tcp        0    389 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   ESTABLISHED 501        33146      on (4.07/5/0)
tcp        0    389 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   ESTABLISHED 501        33146      on (3.05/5/0)
tcp        0    389 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   ESTABLISHED 501        33146      on (2.03/5/0)
tcp        0    389 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   ESTABLISHED 501        33146      on (1.00/5/0)
tcp        0    389 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   ESTABLISHED 501        33146      on (18446744073.69/5/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (8.20/6/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (7.18/6/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (6.15/6/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (5.13/6/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (4.11/6/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (3.09/6/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (2.07/6/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (1.05/6/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (0.03/6/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (17.46/7/0)
tcp        0    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   FIN_WAIT1   0          0          on (16.44/7/0)

tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (15.42/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (14.39/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (13.37/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (12.35/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (11.33/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (10.31/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (9.29/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (8.27/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (7.25/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (6.23/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (5.21/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (4.19/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (3.17/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (2.15/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (1.13/7/0)
tcp        1    390 ::ffff:10.91.184.202:40153  ::ffff:<remote_ip>:<port>   CLOSING     0          0          on (0.11/7/0)

And this is from the remote server (the one I am making the request TO):

D:\Cygwin>netstat -ant 1 | grep 54.81.126.17
TCP    <ip_address>:<port>    54.81.126.17:40153    SYN_RECEIVED    InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    ESTABLISHED     InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    ESTABLISHED     InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    ESTABLISHED     InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    ESTABLISHED     InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    ESTABLISHED     InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    ESTABLISHED     InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    ESTABLISHED     InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    ESTABLISHED     InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    ESTABLISHED     InHost

TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost
TCP    <ip_address>:<port>    54.81.126.17:40153    FIN_WAIT_2      InHost

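At my configured 10-second timeout, my server shows the transition from ESTABLISHED to FIN_WAIT_1. Some time later, my server shows the transition from FIN_WAIT_1 to CLOSING, at the same time as the remote server transitions from ESTABLISHED to FIN_WAIT_2. The remote Tomcat never registers receiving the request. TShark shows: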

0.000000 10.182.160.132 -> <remote_ip> TCP 74 49486 > http-alt [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=1814494 TSecr=0 WS=128
0.035580 <remote_ip> -> 10.182.160.132 TCP 70 http-alt > 49486 [SYN, ACK] Seq=0 Ack=1 Win=8192 Len=0 MSS=1380 SACK_PERM=1 TSval=101011325 TSecr=1814494
0.035601 10.182.160.132 -> <remote_ip> TCP 66 49486 > http-alt [ACK] Seq=1 Ack=1 Win=14600 Len=0 TSval=1814503 TSecr=101011325
0.035935 10.182.160.132 -> <remote_ip> HTTP 457 POST /service/rest/security/myEndpoint HTTP/1.1
0.171137 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
0.443125 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
0.987118 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
2.079144 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
4.263141 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
8.631153 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
10.036939 10.182.160.132 -> <remote_ip> TCP 66 49486 > http-alt [FIN, ACK] Seq=392 Ack=1 Win=14600 Len=0 TSval=1817003 TSecr=101011325
10.072638 <remote_ip> -> 10.182.160.132 TCP 66 [TCP Window Update] http-alt > 49486 [ACK] Seq=1 Ack=1 Win=64296 Len=0 TSval=101012329 TSecr=1814503
17.351131 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
20.584358 <remote_ip> -> 10.182.160.132 TCP 66 http-alt > 49486 [FIN, ACK] Seq=1 Ack=1 Win=64296 Len=0 TSval=101013380 TSecr=1814503
20.584421 10.182.160.132 -> <remote_ip> TCP 66 49486 > http-alt [ACK] Seq=393 Ack=2 Win=14600 Len=0 TSval=1819640 TSecr=1
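
A note on reading the capture above: the POST segment is sent eight times, and every packet from the remote side carries Ack=1, so the 391 bytes of request data are never acknowledged even though the three-way handshake completed. The spacing of the retransmissions looks like ordinary TCP exponential backoff. The sketch below only does the arithmetic on the timestamps copied from the capture; the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RetransmitBackoff {
    // Send times (seconds) of the original POST and its seven retransmissions,
    // copied from the TShark capture above.
    static final double[] SEND_TIMES = {
        0.035935, 0.171137, 0.443125, 0.987118,
        2.079144, 4.263141, 8.631153, 17.351131
    };

    /** Returns the gap between each consecutive pair of send times. */
    static List<Double> gaps(double[] times) {
        List<Double> result = new ArrayList<Double>();
        for (int i = 1; i < times.length; i++) {
            result.add(times[i] - times[i - 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Double> g = gaps(SEND_TIMES);
        for (int i = 0; i < g.size(); i++) {
            // The first gap has no predecessor to compare against.
            String ratio = (i == 0) ? "-" : String.format("%.2f", g.get(i) / g.get(i - 1));
            System.out.printf("retransmission %d: waited %.3f s (ratio to previous gap: %s)%n",
                    i + 1, g.get(i), ratio);
        }
    }
}
```

Each gap comes out at roughly twice the previous one, which is what a sender does when a segment is silently lost. That pattern is consistent with something on the path dropping the data segment after letting the handshake through, though the capture alone cannot say what.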

My old code:

InputStream getResponseStream(String webServiceUrl) {
  URL server = new URL(webServiceUrl);
  HttpURLConnection connection = (HttpURLConnection) server.openConnection();
  connection.setDoInput(true);
  connection.setDoOutput(true);
  connection.setRequestMethod("GET");
  return connection.getInputStream(); // timeout happens here
}

My better code (this block and the one below):

private Object getResponse(HttpURLConnection connection,
        SdRestResponseType respType) throws IOException, JAXBException,
        ProtocolException {
    InputStream is = null;
    try {
        // check if valid response
        int responseCode = connection.getResponseCode(); // timeout happens here
        if (responseCode == HttpURLConnection.HTTP_OK) {
            is = connection.getInputStream();
            switch (respType) {
            case BOOLEAN:
                return Boolean.valueOf(readInput(is));
            case STRING:
                return readInput(is);
            case XML:
                Unmarshaller unmarshaller = context.createUnmarshaller();
                return unmarshaller.unmarshal(is);
            default:
                return null;
            }
        }

        is = connection.getErrorStream();
        Unmarshaller unmarshaller = context.createUnmarshaller();
        Object response = unmarshaller.unmarshal(is);

        if (response instanceof Fault) {
            throw new SdFaultException((Fault) response);
        }
        throw new ProtocolException(connection.getResponseMessage());

    } finally {
        if (is != null) {
            is.close();
        }
    }
}

The code that creates the HttpURLConnection object that makes the request:

private HttpURLConnection getConnection(String operation, boolean xmlContent)
        throws IOException {

    URL server = new URL(baseUrl + operation);
    HttpURLConnection connection = (HttpURLConnection) server
            .openConnection();
    connection.setDoInput(true);
    connection.setDoOutput(true);
    connection.setReadTimeout(10000);
    connection.setRequestMethod(POST); // the remote endpoint accepts this request as either a GET or POST just fine, except from this code
    connection.setRequestProperty(CONTENT_TYPE, (xmlContent ? XML_ENCODED
            : URL_ENCODED));
    // set header values
    connection.addRequestProperty(CLIENT_ID, header.getClientID());
    if (header.getLocale() != null) {
        connection.addRequestProperty(LOCALE, header.getLocale());
    }
    if (header.getSessionToken() != null) {
        connection.addRequestProperty(SESSION, header.getSessionToken());
    }
    if (this.passthrough != null) {
        connection.addRequestProperty(PASSTHRU, this.passthrough);
    }

    return connection;
}
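
One detail worth noting about getConnection above: it sets only a read timeout. The connect timeout keeps its default of 0, which means "wait indefinitely". In the case described here the handshake completed, so it was the read timeout that fired, but setting both makes a hang easier to classify. The following is a minimal sketch, not the code from the question: TimeoutProbe, Outcome, and probe are hypothetical names, and the demo in main uses a local socket that never replies to stand in for a server whose responses never arrive.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

public class TimeoutProbe {
    /** How a single probe attempt ended. */
    enum Outcome { RESPONDED, TIMED_OUT, IO_ERROR }

    /**
     * Makes one GET request with explicit connect and read timeouts
     * (both in milliseconds) and reports how the attempt ended.
     */
    static Outcome probe(String url, int connectTimeoutMs, int readTimeoutMs) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(url).openConnection();
            connection.setConnectTimeout(connectTimeoutMs);
            connection.setReadTimeout(readTimeoutMs);
            connection.setRequestMethod("GET");
            connection.getResponseCode(); // blocks until a status line arrives or a timeout fires
            return Outcome.RESPONDED;
        } catch (SocketTimeoutException e) {
            return Outcome.TIMED_OUT;     // connect or read timeout
        } catch (IOException e) {
            return Outcome.IO_ERROR;      // refused, reset, malformed URL, and so on
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // A listening socket that is never accepted: the OS completes the
        // TCP handshake, but no HTTP response is ever written.
        ServerSocket silent = new ServerSocket(0);
        try {
            String url = "http://127.0.0.1:" + silent.getLocalPort() + "/";
            System.out.println(probe(url, 2000, 500));
        } finally {
            silent.close();
        }
    }
}
```

Logging the outcome together with the remote address and the time makes it easier to line a failure up with a packet capture or with the remote side's firewall logs.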

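My server (the FROM box) runs Linux, Apache, and my application in Tomcat. All DNS lookups resolve without any surprises. In all other respects, connectivity between the boxes seems fine (though I have not gone exhaustively through my iptables configuration). When I step through the code, everything looks normal until execution disappears into the compiled code of sun.net.www.protocol.http.HttpURLConnection.getInputStream().

In GrepCode's OpenJDK source, line 710 shows an IOException being swallowed, but since the source of the Oracle version is proprietary (and so I cannot find it anywhere), I am wondering whether anyone knows, or can point me toward, what might be going on, since I have not been able to completely rule out a problem in the server environment.

Thanks in advance for any insight!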

1 Answer:

Answer 0 (score: 4)

Answering my own question?:

Never trust their IT staff.

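After double and triple checking, it turned out that the remote server is fronted by an active intrusion detection system that blocks all unknown IP addresses. Because AWS instances can change their IPs when they are cycled, even though the known IPs had been whitelisted, it only worked until my instances were bounced. Lesson learned: be sickeningly specific when asking "Could you be blocking us?"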
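
One way to catch this kind of drift earlier is to have the application compare its current public address with the list that was handed to the remote side. The sketch below is only an illustration under several assumptions: it runs on EC2, the instance metadata endpoint answers plain (IMDSv1-style) requests, and the whitelist is known to the application. PublicIpCheck and its methods are hypothetical names; the address in main is the client IP that appears in the remote netstat output earlier.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class PublicIpCheck {
    // EC2 instance metadata path for the public IPv4 address.
    // Assumption: unauthenticated (IMDSv1-style) access is enabled.
    static final String METADATA_URL =
            "http://169.254.169.254/latest/meta-data/public-ipv4";

    /** Reads the first line of the metadata response, i.e. the public IP. */
    static String fetchPublicIp() throws IOException {
        HttpURLConnection c = (HttpURLConnection) new URL(METADATA_URL).openConnection();
        c.setConnectTimeout(2000);
        c.setReadTimeout(2000);
        BufferedReader in = new BufferedReader(
                new InputStreamReader(c.getInputStream(), "UTF-8"));
        try {
            String line = in.readLine();
            return line == null ? "" : line.trim();
        } finally {
            in.close();
        }
    }

    /** True if the current IP is one the remote side was asked to allow. */
    static boolean isWhitelisted(String currentIp, Set<String> whitelistedIps) {
        return whitelistedIps.contains(currentIp);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical list: the addresses previously given to the remote IT staff.
        Set<String> whitelisted = new HashSet<String>(Arrays.asList("54.81.126.17"));
        String current = fetchPublicIp();
        if (!isWhitelisted(current, whitelisted)) {
            System.err.println("WARNING: public IP " + current
                    + " is not on the whitelist; requests may be silently dropped");
        }
    }
}
```

Running a check like this at startup would at least turn "requests time out for no reason" into a log line that names the likely cause. An address that stays the same across restarts avoids the problem altogether.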

Why they allowed curl through remains a mystery until I get a reply email and can update this answer...