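My question is very similar to this, as yet unanswered, StackOverflow question about mysterious connection problems. Sometimes (only under certain conditions in specific environments, in particular when trying to reach one particular URL from AWS), the HTTP connection consistently fails with no apparent cause.
Background:
I have been able to reproduce this in two AWS EC2 server environments (though not locally), but only when hitting one particular client's web service URL (all other URLs running the same service work fine).
My Java version: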
# java -version
java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)
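The machine I'm trying to hit runs a RESTful web service (in Tomcat, probably fronted by Apache, on a Windows machine). I can curl the same endpoint my code is trying to reach, from the same instance my code runs on, and get a valid response in ~48-120 ms. From code, I hit my configured 10-second timeout. Running netstat on both machines shows the following. This is from my server (the one I'm making the request FROM):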
$ netstat -cowtune | grep <remote_ip>
tcp 0 389 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> ESTABLISHED 501 33146 on (0.08/2/0)
tcp 0 389 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> ESTABLISHED 501 33146 on (0.22/3/0)
tcp 0 389 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> ESTABLISHED 501 33146 on (1.50/4/0)
tcp 0 389 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> ESTABLISHED 501 33146 on (0.48/4/0)
tcp 0 389 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> ESTABLISHED 501 33146 on (4.07/5/0)
tcp 0 389 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> ESTABLISHED 501 33146 on (3.05/5/0)
tcp 0 389 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> ESTABLISHED 501 33146 on (2.03/5/0)
tcp 0 389 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> ESTABLISHED 501 33146 on (1.00/5/0)
tcp 0 389 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> ESTABLISHED 501 33146 on (18446744073.69/5/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (8.20/6/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (7.18/6/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (6.15/6/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (5.13/6/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (4.11/6/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (3.09/6/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (2.07/6/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (1.05/6/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (0.03/6/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (17.46/7/0)
tcp 0 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> FIN_WAIT1 0 0 on (16.44/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (15.42/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (14.39/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (13.37/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (12.35/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (11.33/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (10.31/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (9.29/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (8.27/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (7.25/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (6.23/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (5.21/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (4.19/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (3.17/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (2.15/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (1.13/7/0)
tcp 1 390 ::ffff:10.91.184.202:40153 ::ffff:<remote_ip>:<port> CLOSING 0 0 on (0.11/7/0)
This is from the remote server (the one I'm making the request TO):
D:\Cygwin>netstat -ant 1 | grep 54.81.126.17
TCP <ip_address>:<port> 54.81.126.17:40153 SYN_RECEIVED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 ESTABLISHED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 ESTABLISHED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 ESTABLISHED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 ESTABLISHED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 ESTABLISHED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 ESTABLISHED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 ESTABLISHED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 ESTABLISHED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 ESTABLISHED InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
TCP <ip_address>:<port> 54.81.126.17:40153 FIN_WAIT_2 InHost
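At my configured 10-second timeout, my server shows the transition from ESTABLISHED to FIN_WAIT_1. Some time later, my server shows the transition from FIN_WAIT_1 to CLOSING, at the same time as the remote server transitions from ESTABLISHED to FIN_WAIT_2. The remote Tomcat never registers receiving the request. TShark shows: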
0.000000 10.182.160.132 -> <remote_ip> TCP 74 49486 > http-alt [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSval=1814494 TSecr=0 WS=128
0.035580 <remote_ip> -> 10.182.160.132 TCP 70 http-alt > 49486 [SYN, ACK] Seq=0 Ack=1 Win=8192 Len=0 MSS=1380 SACK_PERM=1 TSval=101011325 TSecr=1814494
0.035601 10.182.160.132 -> <remote_ip> TCP 66 49486 > http-alt [ACK] Seq=1 Ack=1 Win=14600 Len=0 TSval=1814503 TSecr=101011325
0.035935 10.182.160.132 -> <remote_ip> HTTP 457 POST /service/rest/security/myEndpoint HTTP/1.1
0.171137 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
0.443125 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
0.987118 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
2.079144 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
4.263141 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
8.631153 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
10.036939 10.182.160.132 -> <remote_ip> TCP 66 49486 > http-alt [FIN, ACK] Seq=392 Ack=1 Win=14600 Len=0 TSval=1817003 TSecr=101011325
10.072638 <remote_ip> -> 10.182.160.132 TCP 66 [TCP Window Update] http-alt > 49486 [ACK] Seq=1 Ack=1 Win=64296 Len=0 TSval=101012329 TSecr=1814503
17.351131 10.182.160.132 -> <remote_ip> HTTP 457 [TCP Retransmission] POST /service/rest/security/myEndpoint HTTP/1.1
20.584358 <remote_ip> -> 10.182.160.132 TCP 66 http-alt > 49486 [FIN, ACK] Seq=1 Ack=1 Win=64296 Len=0 TSval=101013380 TSecr=1814503
20.584421 10.182.160.132 -> <remote_ip> TCP 66 49486 > http-alt [ACK] Seq=393 Ack=2 Win=14600 Len=0 TSval=1819640 TSecr=1
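One thing the capture shows: every packet from the remote side carries Ack=1, so the POST segment is never acknowledged and the client's TCP stack keeps resending it, with the gaps between sends roughly doubling each time. The following is a minimal sketch that checks this doubling using the timestamps copied from the capture. The class and method names are hypothetical, chosen only for illustration, and the 0.1 tolerance is an arbitrary assumption.

```java
class RetransmitBackoff {

    // Send times (seconds) of the POST segment and its retransmissions,
    // copied from the TShark capture above.
    static final double[] SEND_TIMES = {
        0.035935, 0.171137, 0.443125, 0.987118,
        2.079144, 4.263141, 8.631153, 17.351131
    };

    /** Gaps between consecutive transmissions of the same segment. */
    static double[] gaps(double[] times) {
        double[] result = new double[times.length - 1];
        for (int i = 1; i < times.length; i++) {
            result[i - 1] = times[i] - times[i - 1];
        }
        return result;
    }

    /** True if every gap is about twice the previous one (ratio within 2.0 +/- tolerance). */
    static boolean looksLikeExponentialBackoff(double[] gaps, double tolerance) {
        for (int i = 1; i < gaps.length; i++) {
            double ratio = gaps[i] / gaps[i - 1];
            if (Math.abs(ratio - 2.0) > tolerance) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] intervals = gaps(SEND_TIMES);
        for (double interval : intervals) {
            System.out.printf("%.3f s%n", interval);
        }
        System.out.println("exponential backoff: "
                + looksLikeExponentialBackoff(intervals, 0.1));
    }
}
```

A doubling pattern like this suggests the request bytes are being dropped somewhere on the path after the handshake, rather than the application on the far end being slow to respond.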
My old code:
InputStream getResponseStream(String webServiceUrl) throws IOException {
    URL server = new URL(webServiceUrl);
    HttpURLConnection connection = (HttpURLConnection) server.openConnection();
    connection.setDoInput(true);
    connection.setDoOutput(true);
    connection.setRequestMethod("GET");
    return connection.getInputStream(); // timeout happens here
}
My better code (this and the following):
private Object getResponse(HttpURLConnection connection,
        SdRestResponseType respType) throws IOException, JAXBException,
        ProtocolException {
    InputStream is = null;
    try {
        // check if valid response
        int responseCode = connection.getResponseCode(); // timeout happens here
        if (responseCode == HttpURLConnection.HTTP_OK) {
            is = connection.getInputStream();
            switch (respType) {
            case BOOLEAN:
                return Boolean.valueOf(readInput(is));
            case STRING:
                return readInput(is);
            case XML:
                Unmarshaller unmarshaller = context.createUnmarshaller();
                return unmarshaller.unmarshal(is);
            default:
                return null;
            }
        }
        is = connection.getErrorStream();
        Unmarshaller unmarshaller = context.createUnmarshaller();
        Object response = unmarshaller.unmarshal(is);
        if (response instanceof Fault) {
            throw new SdFaultException((Fault) response);
        }
        throw new ProtocolException(connection.getResponseMessage());
    } finally {
        if (is != null) {
            is.close();
        }
    }
}
The code that creates the HttpURLConnection object that performs the request:
private HttpURLConnection getConnection(String operation, boolean xmlContent)
        throws IOException {
    URL server = new URL(baseUrl + operation);
    HttpURLConnection connection = (HttpURLConnection) server
            .openConnection();
    connection.setDoInput(true);
    connection.setDoOutput(true);
    connection.setReadTimeout(10000);
    connection.setRequestMethod(POST); // the remote endpoint accepts this request as either a GET or POST just fine, except from this code
    connection.setRequestProperty(CONTENT_TYPE, (xmlContent ? XML_ENCODED
            : URL_ENCODED));
    // set header values
    connection.addRequestProperty(CLIENT_ID, header.getClientID());
    if (header.getLocale() != null) {
        connection.addRequestProperty(LOCALE, header.getLocale());
    }
    if (header.getSessionToken() != null) {
        connection.addRequestProperty(SESSION, header.getSessionToken());
    }
    if (this.passthrough != null) {
        connection.addRequestProperty(PASSTHRU, this.passthrough);
    }
    return connection;
}
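The pattern in the traces (handshake succeeds, then the read timeout fires) can be imitated locally. Below is a rough sketch, not the production code: it opens a loopback listener that never calls accept() and never answers, then probes it with separate connect and read timeouts. The class name, method names, result strings, and timeout values are all made up for illustration, and it assumes the OS completes the handshake from the listen backlog, as typical TCP stacks do.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.Proxy;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

class SilentPeerProbe {

    /** One GET attempt; reports the outcome as a string instead of throwing. */
    static String probe(String url, int connectTimeoutMs, int readTimeoutMs) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(url).openConnection(Proxy.NO_PROXY);
            connection.setConnectTimeout(connectTimeoutMs); // bounds the TCP handshake
            connection.setReadTimeout(readTimeoutMs);       // bounds the wait for response bytes
            connection.setRequestMethod("GET");
            return "OK " + connection.getResponseCode();
        } catch (SocketTimeoutException e) {
            // The message distinguishes a connect timeout from a read timeout.
            return "TIMEOUT " + e.getMessage();
        } catch (IOException e) {
            return "IO_ERROR " + e.getMessage();
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    /** Probes a local listener that completes the handshake but never answers. */
    static String probeSilentLocalPeer(int readTimeoutMs) {
        ServerSocket silent = null;
        try {
            // accept() is never called, so no response is ever written.
            silent = new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"));
            String url = "http://127.0.0.1:" + silent.getLocalPort() + "/";
            return probe(url, 1000, readTimeoutMs);
        } catch (IOException e) {
            return "IO_ERROR " + e.getMessage();
        } finally {
            if (silent != null) {
                try {
                    silent.close();
                } catch (IOException ignored) {
                    // nothing useful to do while cleaning up
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(probeSilentLocalPeer(500));
    }
}
```

Setting the connect timeout separately from the read timeout helps tell a failed handshake from a peer that is reachable but silent. In the traces above the handshake completes, so only the read timeout would be expected to fire.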
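My server (the FROM box) runs Linux, Apache, and my application in Tomcat. All DNS lookups resolve without any surprises. In all other respects, connectivity between the boxes seems normal (though I have not gone exhaustively through my iptables configuration). When I step through the code, everything looks normal until execution disappears into the compiled code of sun.net.www.protocol.http.HttpURLConnection.getInputStream().
In GrepCode's OpenJDK source, line 710 shows an IOException being swallowed, but since the source of the Oracle version is proprietary (and so I can't find it anywhere), I'm wondering if anyone knows, or could point me to, what might be happening, since I haven't been able to completely rule out a problem in the server environment.
Thanks in advance for any insight!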
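Answer 0 (score: 4)
Answering my own question?:
Never trust their IT staff.
After double- and triple-checking, it turned out that the remote server is fronted by an active intrusion detection system that blocks all unknown IP addresses. Because AWS instances can change their IP when cycled, even though the known IP had been whitelisted, that only worked until my instance was bounced. Lesson learned: asking "Could you be blocking us?" requires a nauseating level of specificity.
Why they let curl through is still a mystery, until I get a reply email and can update this answer...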