Using try-catch to skip an exception leads to an infinite loop

Time: 2015-11-27 02:58:36

Tags: java exception exception-handling httpurlconnection connection-timeout

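I have the following problem: for some URLs, BufferedReader hits a connection timeout and throws an exception that stops the whole program. What I need is one of two things. Either I check how long the connection has been opening and, once a threshold is reached (it must be lower than the timeout), skip opening the stream for that URL and move on to the next URL, or I handle the timeout in a way that does not stop the program. Any ideas on how to do this?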

URL url = new URL(line);
URLConnection connection = url.openConnection();
if (connection instanceof HttpURLConnection) {
    HttpURLConnection httpConn = (HttpURLConnection) connection;
    int statusCode = httpConn.getResponseCode();
    // Only read the body for 2xx responses
    if (statusCode >= 200 && statusCode < 300)
        try {
            BufferedReader brURL = new BufferedReader(new InputStreamReader(url.openStream()));
            while ((tempLine = brURL.readLine()) != null) {
                UrlMatcher = UrlPattern.matcher(tempLine);
                java.util.logging.Logger.getLogger(SimpleCrawler.class.getName()).log(Level.SEVERE, tempLine);
                if (UrlMatcher.find()) {
                    String resultURL = UrlMatcher.group();
                    fop.write(resultURL.toLowerCase().getBytes());
                    fop.write(System.getProperty("line.separator").getBytes());
                    System.out.println(resultURL);
                }
            }
        } catch (ConnectException ex) {
            // Swallowed so that the crawler can continue with the next URL
        }
}

This results in the following exception:

Exception in thread "main" java.net.ConnectException: Connection timed out: connect
    at java.net.DualStackPlainSocketImpl.connect0(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:345)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at java.net.Socket.connect(Socket.java:538)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:180)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
    at sun.net.www.http.HttpClient.New(HttpClient.java:308)
    at sun.net.www.http.HttpClient.New(HttpClient.java:326)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1168)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1104)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:998)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:932)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1512)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1440)
    at java.net.URL.openStream(URL.java:1038)
    at simplecrawler.SimpleCrawler.main(SimpleCrawler.java:61)

Edit: I added a try-catch, and now it gets stuck in an infinite loop in another part of the execution.

Edit 2

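After adding a logger call before if (UrlMatcher.find()) inside the while loop, this is the log it shows when it enters the infinite loop (for clarity, I have included the last match before the log):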

 rum-static.pingdom.net/prum.min.js //the last match
SEVERE: var flashvars = {};
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: flashvars.enableAPI = "true";
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: flashvars.galleryURL = "/svgallerysource.asp?galleryid=685";
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: var params = {};
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: params.bgcolor = "222222";
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: params.allowfullscreen = false;
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: params.allowscriptaccess = "always";
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: params.wmode = "transparent";
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: var attributes = {};
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: attributes.id =  "svInstance";
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: attributes.name = "svInstance";
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: simpleviewer.ready(function () {
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: simpleviewer.load('flashContent', '920', '420', '222222', true, flashvars, params, attributes, true); 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: }); 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: </script>
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: <link href="http://cdn-images.mailchimp.com/embedcode/slim-081711.css" rel="stylesheet" type="text/css">
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: <style type="text/css">
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE:      #mc_embed_signup{background:#fff; clear:left; font:14px Helvetica,Arial,sans-serif; }
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE:  </style>
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.0/jquery.min.js"></script>
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: <script type="text/javascript" src="/jplayer/jquery.jplayer.min.js"></script>
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: <script type="text/javascript" src="/jplayer/jquery.jplayer.inspector.js"></script>
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: <link rel="stylesheet" href="/css/colorbox.css" />
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: <script>
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: var _prum = [['id', '5397955dabe53dbb3ea78d70'],
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE:              ['mark', 'firstbyte', (new Date()).getTime()]];
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: (function() {
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE:     var s = document.getElementsByTagName('script')[0]
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE:       , p = document.createElement('script');
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE:     p.async = 'async';
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE:     p.src = '//rum-static.pingdom.net/prum.min.js';
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE:     s.parentNode.insertBefore(p, s);
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: })();
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: </script>
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: <style>
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE:     
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: body
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: {
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: background-color: #ffffff;
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: }
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: div#bodycontainer-home
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: {
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: background-color: 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: #ffffff; 
Nov 27, 2015 6:53:27 PM simplecrawler.SimpleCrawler openConnection
SEVERE: background-image:url(/images/uploaded/540973958472458.png); 

1 answer:

Answer 0: (score: 1)

You should use setConnectTimeout and then catch SocketTimeoutException:

try {
    HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
    con.setConnectTimeout(5000); // set the connect timeout to 5 seconds
    return (con.getResponseCode() == HttpURLConnection.HTTP_OK);
} catch (java.net.SocketTimeoutException e) {
    return false;
}

See the documentation for details.
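
To apply this to the crawler loop in the question, here is a minimal sketch, not the asker's actual code. Note that url.openStream() in the question opens a second connection, so a timeout set on the first one does not apply to it; the sketch reads from the same HttpURLConnection instead. It also sets a read timeout, on the assumption that the apparent "infinite loop" may be readLine() blocking on a stalled server (the log alone does not prove this). The class name TimeoutAwareFetcher, the helper fetchLines, the UTF-8 charset, and the 5000 ms values are all illustrative assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TimeoutAwareFetcher {

    /**
     * Hypothetical helper: returns the lines of the page, or an empty list
     * when the URL should be skipped (bad URL, non-2xx status, timeout, or
     * any other I/O failure).
     */
    static List<String> fetchLines(String address, int connectTimeoutMs, int readTimeoutMs) {
        HttpURLConnection conn = null;
        try {
            URLConnection raw = new URL(address).openConnection();
            if (!(raw instanceof HttpURLConnection)) {
                return Collections.emptyList(); // not an HTTP(S) URL
            }
            conn = (HttpURLConnection) raw;
            conn.setConnectTimeout(connectTimeoutMs); // limit time spent connecting
            conn.setReadTimeout(readTimeoutMs);       // limit time a single read may block

            int status = conn.getResponseCode();
            if (status < 200 || status >= 300) {
                return Collections.emptyList();
            }

            List<String> lines = new ArrayList<>();
            // Read from the SAME connection so that both timeouts apply.
            // UTF-8 is an assumption; a real crawler would inspect Content-Type.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
            return lines;
        } catch (IOException e) {
            // SocketTimeoutException, ConnectException and MalformedURLException
            // are all IOExceptions, so one handler covers them.
            System.err.println("Skipping " + address + ": " + e);
            return Collections.emptyList();
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Each failing URL is skipped and the loop moves on to the next one.
        for (String address : args) {
            List<String> lines = fetchLines(address, 5000, 5000);
            System.out.println(address + " -> " + lines.size() + " lines");
        }
    }
}
```

Catching IOException rather than only SocketTimeoutException is a deliberate choice here: the ConnectException in the question's stack trace is a different subclass, and a crawler usually wants to skip a URL on either failure.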