Best way to read a text file line by line and feed each line into code

Time: 2014-04-10 08:13:12

Tags: java

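First off, I'm only a web programmer, so I have very little experience with actual programming. I've been given a list of 30,000 URLs, and I'm not going to waste my time clicking through each one to check whether it is valid. Is there a way to read through the text file they are in and have a program check each line?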

The code I currently have is in Java because that's all I know, so if there is a better language, please let me know. Here is what I have so far:

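import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;

public class UrlCheck {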

    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.google.com");
        //Need to change this to make it read from text file
        try {
            InputStream inp = null;
            try {
                inp = url.openStream();
            } catch (UnknownHostException ex) {
                System.out.println("Invalid");
            }
            if (inp != null) {
                System.out.println("Valid");
            }
        } catch (MalformedURLException exc) {
            exc.printStackTrace();
        }
    }
}

4 Answers:

Answer 0 (score: 2)

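First, use a BufferedReader to read the file line by line and check each line. The code below should work. What to do when you hit an invalid URL is up to you: you can print it, as I show here, or write it to another file.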

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;

public class UrlCheck {

    public static void main(String[] args) throws IOException {

        BufferedReader br = new BufferedReader(new FileReader("_filename"));
        String line;
        while ((line = br.readLine()) != null) {
           if(checkUrl(line)) {
               System.out.println("URL " + line + " was OK");
           } else {
               System.out.println("URL " + line + " was not VALID"); //handle error as you like
           }
        }

        br.close();
    }

    private static boolean checkUrl(String pUrl) throws IOException {
        try {
            // Parse inside the try block so a malformed line is reported instead of aborting the run
            URL url = new URL(pUrl);
            InputStream inp = null;

            try {
                inp = url.openStream();
            } catch (UnknownHostException ex) {
                System.out.println("Invalid");
                return false;
            }
            if (inp != null) {
                inp.close(); // release the connection once we know it opened
                System.out.println("Valid");
                return true;
            }
        } catch (MalformedURLException exc) {
            exc.printStackTrace();
            return false;
        }

        return true;
    }
}

The checkUrl method can also be simplified as follows:

private static boolean checkUrl(String pUrl) {
    URL url = null;
    InputStream inp = null;
    try {
        url = new URL(pUrl);
        inp = url.openStream();

        return inp != null;
    } catch (IOException e) {
        e.printStackTrace();
        return false;
    } finally {
        try {
            if (inp != null) {
                inp.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
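
As a rough sketch of the "write it to another file" option mentioned above, the loop could collect the rejected lines and save them in one go. The class name UrlFileCheck, the helper findInvalid, and the file names urls.txt and invalid-urls.txt are made up for illustration; the check itself treats any IOException as "invalid", which is an assumption you may want to refine.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class UrlFileCheck {

    // Reads the input line by line and returns the lines the checker rejects.
    // Blank lines are skipped; surrounding whitespace is trimmed.
    static List<String> findInvalid(BufferedReader reader, Predicate<String> checker) {
        return reader.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .filter(line -> !checker.test(line))
                .collect(Collectors.toList());
    }

    // Assumption: a URL counts as valid if a stream to it can be opened at all.
    static boolean checkUrl(String pUrl) {
        try (InputStream inp = new URL(pUrl).openStream()) {
            return true;
        } catch (IOException e) {
            // Covers malformed URLs, unknown hosts, 404s and connection errors
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file names; pass real paths as arguments
        Path input = Paths.get(args.length > 0 ? args[0] : "urls.txt");
        Path output = Paths.get(args.length > 1 ? args[1] : "invalid-urls.txt");

        List<String> invalid;
        try (BufferedReader reader = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            invalid = findInvalid(reader, UrlFileCheck::checkUrl);
        }
        Files.write(output, invalid, StandardCharsets.UTF_8);
        System.out.println(invalid.size() + " invalid URL(s) written to " + output);
    }
}
```

The try-with-resources blocks close the reader and each stream automatically, so no explicit finally is needed.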

Answer 1 (score: 0)

You can use HttpURLConnection. If the URL is not valid, you won't get a response back (the call throws an exception instead).

HttpURLConnection connection = null;
try {
    URL myurl = new URL("http://www.myURL.com");
    connection = (HttpURLConnection) myurl.openConnection();

    // Use a HEAD request to reduce load (no response body is transferred)
    connection.setRequestMethod("HEAD");
    int code = connection.getResponseCode();
    System.out.println("" + code);
} catch (IOException e) {
    // Handle invalid URL (MalformedURLException is a subclass of IOException)
}
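
To plug this into a loop over the file, the snippet could be wrapped in a method that returns a boolean. The following is only a sketch: the class name HeadCheck, the 5-second timeouts, and the rule that 2xx and 3xx codes count as reachable are all assumptions, not something the question specifies.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

class HeadCheck {

    // Assumption: any 2xx or 3xx status means the URL is reachable.
    static boolean isReachable(int code) {
        return code >= 200 && code < 400;
    }

    static boolean checkUrl(String address) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) new URL(address).openConnection();
            connection.setRequestMethod("HEAD");
            // Timeouts (in milliseconds) keep one dead host from stalling the whole run
            connection.setConnectTimeout(5000);
            connection.setReadTimeout(5000);
            return isReachable(connection.getResponseCode());
        } catch (IOException | ClassCastException e) {
            // IOException: malformed URL, unknown host, timeout, refused connection.
            // ClassCastException: the line is not an http/https URL.
            return false;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}
```

Some servers reject HEAD requests, so a URL that fails here may still be worth retrying with GET.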

Answer 2 (score: 0)

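I'm not sure how much experience you have, but a multithreaded solution could be used here. While reading the text file, store the URLs in a thread-safe structure and let several threads work through it, each trying to open connections. This should be more efficient, because testing 30,000 URLs one at a time as you read them could take a while.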

If you are unsure how, have a look at a producer-consumer example:

http://www.journaldev.com/1034/java-blockingqueue-example-implementing-producer-consumer-problem
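
For a rough idea of what that could look like in code, here is a minimal sketch. Note that it uses an ExecutorService thread pool rather than a hand-written BlockingQueue producer-consumer; the pool's internal work queue plays the role of the thread-safe structure. The class name ParallelUrlCheck, the method checkAll, and the idea of passing the checker in as a Predicate are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

class ParallelUrlCheck {

    // Runs the checker on every URL using a fixed-size pool.
    // Results come back in the same order as the input list.
    static List<Boolean> checkAll(List<String> urls, Predicate<String> checker, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<Boolean> task = () -> checker.test(url);
                futures.add(pool.submit(task));
            }

            List<Boolean> results = new ArrayList<>();
            for (Future<Boolean> future : futures) {
                try {
                    results.add(future.get()); // blocks until this check has finished
                } catch (ExecutionException e) {
                    results.add(false); // the checker itself threw; count it as invalid
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    results.add(false);
                }
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

The checker can be any method that takes a URL string and returns true or false, and the thread count is something to tune: most of the time is spent waiting on the network, so a pool larger than the number of CPU cores is usually reasonable.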

Answer 3 (score: 0)

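import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

public class UrlCheck {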

    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.google.com");
            //Open the Http connection
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            //Get the http response code
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) //if the http response code is 200 OK, the url is valid
            {
                System.out.println("Valid");
            } else //Else the url is not valid
            {
                System.out.println("Invalid");
            }
        } catch (MalformedURLException ex) {
            System.out.println("Invalid");
        } catch (IOException ex) {
            System.out.println("Invalid");
        }
    }
}