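First off, I'm just a web programmer, so I have very little real programming experience. I've been given a list of 30,000 URLs, and I'm not going to waste my time clicking through each one to check whether it is valid. Is there a way to read through the text file they are in and have a program check each line?
The code I have so far is in Java, since that's all I know, so if there's a better language, please let me know. Here's what I have so far: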
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;

public class UrlCheck {
    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.google.com");
        //Need to change this to make it read from text file
        try {
            InputStream inp = null;
            try {
                inp = url.openStream();
            } catch (UnknownHostException ex) {
                System.out.println("Invalid");
            }
            if (inp != null) {
                System.out.println("Valid");
            }
        } catch (MalformedURLException exc) {
            exc.printStackTrace();
        }
    }
}
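Answer 0 (score: 2)
First, use a BufferedReader to read the file line by line and check each line. The code below should work. What to do when you encounter an invalid URL is up to you: you can print it, as shown here, or write it to another file.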
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.UnknownHostException;

public class UrlCheck {
    public static void main(String[] args) throws IOException {
        // "_filename" is a placeholder for the path to the text file of URLs
        BufferedReader br = new BufferedReader(new FileReader("_filename"));
        String line;
        while ((line = br.readLine()) != null) {
            if (checkUrl(line)) {
                System.out.println("URL " + line + " was OK");
            } else {
                System.out.println("URL " + line + " was not VALID"); //handle error as you like
            }
        }
        br.close();
    }

    private static boolean checkUrl(String pUrl) throws IOException {
        try {
            // Parse inside the try so a malformed line is caught below
            URL url = new URL(pUrl);
            InputStream inp = null;
            try {
                inp = url.openStream();
            } catch (UnknownHostException ex) {
                System.out.println("Invalid");
                return false;
            }
            if (inp != null) {
                inp.close();
                System.out.println("Valid");
                return true;
            }
        } catch (MalformedURLException exc) {
            exc.printStackTrace();
            return false;
        }
        return true;
    }
}
The checkUrl method can also be simplified as follows:
private static boolean checkUrl(String pUrl) {
    URL url = null;
    InputStream inp = null;
    try {
        url = new URL(pUrl);
        inp = url.openStream();
        return inp != null;
    } catch (IOException e) {
        e.printStackTrace();
        return false;
    } finally {
        try {
            if (inp != null) {
                inp.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
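For the "write it to another file" option mentioned above, here is a minimal sketch. The class name UrlReport, the method writeInvalid, and the file names urls.txt and invalid-urls.txt are all made up for illustration; the check itself is the same open-the-stream test as the simplified checkUrl, so it inherits the same limits (a slow server will block until the default timeout).

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class UrlReport {
    // Same idea as the simplified checkUrl: true if the URL can be opened.
    public static boolean checkUrl(String pUrl) {
        try (InputStream inp = new URL(pUrl).openStream()) {
            return true;
        } catch (IOException e) {
            // Covers MalformedURLException, UnknownHostException, 404s, etc.
            return false;
        }
    }

    // Reads one URL per line from input and writes the failing ones to output.
    // Returns the number of URLs that failed the check.
    public static int writeInvalid(Path input, Path output) throws IOException {
        int invalid = 0;
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String candidate = line.trim();
                if (candidate.isEmpty()) {
                    continue; // skip blank lines
                }
                if (!checkUrl(candidate)) {
                    out.write(candidate);
                    out.newLine();
                    invalid++;
                }
            }
        }
        return invalid;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical file names; replace them with real paths.
        int invalid = writeInvalid(Paths.get("urls.txt"), Paths.get("invalid-urls.txt"));
        System.out.println(invalid + " invalid URLs written to invalid-urls.txt");
    }
}
```

Using try-with-resources closes the reader, the writer, and each stream even when a check fails partway through the list.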
Answer 1 (score: 0)
You can use HttpURLConnection. If the URL is invalid, you won't get a response back.
HttpURLConnection connection = null;
try {
    URL myurl = new URL("http://www.myURL.com");
    connection = (HttpURLConnection) myurl.openConnection();
    //Use a HEAD request to reduce load
    connection.setRequestMethod("HEAD");
    int code = connection.getResponseCode();
    System.out.println("" + code);
} catch (IOException e) {
    //Handle invalid URL (MalformedURLException is an IOException too)
} finally {
    if (connection != null) {
        connection.disconnect();
    }
}
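The HEAD approach could be wrapped in a reusable method along these lines. This is only a sketch: the class name HeadCheck, the methods headStatus and isReachable, the -1 "no answer" convention, and the choice to count any 2xx or 3xx status as reachable are assumptions, not part of the answer above. Timeouts are set so one dead host cannot stall a run over 30,000 URLs.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class HeadCheck {
    // Returns the HTTP status code, or -1 if the address is malformed,
    // is not an HTTP(S) URL, or the server cannot be reached in time.
    public static int headStatus(String address, int timeoutMillis) {
        HttpURLConnection connection = null;
        try {
            URLConnection raw = new URL(address).openConnection();
            if (!(raw instanceof HttpURLConnection)) {
                return -1; // e.g. a file: URL
            }
            connection = (HttpURLConnection) raw;
            connection.setRequestMethod("HEAD"); // headers only, no body
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            return connection.getResponseCode();
        } catch (IOException e) {
            return -1;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    // Assumption: any 2xx or 3xx status counts as a working URL.
    public static boolean isReachable(String address, int timeoutMillis) {
        int code = headStatus(address, timeoutMillis);
        return code >= 200 && code < 400;
    }

    public static void main(String[] args) {
        // Example address taken from the question; any URL works here.
        String address = "http://www.google.com";
        System.out.println(address + " -> " + headStatus(address, 5000));
    }
}
```

Some servers reject HEAD (for example with 405 Method Not Allowed), so a URL that fails this check may still be worth retrying with a normal GET before it is reported as dead.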
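Answer 2 (score: 0)
I'm not sure how much experience you have, but a multithreaded solution would work well here. While reading the text file, store the URLs in a thread-safe structure and let a number of threads work through it, each trying to open connections. This will be more efficient, since testing 30,000 URLs one at a time as you read them could take a while.
If you're unsure how, take a look at this producer-consumer example: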
http://www.journaldev.com/1034/java-blockingqueue-example-implementing-producer-consumer-problem
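As a rough sketch of the multithreaded idea, the version below uses a fixed-size ExecutorService instead of a hand-written producer-consumer queue; the pool keeps its own thread-safe work queue internally, so the effect is similar with less code. The class name ParallelUrlCheck, the methods checkAll and canOpen, the file name urls.txt, and the pool size of 20 are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Predicate;

public class ParallelUrlCheck {
    // Runs checker on every URL using a pool of worker threads.
    // Results keep the input order; duplicate URLs collapse into one entry.
    public static Map<String, Boolean> checkAll(List<String> urls, Predicate<String> checker, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> futures = new ArrayList<>();
            for (String url : urls) {
                futures.add(pool.submit(() -> checker.test(url)));
            }
            Map<String, Boolean> results = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                boolean ok;
                try {
                    ok = futures.get(i).get(); // waits for this URL's check to finish
                } catch (ExecutionException e) {
                    ok = false; // the checker threw; treat the URL as invalid
                }
                results.put(urls.get(i), ok);
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    // Default check: true if the URL can be opened.
    public static boolean canOpen(String address) {
        try (InputStream in = new URL(address).openStream()) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical input file with one URL per line.
        List<String> urls = Files.readAllLines(Paths.get("urls.txt"));
        Map<String, Boolean> results = checkAll(urls, ParallelUrlCheck::canOpen, 20);
        results.forEach((url, ok) -> System.out.println("URL " + url + (ok ? " was OK" : " was not VALID")));
    }
}
```

Because the checks spend most of their time waiting on the network, a pool much larger than the number of CPU cores is usually reasonable here; the right size depends on your connection and on how polite you want to be to the target servers.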
Answer 3 (score: 0)
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

public class UrlCheck {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://www.google.com");
            //Open the HTTP connection
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            //Get the HTTP response code
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) //200 OK, so the URL is valid
            {
                System.out.println("Valid");
            } else //Any other response code: treat the URL as not valid
            {
                System.out.println("Invalid");
            }
        } catch (MalformedURLException ex) {
            System.out.println("Invalid");
        } catch (IOException ex) {
            System.out.println("Invalid");
        }
    }
}