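String[] urls contains URLs as strings (the code reads each URL's InputStream).
I cannot iterate past the first index (index 0) of String[] urls, even though the exit condition of the for loop is i < urls.length.
Note: it works when String[] urls has size 1. I tested it with a String[] urls of size 2, and in that case only the first index is iterated, not the second. I am only interested in the content inside the <body> block (hence the if (s.contains("<br>")) check).
Any ideas as to why this happens?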
public void readData(String[] urls) {
    for (int i = 0; i < urls.length; i++) {
        System.out.println(i); // for a String[] urls of size 2, only 0 gets printed.
                               // I want both 0 and 1 printed
        String str = "";
        try {
            URL url = new URL(urls[i]);
            URLConnection conn = url.openConnection();
            BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
            String s;
            while ((s = in.readLine()) != null) {
                if (s.contains("<br>")) {
                    str += s;
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        System.out.println(str); // for a String[] urls of size 2,
                                 // only the input stream of the first index gets printed.
                                 // I want both to be printed
    }
}
Edit: This is a sample of the HTML I want to read (what each element of String[] urls returns)
<html>
<head>
<title>
Title
</title>
</head>
<body>
Name1 Age1 Hometown1<br>
Name2 Age2 Hometown2<br>
Name3 Age3 Hometown3<br>
</body>
</html>
Answer 0 (score: 2)
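I tested this and your code runs fine. Check the HTML you are fetching from the URLs and make sure it contains the "<br>" tag, since that is your filter condition. Alternatively, remove the condition and you will get all of the HTML.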
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class Main {
    public static void readData(String[] urls) {
        for (int i = 0; i < urls.length; i++) {
            String str = "";
            try {
                URL url = new URL(urls[i]);
                URLConnection conn = url.openConnection();
                BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
                String s;
                while ((s = in.readLine()) != null) {
                    if (s.contains("<br>")) {
                        str += s;
                    }
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
            System.out.println("Url No. " + i + "\n\n");
            System.out.println(str + "\n");
        }
    }

    public static void main(String[] args) {
        String[] urls = {"http://google.com", "http://google.com"};
        readData(urls);
    }
}
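If the loop itself is fine, one possible explanation (an assumption, not something confirmed in the question) is that the second connection simply blocks: URLConnection uses a timeout of 0 by default, which means it waits indefinitely. Below is a minimal sketch of the same loop with explicit timeouts and a try-with-resources block so each reader is closed. The class name BrLineReader, the helper names, the 5-second timeout, and the UTF-8 charset are all hypothetical choices for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class BrLineReader {
    // Hypothetical timeout; pick a value that suits your servers.
    private static final int TIMEOUT_MS = 5000;

    // Concatenates every line from the reader that contains "<br>".
    static String collectBrLines(BufferedReader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        String s;
        while ((s = in.readLine()) != null) {
            if (s.contains("<br>")) {
                sb.append(s);
            }
        }
        return sb.toString();
    }

    // Same filter applied to in-memory HTML, handy for checking it without a network.
    static String collectBrLinesFromString(String html) {
        try (BufferedReader in = new BufferedReader(new StringReader(html))) {
            return collectBrLines(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static void readData(String[] urls) {
        for (int i = 0; i < urls.length; i++) {
            System.out.println(i);
            try {
                URLConnection conn = new URL(urls[i]).openConnection();
                // Without these calls the connection may wait forever.
                conn.setConnectTimeout(TIMEOUT_MS);
                conn.setReadTimeout(TIMEOUT_MS);
                // Charset is assumed to be UTF-8 here.
                try (BufferedReader in = new BufferedReader(
                        new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                    System.out.println(collectBrLines(in));
                }
            } catch (IOException e) {
                // A timeout or a bad URL is reported, then the loop moves on to the next index.
                e.printStackTrace();
            }
        }
    }
}
```

With timeouts set, a stalled URL should raise a SocketTimeoutException (caught here as an IOException) instead of hanging, so the loop can still reach index 1. Splitting the filter into collectBrLines also lets you verify it against the sample HTML from the question without making any requests.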