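I'm new to Java, so please provide some sample code if possible. The situation: I have HTML stored in a text file. I need to read the file and find the string that follows the pattern "data-name", and I need to do this across the whole file so that I get every string that comes after "data-name". I did some research online. I have already fetched the HTML and stored it in a text file, and I know I probably need a regular expression. Please help me out. Thanks, everyone!
Below is my code for fetching the HTML. The result comes out concatenated.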
    public static void main(String[] args) {
        try {
            URL url = new URL("https://twitter.com/search?q=%23JENOSMROOKIESOPENFOLBACK&src=tren");
            // read text returned by server
            BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
            String line;
            PrintWriter out = new PrintWriter(new FileWriter("C:/Users/Desktop/htmlsourcecode.txt"));
            while ((line = in.readLine()) != null) {
                System.out.println(line);
                out.print(line);
            }
            out.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
Answer 0 (score: 1)
Something like this:
    // External resource(s).
    BufferedReader in = null;
    PrintWriter out = null;
    try {
        URL url = new URL(
                "https://twitter.com/search?q=%23JENOSMROOKIESOPENFOLBACK&src=tren");
        // read text returned by server
        in = new BufferedReader(new InputStreamReader(
                url.openStream()));
        String line;
        // out = new PrintWriter(new FileWriter(
        //         "htmlsourcecode.txt"));
        final String DATA_NAME = "data-name=\"";
        while ((line = in.readLine()) != null) {
            int pos1 = line.indexOf(DATA_NAME); // opening position.
            if (pos1 > -1) { // did we match?
                // Add the length of the string, so pos1 now points at
                // the first character of the value.
                pos1 += DATA_NAME.length();
                // find the closing quote. Search from pos1 itself so an
                // empty value (data-name="") is not skipped over.
                int pos2 = line.indexOf("\"", pos1);
                if (pos2 > -1) {
                    String dataName = line.substring(pos1,
                            pos2);
                    System.out.println(dataName);
                    // out.print(line);
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // Close external resource(s).
        if (in != null) {
            try {
                in.close();
            } catch (IOException e) {
                // Nothing useful can be done if close() fails.
            }
        }
        if (out != null) {
            out.close();
        }
    }
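Since the question mentions regular expressions and asks for every occurrence, here is a minimal regex-based sketch as an alternative. The indexOf approach reports at most one match per line, whereas Matcher.find() in a loop picks up every match in whatever text it is given. The class name DataNameExtractor, the method extractDataNames and the sample markup are made up for illustration, and the pattern assumes the attribute value is always wrapped in double quotes. Regexes are fragile on real-world HTML, so a proper HTML parser may be the safer choice if the markup varies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DataNameExtractor {
    // Captures everything between data-name=" and the next double quote.
    // Assumption: the attribute value is always double-quoted.
    private static final Pattern DATA_NAME =
            Pattern.compile("data-name=\"([^\"]*)\"");

    /** Returns every data-name value found in the given text, in order. */
    public static List<String> extractDataNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher matcher = DATA_NAME.matcher(html);
        // find() advances to the next match each time it is called.
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical sample input; the real page markup will differ.
        String sample = "<div data-name=\"Alice\"></div>"
                + "<div data-name=\"Bob\"></div>";
        for (String name : extractDataNames(sample)) {
            System.out.println(name);
        }
    }
}
```

To run this over the saved file, read the file's contents into a String first (for example with new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8)) and pass that to extractDataNames. Because the matcher works on the whole string, it does not matter that the file was written without line breaks.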