import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
public class JavaApplication14 {
public static void main(String[] args) {
try {
Document doc = Jsoup.connect("tanmoy_mahathir.makes.org/thimble/146").get();
String html= "<html><head></head>" + "<body><p>Parsed HTML into a doc."
+ "</p></body></html>";
Elements paragraphs = doc.select("p");
for(Element p : paragraphs)
System.out.println(p.text());
} catch (IOException ex) {
Logger.getLogger(JavaApplication14.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
Can anyone help me with jsoup code that parses the section containing the paragraphs, so that it prints only
Hello ,World!
Nothing is impossible
Answer 0 (score: 3)
For this small piece of HTML, all you need is
String html = "<html><head></head>" + "<body><p>Parsed HTML into a doc."
        + "</p></body></html>";
Document doc = Jsoup.parse(html);
Elements paragraphs = doc.select("p");
for(Element p : paragraphs)
System.out.println(p.text());
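Since the page at your link contains almost the same HTML, you can also replace the definition of doc with the following (note the https:// prefix; Jsoup.connect needs an http or https URL):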
Document doc = Jsoup.connect("https://tanmoy_mahathir.makes.org/thimble/146").get();
Update
Here is the complete code, which compiles and runs fine.
import java.io.IOException;
import java.util.logging.*;
import org.jsoup.*;
import org.jsoup.nodes.*;
import org.jsoup.select.*;
public class JavaApplication14 {
public static void main(String[] args) {
try {
String url = "https://tanmoy_mahathir.makes.org/thimble/146";
Document doc = Jsoup.connect(url).get();
Elements paragraphs = doc.select("p");
for(Element p : paragraphs)
System.out.println(p.text());
}
catch (IOException ex) {
Logger.getLogger(JavaApplication14.class.getName())
.log(Level.SEVERE, null, ex);
}
}
}
Answer 1 (score: 0)
You can try this first...
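String url = "url of the html page";
// Jsoup.parse(String) treats its argument as HTML markup, so fetch the page with connect(...).get()
Document page = Jsoup.connect(url).get();
// Paragraphs inside any div whose class attribute is exactly "class_name"
Elements elements = page.select("div[class=class_name] p");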
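One thing worth checking with snippets like this: Jsoup.parse(String) interprets its argument as HTML markup and never downloads anything, whereas Jsoup.connect(url).get() fetches the page. The sketch below shows the difference without touching the network; the class name ParseVersusConnect, the helper paragraphCount, and the example.com URL are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ParseVersusConnect {
    // Counts the <p> elements found in whatever string is handed to Jsoup.parse.
    static int paragraphCount(String input) {
        Document doc = Jsoup.parse(input);
        return doc.select("p").size();
    }

    public static void main(String[] args) {
        // A URL string is parsed as plain text, not fetched, so no paragraphs are found.
        System.out.println(paragraphCount("https://example.com/page.html")); // prints 0

        // Real markup (hypothetical here) yields the paragraphs it contains.
        String html = "<div class=\"class_name\"><p>one</p><p>two</p></div>";
        System.out.println(paragraphCount(html)); // prints 2
    }
}
```

If the selector returns nothing on a live page, it may be worth confirming that the document really came from connect(...).get() before adjusting the selector.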
Answer 2 (score: 0)
You can select the tag by its class and then be more specific, for example by getting the first paragraph
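A minimal sketch of that idea, assuming the paragraphs of interest sit inside an element with a known class. The markup, the class names intro and footer, and the helper firstParagraphText are hypothetical; the real page may be structured differently.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FirstParagraphExample {
    // Returns the text of the first <p> nested inside an element with the given
    // class, or null when nothing matches.
    static String firstParagraphText(String html, String className) {
        Document doc = Jsoup.parse(html);
        // ".name p" selects <p> descendants of any element carrying that class.
        Element first = doc.select("." + className + " p").first();
        return first == null ? null : first.text();
    }

    public static void main(String[] args) {
        // Hypothetical markup standing in for the real page.
        String html = "<html><body>"
                + "<div class=\"intro\"><p>Hello ,World!</p><p>Nothing is impossible</p></div>"
                + "<div class=\"footer\"><p>Other text</p></div>"
                + "</body></html>";
        System.out.println(firstParagraphText(html, "intro"));
    }
}
```

To print every paragraph in the chosen container instead of just the first, iterate over the Elements returned by select, as in the accepted answer.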