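I'm new to parsing with Jsoup. I'm able to make a Volley StringRequest
to fetch the website, but I'm having trouble navigating the complex tags and parsing them.
Remote HTML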
//Skipped the meta and header because I don't need it.
...
<body class="sin">
<div class="ks">
<div class="wrap">
<div class="content-right-sidebar-wrap">
<main class="content">
//A lot of unneeded tags
<article class="post-1989009 post type-post post" itemscope="" itemtype="http://schema.org/CreativeWork">
<header class="post-header">
<h1 class="post-title" itemprop="headline">Yet Another 6GB RAM Phone: LeEco Le Max 2 Unveiled</h1>
</header>
//A lot of unneeded tags
<div class="post-content" itemprop="text">
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam nec nisi lectus. In consectetur nunc accumsan dui molestie, ut ultricies elit lobortis.
<a href="https://website.com/2002/03/odales-cursus-sed-eget-dolor.html">odales cursus sed eget dolor</a> Etiam arcu risus, aliquet porta pharetra non, pharetra in dui..
</p>
<p>
<img class="aligncenter size-full wp-image-19289" src="https://website.com/wp-content/uploads/2002/04/image-39.jpeg" alt="LeEco Le Max 2" width="800" height="450" srcset="https://website.com/wp-content/uploads/2002/09/gutter-bkan.jpeg 800w, https://website.com/wp-content/uploads/2002/09/gutter-bkan-300x169.jpeg 300w, https://website.com/wp-content/uploads/2002/09/gutter-bkan-768x432.jpeg 768w, https://website.com/wp-content/uploads/2002/09/gutter-bkan-265x150.jpeg 265w, https://website.com/wp-content/uploads/2002/09/gutter-bkan-320x180.jpeg 320w" sizes="(max-width: 800px) 100vw, 800px">
</p>
<p>Sed porta aliquet sollicitudin. Vivamus commodo placerat sapien vitae interdum. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus</p>
<p> eu massa volutpat, volutpat ipsum id, maximus risus. Etiam maximus lobortis enim sed eleifend. Integer imperdiet, augue accumsan ultricies faucibus, orci orci porttitor velit, semper fringilla</p>
<p>
<img class="aligncenter size-full wp-image-19290" src="https://website.com/wp-content/uploads/2002/07/guter-lop.jpeg" alt="LeEco Le Max 2" width="728" height="324" srcset="https://website.com/wp-content/uploads/2002/07/guter-lop.jpeg 728w, https://website.com/wp-content/uploads/2002/07/guter-lop-300x134.jpeg 300w" sizes="(max-width: 728px) 100vw, 728px">
</p>
<p>Sed nec nunc nec eros vulputate vehicula. Duis laoreet ex vel auctor finibus. Sed semper blandit massa, at molestie ligula vestibulum in. Nulla vestibulum viverra risus vitae fringilla</p>
<h2>Luccuii</h2>
<p>Leuismod ultrices libero at consequat. Quisque vestibulum vulputate vehicula. Vivamus posuere nibh tincidunt tristique faucibus. Integer sed vulputate dui, a luctus sem. Suspendisse potenti.</p>
</div>
//Skipped the closing tags
...
I'm using this code to fetch the page and try to parse it.
PostDetails
public class PostDetails extends AppCompatActivity {
    ...
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_post_details);
        ...
    }

    private void showDialog() {
        internetDialog = new AlertDialog.Builder(PostDetails.this)
        ...
    }

    private void loadPost() {
        Log.d(TAG, "loadPost called");
        final ProgressBar progressBar = (ProgressBar) findViewById(R.id.progress_circle);
        progressBar.setVisibility(View.VISIBLE);
        // The "PostId" extra is passed to StringRequest as the request URL.
        String news_id = getIntent().getStringExtra("PostId");
        Log.d(TAG, "You clicked post id " + news_id);
        StringRequest stringRequest = new StringRequest(news_id,
                new Response.Listener<String>() {
                    @Override
                    public void onResponse(String response) {
                        Log.d("Debug", response);
                        if (progressBar != null) {
                            progressBar.setVisibility(View.GONE);
                        }
                        parseHtml(response);
                    }
                },
                new Response.ErrorListener() {
                    @Override
                    public void onErrorResponse(VolleyError error) {
                        // VolleyLog.d takes a format string first; an empty format would drop the message.
                        VolleyLog.d("Error: %s", error.getMessage());
                        if (progressBar != null) {
                            progressBar.setVisibility(View.GONE);
                        }
                        final AlertDialog.Builder sthWrongAlert = new AlertDialog.Builder(PostDetails.this);
                        ...
                        sthWrongAlert.show();
                    }
                });
        // Create the request queue
        RequestQueue requestQueue = Volley.newRequestQueue(this);
        // Add the request to the queue
        requestQueue.add(stringRequest);
    }

    private void parseHtml(String response) {
        Log.d(TAG, "parsinghtml");
        Document document = Jsoup.parse(response);
        //This is where I intend to parse the html
        //Element postTitle = document.select("");
    }
}
I need to parse the text inside <h1 class="post-title" itemprop="headline">
and <div class="post-content" itemprop="text">.
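For reference, here is a minimal sketch of what that selection might look like with Jsoup's CSS selectors. It assumes the class names `post-title` and `post-content` each match a single element on the page, and it runs against a cut-down copy of the HTML above rather than the real response. `PostParser`, `extractTitle`, and `extractContent` are hypothetical names used only for illustration; they are not part of the app or of Jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class PostParser {

    // Returns the text of the first <h1 class="post-title">, or "" if none is found.
    static String extractTitle(Document document) {
        Element title = document.select("h1.post-title").first();
        return title != null ? title.text() : "";
    }

    // Returns the combined text of everything inside the first
    // <div class="post-content">, or "" if none is found.
    static String extractContent(Document document) {
        Element content = document.select("div.post-content").first();
        return content != null ? content.text() : "";
    }

    public static void main(String[] args) {
        // Cut-down stand-in for the remote HTML (assumption: the real page has the same classes).
        String html = "<body class=\"sin\"><article>"
                + "<header class=\"post-header\">"
                + "<h1 class=\"post-title\" itemprop=\"headline\">"
                + "Yet Another 6GB RAM Phone: LeEco Le Max 2 Unveiled</h1></header>"
                + "<div class=\"post-content\" itemprop=\"text\">"
                + "<p>Lorem ipsum dolor sit amet.</p>"
                + "<h2>Luccuii</h2>"
                + "<p>Leuismod ultrices libero at consequat.</p>"
                + "</div></article></body>";

        Document document = Jsoup.parse(html);
        System.out.println(extractTitle(document));
        System.out.println(extractContent(document));
    }
}
```

In `parseHtml(String response)` the same two `select(...)` calls could be applied to the `Document` returned by `Jsoup.parse(response)`. `Element.text()` strips the markup, so links and images inside the content contribute only their text, if any.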
Could you please help me with this?