Confused about how to parse HTML with Jsoup after making a Volley StringRequest

Time: 2016-04-24 09:32:20

Tags: android jsoup

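I'm new to parsing with Jsoup. I can make a Volley StringRequest to fetch the web page, but I'm having trouble navigating its deeply nested tags and parsing them.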

Remote HTML

//Skipped the meta and header because I don't need it.
...
<body class="sin">
<div class="ks">
    <div class="wrap">

        <div class="content-right-sidebar-wrap">
            <main class="content">

                //A lot of unneeded tags

                <article class="post-1989009 post type-post post" itemscope="" itemtype="http://schema.org/CreativeWork">
                    <header class="post-header"> 
                        <h1 class="post-title" itemprop="headline">Yet Another 6GB RAM Phone: LeEco Le Max 2 Unveiled</h1>
                    </header>

                    //A lot of unneeded tags

                    <div class="post-content" itemprop="text">
                        <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aliquam nec nisi lectus. In consectetur nunc accumsan dui molestie, ut ultricies elit lobortis.
                            <a href="https://website.com/2002/03/odales-cursus-sed-eget-dolor.html">odales cursus sed eget dolor</a> Etiam arcu risus, aliquet porta pharetra non, pharetra in dui..
                        </p>

                        <p>
                            <img class="aligncenter size-full wp-image-19289" src="https://website.com/wp-content/uploads/2002/04/image-39.jpeg" alt="LeEco Le Max 2" width="800" height="450" srcset="https://website.com/wp-content/uploads/2002/09/gutter-bkan.jpeg 800w, https://website.com/wp-content/uploads/2002/09/gutter-bkan-300x169.jpeg 300w, https://website.com/wp-content/uploads/2002/09/gutter-bkan-768x432.jpeg 768w, https://website.com/wp-content/uploads/2002/09/gutter-bkan-265x150.jpeg 265w, https://website.com/wp-content/uploads/2002/09/gutter-bkan-320x180.jpeg 320w" sizes="(max-width: 800px) 100vw, 800px">
                        </p>
                        <p>Sed porta aliquet sollicitudin. Vivamus commodo placerat sapien vitae interdum. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus</p>

                        <p> eu massa volutpat, volutpat ipsum id, maximus risus. Etiam maximus lobortis enim sed eleifend. Integer imperdiet, augue accumsan ultricies faucibus, orci orci porttitor velit, semper fringilla</p>

                            <img class="aligncenter size-full wp-image-19290" src="https://website.com/wp-content/uploads/2002/07/guter-lop.jpeg" alt="LeEco Le Max 2" width="728" height="324" srcset="https://website.com/wp-content/uploads/2002/07/guter-lop.jpeg 728w, https://website.com/wp-content/uploads/2002/07/guter-lop-300x134.jpeg 300w" sizes="(max-width: 728px) 100vw, 728px">
                        </p>
                        <p>Sed nec nunc nec eros vulputate vehicula. Duis laoreet ex vel auctor finibus. Sed semper blandit massa, at molestie ligula vestibulum in. Nulla vestibulum viverra risus vitae fringilla</p>

                        <h2>Luccuii</h2>
                        <p>Leuismod ultrices libero at consequat. Quisque vestibulum vulputate vehicula. Vivamus posuere nibh tincidunt tristique faucibus. Integer sed vulputate dui, a luctus sem. Suspendisse potenti.</p>

                    </div>
                 //Skipped the closing tags
                   ...

I'm using this code to fetch the page and try to parse it.

PostDetails

public class PostDetails extends AppCompatActivity{

    ...


    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_post_details);

        ...
    }

    private void showDialog() {
        internetDialog = new AlertDialog.Builder(PostDetails.this)
        ...
    }

    private void loadPost() {
        Log.d(TAG, "loadPost called");

        final ProgressBar progressBar;
        progressBar = (ProgressBar) findViewById(R.id.progress_circle);
        progressBar.setVisibility(View.VISIBLE);


        String news_id = getIntent().getStringExtra("PostId");
        Log.d(TAG, "You clicked post id " + news_id);

        StringRequest stringRequest = new StringRequest(news_id,
                new Response.Listener<String>() {
                    @Override
                    public void onResponse(String response) {
                        Log.d("Debug", response);
                        if (progressBar != null) {
                            progressBar.setVisibility(View.GONE);
                        }
                        parseHtml(response);


                    }
                },
                new Response.ErrorListener() {
                    @Override
                    public void onErrorResponse(VolleyError error) {
                        // VolleyLog.d takes a format string first; an empty format would log nothing
                        VolleyLog.d("Error: %s", error.getMessage());

                        if (progressBar != null) {
                            progressBar.setVisibility(View.GONE);
                        }

                        final  AlertDialog.Builder sthWrongAlert = new AlertDialog.Builder(PostDetails.this);
                        ...
                        sthWrongAlert.show();
                    }
                });

        // Create the request queue
        RequestQueue requestQueue = Volley.newRequestQueue(this);

        // Add the request to the queue
        requestQueue.add(stringRequest);
    }





    private void parseHtml(String response) {
        Log.d(TAG, "parsinghtml");
        Document document = Jsoup.parse(response);

        //This is where I intend to parse the html
        //Element postTitle = document.select("");
    }


}

I need to parse the text inside <h1 class="post-title" itemprop="headline"> and <div class="post-content" itemprop="text">.

Could you please help me solve this?

1 answer:

Answer 0 (score: 1)

You can do it like this:

{{1}}
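
As a minimal sketch, assuming the markup shown in the question, the two elements could be reached with Jsoup's CSS selectors. The class name PostParser and its two methods are hypothetical names used only for illustration; the Jsoup calls used are Jsoup.parse, select, first and text.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical helper class; not part of the question's code.
final class PostParser {

    private PostParser() {
    }

    /** Returns the headline text, or an empty string when no h1.post-title is found. */
    static String extractTitle(String html) {
        Document document = Jsoup.parse(html);
        // select() returns an Elements list; first() is null when nothing matched
        Element title = document.select("h1.post-title").first();
        return title == null ? "" : title.text();
    }

    /** Returns the text of every non-empty <p> inside div.post-content, in document order. */
    static List<String> extractParagraphs(String html) {
        Document document = Jsoup.parse(html);
        List<String> paragraphs = new ArrayList<>();
        for (Element p : document.select("div.post-content p")) {
            // text() includes the text of nested tags such as <a>
            String text = p.text();
            if (!text.isEmpty()) {
                // paragraphs that only wrap an <img> have no text and are skipped
                paragraphs.add(text);
            }
        }
        return paragraphs;
    }
}
```

Inside parseHtml(response) you could then call PostParser.extractTitle(response) and PostParser.extractParagraphs(response) and show the results in your views. Note that this sketch only collects <p> elements, so headings such as the <h2> inside the content div are left out; widening the selector would be one way to include them.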

Take a look at the Jsoup cookbook.