How can I scrape a specific item from a web page with Java (using jsoup)?

Asked: 2019-11-13 22:51:16

Tags: java web-scraping jsoup

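Below is my code for scraping a Yelp page. All I need in the console is the business's website URL. In this example that is just the URL 'cube-rieger.de' (the link text that follows rel="noopener nofollow"):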

<a href="/biz_redir?url=http%3A%2F%2Fwww.cube-rieger.de&amp;website_link_type=website&amp;src_bizid=q_PKB5C34yMiQ8JfvN2gkg&amp;cachebuster=1573659980&amp;s=80a10c01ecab48c960a0145decb9e8f8c7502d7f239f5a799568cfe9ec1748bd" target="_blank" rel="noopener nofollow">cube-rieger.de</a>

Here is my scraping code:

package methoden;

import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JsoupRun {

    public static void main(String[] args) throws IOException {

        String url = "https://www.yelp.com/biz/zahn%C3%A4rzte-dr-g-cube-dr-r-cube-"
                + "und-dr-d-rieger-stuttgart?adjust_creative=LkD6tqXBfUmRYWw5Kapg"
                + "6Q&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&"
                + "utm_source=LkD6tqXBfUmRYWw5Kapg6Q";

        Document document = Jsoup.connect(url).get();
        Elements links = document.select("noopener nofollow");

        for (Element link : links) {
            System.out.println("link : " + link.attr("href"));
            System.out.println("text : " + link.text());
        }
    }
}

Can someone help me solve this problem?

1 answer:

Answer 0 (score: 0)

I guess you are looking for Element.text(), as described on the API page. For the link shown in the question, that should return:

cube-rieger.de
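
Note that Element.text() only helps once the link has actually been selected. In jsoup's selector syntax, "noopener nofollow" reads as "a <nofollow> element inside a <noopener> element", so the loop in the question never runs. The following is a minimal sketch, not a verified fix for the live Yelp page: it assumes the anchor markup still looks like the snippet in the question and parses that snippet offline with Jsoup.parse instead of fetching the page. The class name WebsiteLinkSketch and the helpers websiteTexts and targetUrl are hypothetical names made up for this illustration.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class WebsiteLinkSketch {

    // Anchor markup based on the question (query string shortened for readability).
    static final String SAMPLE_HTML =
            "<a href=\"/biz_redir?url=http%3A%2F%2Fwww.cube-rieger.de&amp;website_link_type=website\""
            + " target=\"_blank\" rel=\"noopener nofollow\">cube-rieger.de</a>";

    /** Collects the visible text of every <a> whose rel attribute equals "noopener nofollow". */
    static List<String> websiteTexts(Document document) {
        List<String> texts = new ArrayList<>();
        // Attribute selector: tag name, then [attribute="value"].
        for (Element link : document.select("a[rel=\"noopener nofollow\"]")) {
            texts.add(link.text());
        }
        return texts;
    }

    /** Returns the decoded "url" query parameter of a redirect href, or null if there is none. */
    static String targetUrl(String href) {
        int queryStart = href.indexOf('?');
        if (queryStart < 0) {
            return null;
        }
        for (String pair : href.substring(queryStart + 1).split("&")) {
            if (pair.startsWith("url=")) {
                // The Charset overload of URLDecoder.decode requires Java 10 or newer.
                return URLDecoder.decode(pair.substring("url=".length()), StandardCharsets.UTF_8);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document document = Jsoup.parse(SAMPLE_HTML);

        for (String text : websiteTexts(document)) {
            System.out.println("text : " + text);
        }
        for (Element link : document.select("a[rel=\"noopener nofollow\"]")) {
            System.out.println("target : " + targetUrl(link.attr("href")));
        }
    }
}
```

In the original program, the same selector string could be passed to document.select(...) on the Document returned by Jsoup.connect(url).get(). If a page contains several links with this rel value, a narrower selector such as a[href^=/biz_redir] may be a better fit; that, too, is an assumption about the page's markup.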