I'm building a scraper. I'm trying to refactor this code:
for (int i = 1; i < 6; i++) {
    Elements siteElements = document.select("div.grid__col.grid__col--20-80-80.b-products-wrap > ul > li:nth-child(" + i + ")");
    System.out.println(siteElements.select(" > div > div.b-products-list__desc-wrap > div > div.b-products-list__main-content > div.b-products-list__desc-prime > div.b-products-list__manufacturer-holder").select("a").first().text());
    System.out.println(siteElements.select(" > div > div.b-products-list__desc-wrap > div > div.b-products-list__main-content > div.b-products-list__desc-prime > div.b-products-list__title-holder > a").first().text());
    System.out.println(siteElements.select(" div.b-products-list__price-holder > a").first().text());
    System.out.println(siteElements.first().attr("data-ppc-id"));
}
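into the code below (never mind the last line; I improvised it and I know it is wrong). I took this part out of the three System.out.println calls:
> div > div.b-products-list__desc-wrap > div > div.b-products-list__main-content >
and moved it into the siteElements variable (by the way, is that a good name for the variable?):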
for (int i = 1; i < 6; i++) {
    Elements siteElements = document.select("div.grid__col.grid__col--20-80-80.b-products-wrap > ul > li:nth-child(" + i + ") > div > div.b-products-list__desc-wrap > div > div.b-products-list__main-content >");
    System.out.println(siteElements.select(" div.b-products-list__desc-prime > div.b-products-list__manufacturer-holder").select("a").first().text());
    System.out.println(siteElements.select(" div.b-products-list__desc-prime > div.b-products-list__title-holder > a").first().text());
    System.out.println(siteElements.select(" div.b-products-list__price-holder > a").first().text());
    //System.out.println(siteElements.first().attr("data-ppc-id"));
}
But then I get this exception:
Exception in thread "main" org.jsoup.select.Selector$SelectorParseException: Could not parse query '': unexpected token at ''
    at org.jsoup.select.QueryParser.findElements(QueryParser.java:206)
    at org.jsoup.select.QueryParser.parse(QueryParser.java:59)
    at org.jsoup.select.QueryParser.parse(QueryParser.java:42)
    at org.jsoup.select.QueryParser.combinator(QueryParser.java:87)
    at org.jsoup.select.QueryParser.parse(QueryParser.java:67)
    at org.jsoup.select.QueryParser.parse(QueryParser.java:42)
    at org.jsoup.select.Selector.select(Selector.java:91)
    at org.jsoup.nodes.Element.select(Element.java:363)
    at Main.main(Main.java:23)
What am I doing wrong? The site I'm scraping data from: https://merlin.pl/bestseller/?option_80=10349074
Answer 0 (score: 1)
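A selector cannot end with ">", because the combinator needs another selector after it to be valid. That is why the parser reports an empty query ('') in the exception. Remove the trailing ">", or end the selector with "> *" or something similar, and the exception goes away. You may still need to adjust the selectors further to get the elements you want.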
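If you want to keep the shared prefix in one place, here is a minimal sketch in plain Java that guards against a dangling combinator before the string reaches document.select(...). The class SelectorFragments and its two helpers are hypothetical names made up for this illustration, not part of jsoup, and the shortened selector in main is only a stand-in for the real one.

```java
// Hypothetical helper for illustration only; none of this is jsoup API.
class SelectorFragments {

    // Combinators that must be followed by another selector.
    private static final String COMBINATORS = ">+~";

    /** Returns the selector without any dangling combinator at its end. */
    static String stripTrailingCombinator(String selector) {
        String s = selector.trim();
        while (!s.isEmpty() && COMBINATORS.indexOf(s.charAt(s.length() - 1)) >= 0) {
            s = s.substring(0, s.length() - 1).trim();
        }
        return s;
    }

    /** Joins parent and child with exactly one child combinator between them. */
    static String child(String parent, String child) {
        String c = child.trim();
        if (c.startsWith(">")) {
            // Avoid producing "parent > > child".
            c = c.substring(1).trim();
        }
        return stripTrailingCombinator(parent) + " > " + c;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the prefix from the question, trailing ">" included.
        String base = "ul > li:nth-child(1) > div > div.b-products-list__main-content >";

        // Safe to pass to document.select(...): no trailing combinator left.
        System.out.println(stripTrailingCombinator(base));

        // Full selector for one field, built from the shared prefix.
        System.out.println(child(base, "div.b-products-list__price-holder > a"));
    }
}
```

Building the full selector as a string keeps the prefix in one constant, at the cost of re-querying the document for every field. Selecting the container once and calling select on the result with the shorter selectors, as in the question, works equally well once the trailing ">" is gone.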