Question

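My goal is to use JSoup to extract the ingredient list from a recipe page. I managed to get the first list entry from the website, but my for loop seems to stop at the first entry without collecting the next 5.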

I'm not sure what I'm doing wrong, so I'd appreciate it if you could take a look at my code:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;

public class WebScrape {
    public static void main(String[] args) {
        scrapeBBC("https://www.bbcgoodfood.com/recipes/spanish-omelette");
    }

    static void scrapeBBC(String url){
        try{
            Document recipe = Jsoup.connect(url).get();

            for(Element ingredients : recipe.select("section.recipe__ingredients.col-12.mt-md.col-lg-6")){
                //TODO: if problems occur with null entries add if-else as suggested in the video
                int row = 0;
                final String ingredient = ingredients.select(
                        "li.list-item--separator.list-item.pt-xxs.pb-xxs:nth-of-type("+ row++ +")").text();
                System.out.println(row + ingredients.select(
                        "li.list-item--separator.list-item.pt-xxs.pb-xxs:nth-of-type("+ row++ +")").text());

                //System.out.println(row + ingredient);
            }

        }catch(IOException ioe){
            System.out.println("Unable to connect to the URL.");
            ioe.printStackTrace();
        }
    }
}

Thanks in advance!

Answer 1

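Your for loop iterates over the elements matched by the section selector, and the page appears to contain only one such section, so the loop body runs just once (and row is reset to 0 on every pass). Instead, first select the ingredients section.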

Element ingredients = recipe.select("section.recipe__ingredients.col-12.mt-md.col-lg-6").first();

Then iterate over the <li> elements inside that section.

int row = 0;
for (Element ingredient : ingredients.select("li.list-item--separator.list-item.pt-xxs.pb-xxs")) {
    System.out.println(++row + " : " + ingredient.text());
}

By the way, your selectors don't need to be that specific; the following selectors work just as well.

recipe.select("section.recipe__ingredients")
ingredients.select("li")
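
Putting the pieces together, here is a minimal sketch of the whole approach using the shorter selectors. It parses an inline HTML string rather than fetching the live page, so it runs offline. The class name IngredientLister, the helper extractIngredients, and the sample markup and ingredient texts are all made up for illustration; the real page's markup may differ. The null check is an addition, guarding against the section not being found.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class IngredientLister {
    // Hypothetical stand-in for the recipe page; the real markup may differ.
    static final String SAMPLE_HTML =
            "<section class=\"recipe__ingredients col-12 mt-md col-lg-6\">"
          + "<ul>"
          + "<li class=\"list-item\">500g new potatoes</li>"
          + "<li class=\"list-item\">1 onion</li>"
          + "<li class=\"list-item\">6 eggs</li>"
          + "</ul></section>";

    // Returns the text of every <li> inside the ingredients section,
    // or an empty list if the section is not present.
    static List<String> extractIngredients(String html) {
        Document recipe = Jsoup.parse(html);
        List<String> result = new ArrayList<>();

        // selectFirst returns null when nothing matches the selector.
        Element section = recipe.selectFirst("section.recipe__ingredients");
        if (section == null) {
            return result;
        }

        // Loop over the list items, not over the section itself.
        for (Element li : section.select("li")) {
            result.add(li.text());
        }
        return result;
    }

    public static void main(String[] args) {
        int row = 0;
        for (String ingredient : extractIngredients(SAMPLE_HTML)) {
            System.out.println(++row + " : " + ingredient);
        }
    }
}
```

To run this against the live site, you could pass Jsoup.connect(url).get() as the Document instead of parsing a string, as in the question's code.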
