Question

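I'm working on a personal project that scrapes my university's dining hall menu and returns the daily dessert menu for the upcoming week. I'm using Beautiful Soup to do this, but I'm not sure I'm using it correctly, because my code seems indirect and repetitive. Is there a way to jump straight to my last line without the intermediate steps? Here's what I have now: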

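import bs4 as bs

soup = bs.BeautifulSoup(sauce, 'lxml')  # sauce holds the page's HTML source
weekly_desserts = []  # one list of dessert names per day
for column in soup.find_all('div', class_='menu-details-day'):  # Looks at the menu for each day
    day_desserts = []
    for station in column.find_all('div', class_='menu-details-station'):  # Looks at each station
        if station.h4.string == 'Dessert':
            for item in station.find_all('div', class_='menu-name'):  # Looks at each item served at the dessert station
                day_desserts.append(item.get_text(strip=True))  # append items to list
    weekly_desserts.append(day_desserts)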

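To clarify my expected output: I'm trying to get each dessert item for a day and then append it to a list corresponding to that day. Here is one of the links I'm trying to scrape.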

Answer 1

If you want a nicer approach because you'd rather avoid arrow code, you can use itertools to turn this logic into a generator pipeline:

from itertools import chain

import bs4 as bs

soup = bs.BeautifulSoup(sauce, 'lxml')  # sauce holds the page's HTML source

# extract all stations for each day
stations = chain.from_iterable(
    col.find_all('div', class_='menu-details-station')
    for col in soup.find_all('div', class_='menu-details-day')
)
# keep only the dessert stations and extract their menu items
desserts = chain.from_iterable(
    station.find_all('div', class_='menu-name')
    for station in stations
    if station.h4.string == 'Dessert'
)

for dessert in desserts:
    print(dessert)
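
Note that the pipeline above flattens all days into a single stream, so the per-day grouping the question asks for is lost. If you need one list per day, here is a minimal sketch that keeps the outer loop over days and collapses the inner levels into a single comprehension using CSS selectors. The sample markup and the function name desserts_by_day are made up for illustration; only the class names come from the question, and I'm assuming the real page is structured the same way.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; only the class names are taken from the question.
SAMPLE_HTML = """
<div class="menu-details-day">
  <div class="menu-details-station">
    <h4>Dessert</h4>
    <div class="menu-name">Brownie</div>
    <div class="menu-name">Apple Pie</div>
  </div>
  <div class="menu-details-station">
    <h4>Grill</h4>
    <div class="menu-name">Burger</div>
  </div>
</div>
<div class="menu-details-day">
  <div class="menu-details-station">
    <h4>Dessert</h4>
    <div class="menu-name">Cheesecake</div>
  </div>
</div>
"""


def desserts_by_day(html):
    """Return one list of dessert names per day, in page order."""
    # html.parser is used here so the sketch needs no extra parser install
    soup = BeautifulSoup(html, "html.parser")
    result = []
    for day in soup.select("div.menu-details-day"):
        items = [
            item.get_text(strip=True)
            for station in day.select("div.menu-details-station")
            # skip stations without a heading, keep only the dessert station
            if station.h4 is not None
            and station.h4.get_text(strip=True) == "Dessert"
            for item in station.select("div.menu-name")
        ]
        result.append(items)
    return result


print(desserts_by_day(SAMPLE_HTML))
```

Days with no dessert station simply produce an empty list, so the result stays aligned with the order of the days on the page.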
