The following script lists the <ar-save-item> elements on the page.
import requests
from bs4 import BeautifulSoup

def getrec():
    key = "Paneer"
    url = "http://allrecipes.com/search/results/?wt=" + key + "&sort=re"
    print(url)
    response = requests.get(url)
    result_page = BeautifulSoup(response.content, 'lxml')
    # Find every <ar-save-item> element on the results page
    r = result_page.find_all('ar-save-item')
    # Prints the whole list of tags, not just the data-id values
    print(r)

getrec()
However, I want to display only the data-id value of each tag. How can I get it?
The output currently looks like this:
[<ar-save-item class="favorite" data-id="73715" data-imageurl="'http://images.media-allrecipes.com/userphotos/250x250/00/42/82/428269.jpg'" data-name='"Paneer"' data-type="'Recipe'"></ar-save-item>, <ar-save-item class="favorite" data-id="212521" data-imageurl="'http://images.media-allrecipes.com/userphotos/250x250/00/32/99/329922.jpg'" data-name='"Shahi Paneer"' data-type="'Recipe'"></ar-save-item>, <ar-save-item class="favorite" data-id="221826" data-imageurl="'http://images.media-allrecipes.com/userphotos/250x250/01/03/63/1036376.jpg'" data-name='"Palak Paneer (Indian Spinach and Paneer)"' data-type="'Recipe'"></ar-save-item>
The result I need:
data-id="73715"
data-id="212521"
and so on. Please help.
Answer 0 (score: 0)
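Each res is a BeautifulSoup Tag, which gives dict-like access to its attributes. You can get the value with res['data-id'] or with the get() method, res.get('data-id'). It is better to use get(), because it returns None when the tag has no data-id attribute, whereas using 'data-id' as a key on res raises an exception (KeyError) in that case.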
import requests
from bs4 import BeautifulSoup

def getrec():
    key = "Paneer"
    url = "http://allrecipes.com/search/results/?wt=" + key + "&sort=re"
    response = requests.get(url)
    result_page = BeautifulSoup(response.content, 'lxml')
    r = result_page.find_all('ar-save-item')
    for res in r:
        # get() returns None instead of raising if data-id is missing
        print('data-id =', res.get('data-id'))

getrec()
Output
data-id = 73715
data-id = 212521
data-id = 221826
data-id = 212756
data-id = 232201
data-id = 222787
data-id = 232203
data-id = 240652
data-id = 138127
data-id = 256164
data-id = 221828
data-id = 212814
data-id = 106159
data-id = 159147
data-id = 86602
data-id = 237491
data-id = 213235
data-id = 228957
data-id = 228899
data-id = 232202
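
To see the difference between res['data-id'] and res.get('data-id') without hitting the live site, here is a minimal sketch on a static snippet. The HTML string and the data_ids helper are made up for illustration (the second tag deliberately has no data-id), and the built-in html.parser is used instead of lxml only to avoid the extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical sample standing in for the search results page;
# the second tag deliberately lacks a data-id attribute.
html = """
<ar-save-item class="favorite" data-id="73715"></ar-save-item>
<ar-save-item class="favorite"></ar-save-item>
"""

soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all("ar-save-item")
first, second = tags


def data_ids(tags):
    """Collect data-id values, skipping tags without the attribute."""
    return [tag.get("data-id") for tag in tags if tag.get("data-id") is not None]


# Subscript access works when the attribute is present.
print("data-id =", first["data-id"])

# get() returns None for a missing attribute instead of raising.
print("data-id =", second.get("data-id"))

# Subscript access on a missing attribute raises KeyError.
try:
    second["data-id"]
except KeyError:
    print("second tag has no data-id")

print(data_ids(tags))
```

If you only want the IDs that actually exist, a filter like the one in data_ids keeps the None values out of the result.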