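I am building a program that searches for recipes based on some input (currently ingredients). The program works when I search for only a few ingredients, but with more ingredients it returns a urllib error. I have looked at other questions, but they are about urllib2 and their solutions did not fix my problem.
Link that works (searching a few ingredients) - http://allrecipes.com/search/results/?ingIncl=Chicken&ingExcl=beef&sort=re
Link that does not work (searching more ingredients) - http://allrecipes.com/search/results/?ingIncl=chicken,cheese,egg&ingExcl=lettuce&sort=re
My code: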
from bs4 import BeautifulSoup
import urllib
import urllib.request

html = "http://allrecipes.com/search/results/?ingIncl='chicken', 'cheese', 'egg'&ingExcl='lettuce'&sort=re"
number = 5

def get_recipes(html, number):  # this doesnt work if there are a few ingredients
    html = urllib.request.urlopen(html)
    soup = BeautifulSoup(html, "html.parser")
    num_results = soup.find('span', {'class': 'subtext'}).get_text()
    num_results = str(number) + ' out of ' + num_results  # number will have to be changed if less recipes were found than number
    i = 0
    recipe_dict = {}
    for card in soup.find_all('article', {'class': 'grid-col--fixed-tiles'}):  # gets 2 more than required
        try:
            info = card.find('a', {'data-internal-referrer-link': 'hub recipe'})
            link = info.get('href')
            name = info.get_text()
            recipe_dict[name] = link
            if i > (number - 2):  # -2 is temp fix
                break
            else:
                i += 1
        except:
            pass
    print(recipe_dict)
    return recipe_dict

get_recipes(html, number)
Error:
Traceback (most recent call last):
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\Diet Buddy\DB_find_recipes.py", line 39, in <module>
    get_recipes(html, number)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\Diet Buddy\DB_find_recipes.py", line 13, in get_recipes
    html = urllib.request.urlopen(html)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 472, in open
    response = meth(req, response)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 510, in error
    return self._call_chain(*args)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 444, in _call_chain
    result = func(*args)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
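Answer 0 (score: 0)
This looks like a simple typo: the URL in your code contains literal quote characters and spaces around the ingredient names, which the URL you say you intended does not. Compare: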
html = "http://allrecipes.com/search/results/?ingIncl='chicken', 'cheese', 'egg'&ingExcl='lettuce'&sort=re"
and
html = "http://allrecipes.com/search/results/?ingIncl=chicken,cheese,egg&ingExcl=lettuce&sort=re"
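One way to avoid this kind of typo is to build the query string from Python lists instead of writing it by hand. The sketch below is only an illustration: `build_search_url` and `BASE_URL` are hypothetical names, and the parameter names `ingIncl`, `ingExcl` and `sort` are taken from the URLs in the question, not from any documented API of the site. Whether the site still accepts such URLs is not something this sketch checks.

```python
from urllib.parse import urlencode

# Assumption: base address copied from the URLs in the question.
BASE_URL = "http://allrecipes.com/search/results/"


def build_search_url(include, exclude, sort="re"):
    """Build a search URL from lists of ingredient names (hypothetical helper)."""
    params = [
        ("ingIncl", ",".join(include)),  # comma-separated, no quotes or spaces
        ("ingExcl", ",".join(exclude)),
        ("sort", sort),
    ]
    # safe="," keeps the commas literal; other unsafe characters
    # (for example spaces) are percent- or plus-encoded by urlencode.
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_search_url(["chicken", "cheese", "egg"], ["lettuce"])
print(url)
```

The resulting string can then be passed to `get_recipes(url, number)` in place of the hand-written `html` value. Because `urlencode` escapes spaces, an ingredient such as "cream cheese" would no longer produce a raw space in the request.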