urllib.error.HTTPError: HTTP Error 400: Bad Request in a Python function

Asked: 2017-03-19 03:22:27

Tags: python function python-3.5 urllib bs4

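I am writing a program that searches for recipes based on some input (currently ingredients). The program works when I search for only a few ingredients, but with more ingredients it raises a urllib error. I have looked at other questions, but they were about urllib2 and their solutions did not fix my problem.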

Link that works (searching for a few ingredients) - http://allrecipes.com/search/results/?ingIncl=Chicken&ingExcl=beef&sort=re

Link that does not work (searching for more ingredients) - http://allrecipes.com/search/results/?ingIncl=chicken,cheese,egg&ingExcl=lettuce&sort=re

My code:

from bs4 import BeautifulSoup
import urllib
import urllib.request


html = "http://allrecipes.com/search/results/?ingIncl='chicken', 'cheese', 'egg'&ingExcl='lettuce'&sort=re"
number = 5



def get_recipes(html, number):######## this doesnt work if there are a few ingredients

    html = urllib.request.urlopen(html)

    soup = BeautifulSoup(html, "html.parser")

    num_results = soup.find('span',{'class': 'subtext'}).get_text()
    num_results = str(number) + ' out of ' + num_results #number will have to be changed if less recipes were found than number

    i = 0
    recipe_dict = {}


    for card in soup.find_all('article', {'class':'grid-col--fixed-tiles'}): #gets 2 more than required
        try: 
            info = card.find('a', {'data-internal-referrer-link':'hub recipe'})
            link = info.get('href')
            name = info.get_text()
            recipe_dict[name] = link
            if i > (number - 2): #-2 is temp fix
                break
            else:
                i += 1
        except:
            pass
    print(recipe_dict)
    return recipe_dict

get_recipes(html, number)

Error:

Traceback (most recent call last):
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\Diet Buddy\DB_find_recipes.py", line 39, in <module>
    get_recipes(html, number)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\Diet Buddy\DB_find_recipes.py", line 13, in get_recipes
    html = urllib.request.urlopen(html)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 163, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 472, in open
    response = meth(req, response)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 510, in error
    return self._call_chain(*args)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 444, in _call_chain
    result = func(*args)
  File "C:\Users\bakat\AppData\Local\Programs\Python\Python35-32\lib\urllib\request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

1 Answer:

Answer 0: (score: 0)

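This looks like a simple typo: the URL in your code wraps each ingredient in quotes and separates them with spaces, while the URL you posted in the question does not. Compare the string from your code (first) with the plain comma-separated form (second):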

html = "http://allrecipes.com/search/results/?ingIncl='chicken', 'cheese', 'egg'&ingExcl='lettuce'&sort=re"

html = "http://allrecipes.com/search/results/?ingIncl=chicken,cheese,egg&ingExcl=lettuce&sort=re"
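To avoid hand-typing the query string, one option is to build it from lists of ingredients. The following is only a minimal sketch: `build_search_url` and `BASE_URL` are hypothetical names not in the original code, and it assumes the site expects unquoted, comma-separated values in `ingIncl` and `ingExcl`, as in the second URL above.

```python
from urllib.parse import urlencode

# Assumed base address, taken from the URLs in the question.
BASE_URL = "http://allrecipes.com/search/results/"


def build_search_url(include, exclude, sort="re"):
    """Hypothetical helper: build a search URL from ingredient lists."""
    # A list of pairs keeps the parameter order fixed on every Python version.
    params = [
        ("ingIncl", ",".join(include)),
        ("ingExcl", ",".join(exclude)),
        ("sort", sort),
    ]
    # safe="," leaves commas unescaped; spaces and quotes are still encoded,
    # so the resulting URL never contains raw spaces.
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_search_url(["chicken", "cheese", "egg"], ["lettuce"])
print(url)
```

The resulting `url` can then be passed to `get_recipes(url, number)` in place of the hand-written string. Whether the server accepts every ingredient combination is not something this sketch can guarantee.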