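BeautifulSoup: object of type 'Response' has no len()

Asked: 2016-04-19 05:16:39

Tags: python html parsing web-scraping beautifulsoup

Question: When I try to run my script, BeautifulSoup(html, ...) gives the error message "TypeError: object of type 'Response' has no len()". I tried passing the actual HTML as the argument, but it still doesn't work.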

import requests
from bs4 import BeautifulSoup

url = 'http://vineoftheday.com/?order_by=rating'
response = requests.get(url)
html = response.content

soup = BeautifulSoup(html, "html.parser")

8 answers:

Answer 0 (score: 25)

You are getting response.content, which returns the response body as bytes (docs). You should pass a str to the BeautifulSoup constructor instead (docs), so use response.text rather than response.content.
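To illustrate where the message comes from: the error text shows that len() was called on the object handed to BeautifulSoup, and a Response object does not define __len__. The sketch below reproduces this with a hypothetical stand-in class (FakeResponse is not part of requests; it only mimics the .content and .text attributes), so it runs without network access or third-party packages.

```python
class FakeResponse:
    """Hypothetical stand-in for requests.Response (illustration only)."""

    def __init__(self, body: bytes, encoding: str = "utf-8"):
        self.content = body        # raw bytes, like Response.content
        self.encoding = encoding

    @property
    def text(self) -> str:
        # Decoded str, like Response.text
        return self.content.decode(self.encoding)


response = FakeResponse(b"<html><title>Demo</title></html>")

# The object itself has no __len__, so len() raises TypeError.
try:
    len(response)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# The bytes body and the decoded text both support len().
content_length = len(response.content)
text_length = len(response.text)
print(type(response.content).__name__, type(response.text).__name__)
```

With a real requests.Response the same distinction applies: pass response.text (or response.content) to the parser, not the response object itself.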

Answer 1 (score: 15)

Try passing the HTML text directly (assuming html holds the Response object returned by requests.get):

soup = BeautifulSoup(html.text)

Answer 2 (score: 6)

Passing "html.parser" explicitly avoids the warning BeautifulSoup emits when no parser is specified:

soup = BeautifulSoup(html.text, "html.parser")

Answer 3 (score: 0)

If you fetch the HTML with requests.get('https://example.com'), you should use requests.get('https://example.com').text

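Answer 4 (score: 0)

The variable "response" holds a Response object (printing it shows only the status code, e.g. <Response [200]>), not the HTML. It is also a good idea to always send browser-like headers; otherwise you may run into many problems.

You can find the User-Agent header in your browser's developer tools, in the Network tab under "Headers".

Try: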

import requests
from bs4 import BeautifulSoup

url = 'http://www.google.com'
# Example browser User-Agent string; copy the one your own browser sends
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) '
                         'Chrome/71.0.3578.98 Safari/537.36'}

# .text returns the decoded HTML as a str
response = requests.get(url, headers=headers).text

soup = BeautifulSoup(response, 'html.parser')
print(soup.prettify())

Answer 5 (score: 0)

This worked for me:

soup = BeautifulSoup(requests.get("your_url").text)

Nowadays the code below is better (it uses the lxml parser, which must be installed separately):

import requests
from bs4 import BeautifulSoup

soup = BeautifulSoup(requests.get("your_url").text, 'lxml')

Answer 6 (score: 0)

You should use .text to get the response content:

import requests
url = 'http://www ... '
response = requests.get(url)
print(response.text)

Or use it with BeautifulSoup:

import requests
from bs4 import BeautifulSoup

url = 'http://www ... '
response = requests.get(url)
msg = response.text
print(BeautifulSoup(msg,'html.parser'))

Answer 7 (score: 0)

import requests
from bs4 import BeautifulSoup

url = "https://fortnitetracker.com/profile/all/DakshRungta123"
# Pass the decoded HTML text, not the Response object itself
html = requests.get(url).text

soup = BeautifulSoup(html, "html.parser")

# soup.title is the <title> tag, or None if the page has no title
title = soup.title
print(title.text if title else "")