这是我的代码我想从网站上删除一个单词列表, 但是当我在
上调用.string时import requests
from bs4 import BeautifulSoup
url = "https://www.merriam-webster.com/browse/thesaurus/a"
source_code = requests.get(url)
plain_text = source_code.text
soup = BeautifulSoup(plain_text, "html.parser")
entry_view = soup.find_all('div', {'class': 'entries'})
view = entry_view[0]
list = view.ul
for m in list:
for x in m:
title = x.string
print(title)
我想要的是从网站上打印文本的列表,但我得到的是错误
Traceback (most recent call last):
File "/home/vidu/PycharmProjects/untitled/hello.py", line 14, in <module>
title = x.string
AttributeError: 'str' object has no attribute 'string'
Error in sys.excepthook:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
from apport.fileutils import likely_packaged, get_recent_crashes
File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
from apport.report import Report
File "/usr/lib/python3/dist-packages/apport/report.py", line 30, in <module>
import apport.fileutils
File "/usr/lib/python3/dist-packages/apport/fileutils.py", line 23, in <module>
from apport.packaging_impl import impl as packaging
File "/usr/lib/python3/dist-packages/apport/packaging_impl.py", line 23, in <module>
import apt
File "/usr/lib/python3/dist-packages/apt/__init__.py", line 23, in <module>
import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'
Original exception was:
Traceback (most recent call last):
File "/home/vidu/PycharmProjects/untitled/hello.py", line 14, in <module>
title = x.string
AttributeError: 'str' object has no attribute 'string'
答案 0 :(得分:3)
您可以使用以下代码来实现您的目标。
<强>代码:强>
import requests
from bs4 import BeautifulSoup
url = "https://www.merriam-webster.com/browse/thesaurus/a"
html_source = requests.get(url).text
soup = BeautifulSoup(html_source, "html.parser")
entry_view = soup.find_all('div', {'class': 'entries'})
entries = []
for elem in entry_view:
for e in elem.find_all('a'):
entries.append(e.text)
#show only 5 elements and whole list length
print(entries[:5])
print(entries[-5:])
print(len(entries))
<强>输出:强>
['A1', 'aback', 'abaft', 'abandon', 'abandoned']
['absorbing', 'absorption', 'abstainer', 'abstain from', 'abstemious']
100
在您的代码中
print(type(list))
<class 'bs4.element.Tag'>
print(type(m))
<class 'bs4.element.NavigableString'>
print(type(x))
<class 'str'>
因此,正如您所看到的,变量x
已经是一个字符串,因此使用bs4 method .string()
是没有意义的。
p.s。:您不应该使用list
之类的变量名称,它是一个保留的关键字。
答案 1 :(得分:1)
AttributeError:'str'对象没有属性'string'
这告诉你该对象已经是一个字符串。尝试删除它,它应该工作。
它还告诉您字符串数据类型的正确语法是str
而不是string
。
另外一件事就是你使用title = str(x)
转换,但因为在这种情况下它已经是一个字符串,所以它是多余的。
引用Google:
Python有一个名为“str”的内置字符串类,有许多方便的功能(有一个名为“string”的旧模块,你不应该使用它)