I get an error while trying to extract some text from this site and remove the escape characters from it. Any help would be appreciated.
The error also applies to the second loop. I am using strip to remove the escape characters. I want to append ['We prepare the saddle, and the goat presents itself; is it a burden for the lineage of goats?','You have been crowned a king, and yet you make good-luck charms; would you be crowned God?','We lift a saddle and the goat (kin) scowls; it is no burden for a sheep.'] to one list, and append the text in the brackets to another list.
html = '''
<p xmlns:ino="http://namespaces.softwareag.com/tamino/response2" xmlns:xq="http://namespaces.softwareag.com/tamino/XQuery/result" xmlns:xql="http://metalab.unc.edu/xql/">
A di gàárì sílẹ̀ ewúrẹ́ ńyọjú; ẹrù ìran rẹ̀ ni?<br>
We prepare the saddle, and the goat presents itself; is it a burden for the lineage of goats?<br>
(Goats that know their place do not offer their backs to be saddled.)<br>
This is a variant of A gbé gàárì ọmọ ewúrẹ́ ńrojú . . .<br>
</p>
<p xmlns:ino="http://namespaces.softwareag.com/tamino/response2" xmlns:xq="http://namespaces.softwareag.com/tamino/XQuery/result" xmlns:xql="http://metalab.unc.edu/xql/">
A fi ọ́ jọba ò ńṣàwúre o fẹ́ jẹ Ọlọ́run ni?<br>
You have been crowned a king, and yet you make good-luck charms; would you be crowned God?<br>
(Being crowned a king is about the best fortune a mortal could hope for.)<br>
</p>
<p xmlns:ino="http://namespaces.softwareag.com/tamino/response2" xmlns:xq="http://namespaces.softwareag.com/tamino/XQuery/result" xmlns:xql="http://metalab.unc.edu/xql/">
A fijó gba Awà; a fìjà gba Awà; bí a ò bá jó, bí a ò bá jà, bí a bá ti gba Awà, kò tán bí?<br>
By dancing we take possession of Awà; through fighting we take possession of Awà; if we neither dance nor fight, but take possession of Awà anyway, is the result not the same?<br>
(Why make a huge production of a matter that is easily taken care of?)<br>
</p>
<p xmlns:ino="http://namespaces.softwareag.com/tamino/response2" xmlns:xq="http://namespaces.softwareag.com/tamino/XQuery/result" xmlns:xql="http://metalab.unc.edu/xql/">
A gbé gàárì ọmọ ewurẹ ńrojú; kì í ṣe ẹrù àgùntàn.<br>
We lift a saddle and the goat (kin) scowls; it is no burden for a sheep.<br>
(The goat has no cause to scowl, because no one will condescend to ride it anyway.)<br>
This is a variant of A di gàárì sílẹ̀ . . .<br>
</p>'''
from bs4 import BeautifulSoup

# html already holds the page markup as a string, so parse it directly
soup = BeautifulSoup(html, 'html.parser')
edu = {'Yoruba': [], 'Translation': [], 'Meaning': []}
for i in range(0, 220):
    # first loop
    for br in soup.select('p > br:nth-of-type(2)'):
        text = br.previous_sibling
        edu['Translation'].append(text.strip())
    # second loop
    for br in soup.select('p > br:nth-of-type(3)'):
        text = br.previous_sibling.strip()
        edu['Meaning'].append(text)
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-186-6a48c8c3b845> in <module>
      9     for br in soup8.select('p > br:nth-of-type(2)'):
     10         text = br.previous_sibling
---> 11         edu['Translation'].append(text.strip())
     12     # third loop
     13     for br in soup8.select('p > br:nth-of-type(3)'):
Answer 0 (score: 2)
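append has no return statement, so it returns None. Call strip() on the value being inserted instead, and only when that value is a str, because previous_sibling can also be a tag or None: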
for i in range(0, 220):
    # first loop
    for br in soup.select('p > br:nth-of-type(2)'):
        text = br.previous_sibling
        if isinstance(text, str):
            edu['Translation'].append(text.strip())
    # second loop
    for br in soup.select('p > br:nth-of-type(3)'):
        text = br.previous_sibling
        if isinstance(text, str):
            edu['Meaning'].append(text.strip())
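
To see the two points in isolation, here is a minimal sketch that does not need BeautifulSoup at all. The siblings list and the collect_stripped helper are made up for illustration: they stand in for the values br.previous_sibling might hand back (a text node, a missing sibling, or some non-string object). BeautifulSoup's text nodes (NavigableString) subclass str, so the same isinstance check lets them through.

```python
# Hypothetical stand-ins for what br.previous_sibling may yield:
# a text node (str), a missing sibling (None), or a non-string object.
siblings = [
    "\nWe prepare the saddle, and the goat presents itself\n",
    None,
    "\n(Goats that know their place.)\n",
    object(),
]


def collect_stripped(values):
    """Return the stripped form of every str in values, skipping the rest."""
    cleaned = []
    for value in values:
        # Guard first: None and non-string objects have no usable strip().
        if isinstance(value, str):
            cleaned.append(value.strip())
    return cleaned


result = collect_stripped(siblings)

# list.append mutates the list in place and returns None,
# so nothing can be chained onto its return value.
append_return = [].append("x")

print(result)
print(append_return)
```

Stripping the value before it goes into the list, rather than trying to act on what append returns, keeps each step working on a real string.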