Question

I want to remove all div tags from my data before storing it in an Excel file. For example:

seven_days = soup.find('div', attrs={'class': 'price'})
name = seven_days.text.strip()

This only handles a single div. I want something like this:

seven_days = soup.find_all('div', attrs={'class': 'price'})

But then I can't strip the opening and closing div tags.

Answer 1

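Your requirement isn't entirely clear to me, but I think you can use find_all() (spelled findAll() in older BeautifulSoup versions) and take the text of each result: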

from bs4 import BeautifulSoup  # BeautifulSoup 4; the old "from BeautifulSoup import ..." is version 3


raw_text = "text_with_div_tags"  # placeholder for the scraped HTML
soup = BeautifulSoup(raw_text, 'html.parser')  # name the parser explicitly
seven_days = []  # For debugging only
names = []
for div_tag in soup.find_all('div', attrs={'class': 'price'}):
    seven_days.append(div_tag)  # For debugging only
    names.append(div_tag.text.strip())

# The names list holds the text content without the div tags.
for name in names:
    print(name)

Answer 2

If by "remove" you mean extracting the text, you can use this (I assume you want all the text in a list called name):

name = [x.text.strip() for x in soup.find_all('div', {'class':'price'})]
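
As a rough illustration of why the per-element loop is needed: find_all() returns a list-like ResultSet rather than a single tag, so the text has to be read from each element in turn. The sketch below assumes BeautifulSoup 4 (the bs4 package) is installed; the sample HTML and the variable names html and prices are made up for the example and are not from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the scraped page.
html = """
<div class="price"> 10.50 </div>
<div class="price">7.25</div>
<div class="other">ignore me</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns many tags, so read the text from each one individually.
# get_text(strip=True) returns the tag's text with surrounding whitespace removed.
prices = [div.get_text(strip=True) for div in soup.find_all("div", class_="price")]

print(prices)
```

Only the divs with class price are matched, and each list entry is a plain string with no markup left in it.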

Answer 3

Try this code:

seven_days = [i.text.strip() for i in soup.find_all('div', attrs={'class': 'price'})]
print(seven_days)
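
Since the question mentions saving the result for Excel, here is one possible way to write such a list out, sketched with the standard library csv module. It assumes a CSV file that Excel can open is acceptable; a real .xlsx file would need a third-party library instead. The file name prices.csv, the header price, and the sample values are all hypothetical.

```python
import csv

# Hypothetical values, standing in for the list built from the price divs.
seven_days = ["10.50", "7.25", "3.99"]

# newline="" keeps the csv module from emitting extra blank rows on Windows.
with open("prices.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["price"])  # header row
    writer.writerows([value] for value in seven_days)  # one value per row
```

Each extracted string ends up in its own row of a single column, with no div markup in the file.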

Removing all divs from data extracted by web scraping