Can't sum the tag contents after finding them

Time: 2019-01-17 23:05:57

Tags: python beautifulsoup

I'm trying to extract the contents of the tags on an HTML page and sum them (they come back as strings). This is the code so far:

import urllib.request, urllib.parse, urllib.error
from bs4 import BeautifulSoup

url = input('Enter- ')
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the span tags
tags = soup('span')
for tag in tags:
   # Look at the parts of a tag
   print('Sum of Contents:',sum(int(tag.contents[0])))

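Without sum(int()), it correctly prints the values as strings, but I'm trying to convert the strings to integers and add them up. I assume I've messed up something very basic? Here is the output when I only print the contents: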

Contents: 97
Contents: 97
Contents: 90
Contents: 90
Contents: 88
Contents: 87
Contents: 87
Contents: 80
Contents: 79
Contents: 79
Contents: 78
Contents: 76
Contents: 76
Contents: 72
Contents: 72
Contents: 66
Contents: 66
Contents: 65
Contents: 65
Contents: 64
Contents: 61
Contents: 61
Contents: 59
Contents: 58
Contents: 57
Contents: 57
Contents: 54
Contents: 51
Contents: 49
Contents: 47
Contents: 40
Contents: 38
Contents: 37
Contents: 36
Contents: 36
Contents: 32
Contents: 25
Contents: 24
Contents: 22
Contents: 21
Contents: 19
Contents: 18
Contents: 18
Contents: 14
Contents: 12
Contents: 12
Contents: 9
Contents: 7
Contents: 3
Contents: 2

1 answer:

Answer 0 (score: 3)

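sum() expects an iterable of numbers, but inside the loop it is handed a single integer, which raises a TypeError. Try using a list comprehension to collect all the integers first, then sum them: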

summation = sum([int(tag.contents[0]) for tag in tags])
print('Sum of Contents:',summation)
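
A minimal sketch of the difference, using plain strings in place of the tag contents. The list `values` is a made-up stand-in for what `tag.contents[0]` would return, not data from the page:

```python
# Hypothetical stand-in for the strings that tag.contents[0] would yield.
values = ['97', '90', '88']

# Calling sum() on a single int fails, because an int is not iterable.
try:
    sum(int(values[0]))
    error = None
except TypeError as exc:
    error = exc

print('Error:', error)

# Converting each string first and summing the results works.
total = sum(int(v) for v in values)
print('Sum of Contents:', total)  # 275
```

The generator expression inside `sum()` behaves like the list comprehension but avoids building an intermediate list.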

If you don't want to use a list comprehension, you can build the list in a regular loop instead:

summation = []
for tag in tags:
    summation.append(int(tag.contents[0]))
print('Sum of Contents:', sum(summation))
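
For reference, here is a self-contained sketch that can be run without a URL. The inline `html` string is invented sample markup, and the `isdigit()` check is an extra safeguard that is not part of the original question; it assumes some span tags might hold non-numeric text:

```python
from bs4 import BeautifulSoup

# Made-up sample markup standing in for the downloaded page.
html = '''
<ul>
  <li><span>97</span></li>
  <li><span>90</span></li>
  <li><span>n/a</span></li>
  <li><span> 3 </span></li>
</ul>
'''

soup = BeautifulSoup(html, 'html.parser')

# Retrieve all of the span tags, as in the question.
tags = soup('span')

total = 0
for tag in tags:
    # get_text(strip=True) returns the tag's text without surrounding whitespace.
    text = tag.get_text(strip=True)
    # Skip anything that is not a plain non-negative integer.
    if text.isdigit():
        total += int(text)

print('Sum of Contents:', total)  # 190
```

Keeping a running total outside the loop prints the sum once, instead of once per tag as in the original loop.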