Question

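I am learning the re and BeautifulSoup modules, and I have a question about a few lines of the code below. I don't understand what group() does, or what contents[] means and what goes inside the brackets.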

# Python 2 code (urllib2 and the print statement)
from bs4 import BeautifulSoup
import urllib2
import re

url = 'http://www.ebay.es/itm/LOTE-5-BOTES-CERVEZAARGUS-SET-5-BEER-CANSLOT-5-CANETTES-BIRES-LATTINE-BIRRA-/321162173293'  #raw_input('URL: ')
code = urllib2.urlopen(url).read()
soup = BeautifulSoup(code)
# First child of the <span id="v4-27"> tag
tag = soup.find('span', id='v4-27').contents[0]

# Raw string so that \d reaches the regex engine unchanged
price_string = re.search(r'(\d+,\d+)', tag).group(1)
# Decimal comma to decimal point before converting
precio_final = float(price_string.replace(',', '.'))

print precio_final

Answer 1

.contents returns a list of the tag's children (text nodes and nested tags). For example:

>>> from bs4 import BeautifulSoup as BS
>>> soup = BS('<span class="foo"> bar baz <a href="http://foo.com">link</a></span>')
>>> print soup.find('span').contents
[u' bar baz ', <a href="http://foo.com">link</a>]

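[0] accesses the first element of the list returned by .contents. In the example above it returns u' bar baz ', the text node before the link, including its surrounding spaces.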
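To make the indexing concrete, here is a minimal sketch on a made-up snippet shaped like the price tag in the question. The HTML string and the price in it are invented for illustration (the real eBay markup is not shown in the question), and the code assumes Python 3 with bs4 installed and the built-in html.parser.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page is only assumed to look roughly like this.
html = '<span id="v4-27">EUR 12,50 <b>approx.</b></span>'
soup = BeautifulSoup(html, 'html.parser')

span = soup.find('span', id='v4-27')
children = span.contents      # list of the tag's direct children
first_child = children[0]     # the text node that comes before <b>
second_child = children[1]    # the nested <b> tag

print(len(children))
print(repr(first_child))
print(second_child.name)
```

The number inside the brackets is an ordinary list index, so contents[0] only makes sense when the tag has at least one child; an empty tag would raise IndexError.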

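.group(1) returns the text matched by the first parenthesized group in the regular expression. Group numbering for parentheses starts at 1; .group(0) is the entire match. In this regex the parentheses wrap the whole pattern, so .group(1) is the complete number of the form n1,n2 (digits, a comma, digits), which is the same text as .group(0).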
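The difference between group(0) and numbered groups is easier to see side by side. This is a small sketch using only the standard re module; the input string 'EUR 12,50' and the second, two-group pattern are assumptions added for illustration and do not come from the question.

```python
import re

text = 'EUR 12,50'  # hypothetical price text

# Pattern from the question: one group wrapping the whole pattern.
m = re.search(r'(\d+,\d+)', text)
whole_match = m.group(0)   # entire match
first_group = m.group(1)   # same text, because the group spans the whole pattern

# Illustrative variant with two groups, one on each side of the comma.
m2 = re.search(r'(\d+),(\d+)', text)
euros = m2.group(1)
cents = m2.group(2)

# Same conversion as in the question: decimal comma to decimal point.
price = float(first_group.replace(',', '.'))

print(whole_match, first_group, euros, cents, price)
```

Note that re.search returns None when nothing matches, so calling .group() directly on the result raises AttributeError for text without a price in it.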
