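Hi all, I'm scraping some data from eBay and everything works, but I want to strip the extra text from the result. For example, I get
$10.99 to $13.69
and I only want
$13.69
This is the HTML: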
<span class="s-item__price">
"$10.99"
<span class="DEFAULT"> to </span>
"$13.69"</span>
This is the Python code I'm using:
find(class_='s-item__price').text
Answer 0 (score: 1)
Here you go: slice the string from the last '$' onward.
s = '$10.99 to $13.69'
val = s[s.rfind('$'):]
print(val)
Output
$13.69
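The same idea can be wrapped in a small function so it also copes with listings that show a single price. This is only a minimal sketch and not part of the answer above: the name last_price is hypothetical, and it assumes every price starts with a '$' sign. When no '$' is present, it simply returns the stripped input.

```python
def last_price(text: str) -> str:
    """Return the substring that starts at the last '$' in text.

    Hypothetical helper; assumes prices are prefixed with '$'.
    """
    idx = text.rfind('$')
    if idx == -1:
        # No dollar sign found: fall back to the stripped input.
        return text.strip()
    return text[idx:].strip()


print(last_price('$10.99 to $13.69'))  # $13.69
print(last_price('$13.69'))            # $13.69
```

With the question's code, this could be called as last_price(item.find(class_='s-item__price').text), where item is whatever element you are already searching in.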
Answer 1 (score: 0)
Use the .stripped_strings attribute to get all the text nodes inside the price span, then take the last one.
https://www.crummy.com/software/BeautifulSoup/bs4/doc/#strings-and-stripped-strings
from bs4 import BeautifulSoup
soup = BeautifulSoup('''
<span class="s-item__price">
$10.99
<span class="DEFAULT"> to </span>
$13.69
</span>
''', 'html.parser')  # text nodes written without the quotes that browser dev tools display
price_el = soup.select_one('.s-item__price')
strings = [*price_el.stripped_strings]
print(strings[-1])
Output:
$13.69
Answer 2 (score: 0)
You can use this to capture all the prices, then take the last match:
import re
def findAllPrices(content):
    # Match a dollar sign followed by digits, a dot, and more digits.
    return re.findall(r'\$\d+\.\d+', content)
findAllPrices("""<span class="s-item__price">
"$10.99"
<span class="DEFAULT"> to </span>
"$13.69"</span>""")[-1]
'$13.69'
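If the price is needed as a number rather than as text, the regex matches can be converted before picking one. The following is a sketch under stated assumptions, not part of the answer above: the names PRICE_RE and highest_price are hypothetical, the pattern is widened to accept thousands separators such as $1,299.00, and it picks the largest amount instead of relying on the order of the matches.

```python
import re
from decimal import Decimal

# Assumed pattern: '$', digits with optional commas, a dot, then decimals.
PRICE_RE = re.compile(r'\$(\d[\d,]*\.\d+)')


def highest_price(content):
    """Return the largest dollar amount in content as a Decimal, or None."""
    amounts = [Decimal(m.replace(',', '')) for m in PRICE_RE.findall(content)]
    return max(amounts) if amounts else None


print(highest_price('"$10.99" to "$13.69"'))  # 13.69
```

Decimal is used instead of float so that the amounts keep their exact two-decimal representation.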
Answer 3 (score: 0)
You already have a string, so you can use string methods to get the value:
result = "$10.99 to $13.69".split(" to ")[-1]
print(result)
Or you can find the element with class=DEFAULT and take its next_sibling:
from bs4 import BeautifulSoup as BS
data ='''<span class="s-item__price">
"$10.99"
<span class="DEFAULT"> to </span>
"$13.69"</span>'''
soup = BS(data, 'html.parser')
item = soup.find('span', class_="DEFAULT")
result = item.next_sibling
result = result.strip().strip('"')  # drop whitespace and the literal quotes in this sample
print(result)