I have this HTML (the numbers vary):
<span class="ng-binding">
<b>Total:</b>
68.71€ (459 items)
</span>
From this, I want to extract 68.71€ (459 items).
So far I have tried the code below, simply copying the XPath that Google Chrome shows for the span:
import urllib.request
from lxml import html
import os
ids = ["ftpstorage1-730",
       "ftpstorage2-730",
       "ftpstorage3-730"]
for id in ids:
    url = 'http://steam.tools/itemvalue/#/'+id
    with urllib.request.urlopen(url) as response:
        site = response.read()
    tree = html.fromstring(site)
    data = tree.xpath('//*[@id="container"]/div[5]/span[1]/text()')
    print(data)
In theory this should work, but it doesn't. All I get in data is:
[" {{(items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'price':'count':
e}}\n\t\t\t\t({{items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'count
[" {{(items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'price':'count':
e}}\n\t\t\t\t({{items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'count
[" {{(items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'price':'count':
e}}\n\t\t\t\t({{items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'count
Any idea what I'm doing wrong?
Is it because the numbers are generated dynamically rather than being static?
If so, how can I extract them?
Answer 0 (score: 2)
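What you see on the console is the unrendered HTML, still containing the AngularJS binding placeholders. You need a real browser to execute the JavaScript and let Angular fill the placeholders with the actual values.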
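As a quick way to confirm this diagnosis, here is a minimal sketch that checks whether scraped text still contains `{{ ... }}` interpolation markers. The helper name `has_unrendered_bindings` and both sample strings are made up for illustration; the raw sample is a shortened form of the output shown in the question.

```python
import re

# Matches AngularJS-style interpolation markers such as {{ expr }}.
PLACEHOLDER = re.compile(r"\{\{.*?\}\}", re.DOTALL)


def has_unrendered_bindings(text):
    """Return True if the text still contains {{ ... }} template markers."""
    return PLACEHOLDER.search(text) is not None


# Hypothetical samples: a raw template fragment and a rendered value.
raw = "({{items | filter:filterText | sumByKey:'count'}} items)"
rendered = "68.71€ (459 items)"

print(has_unrendered_bindings(raw))       # True
print(has_unrendered_bindings(rendered))  # False
```

If the check returns True for the text you scraped, the page was fetched before any JavaScript ran, and no XPath will find the final numbers in it.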
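Alternatively, if you look more closely at how the total price is retrieved and calculated, you can solve this without a real browser. Make a GET request to the http://item-value10.appspot.com/ParseInv endpoint with the id and app parameters, parse the JSON response, and compute the total price, taking each item's count into account: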
import requests
# Endpoint that returns the inventory data as JSON.
template_url = "http://item-value10.appspot.com/ParseInv"
ids = ["ftpstorage1-730", "ftpstorage2-730", "ftpstorage3-730"]
for id in ids:
    with requests.Session() as session:
        # Visit the page first, as a browser would.
        session.get('http://steam.tools/itemvalue/#/' + id)
        # "ftpstorage1-730" -> storage "ftpstorage1", app "730"
        storage, app = id.split("-")
        response = session.get(template_url, params={
            "id": storage,
            "app": app
        }, headers={
            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36",
            "Referer": "http://steam.tools/itemvalue/"
        })
        data = response.json()
        # Total value = sum of price * count over all items.
        total = sum(float(item["price"]) * int(item["count"]) for item in data["items"])
        print(total)
Prints:
20.439999999999998
78.16
0
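The first value, 20.439999999999998, is a binary floating-point rounding artifact. A minimal sketch of one way to avoid it is to sum with `decimal.Decimal` and format the result like the page does. The `summarize` helper and the sample data are hypothetical; the only thing taken from the code above is that each entry in `items` has `price` and `count` keys.

```python
from decimal import Decimal


def summarize(items):
    """Return (total_price, total_count) for a list of item dicts."""
    # Going through str() handles prices given as strings or as numbers.
    total = sum(
        (Decimal(str(item["price"])) * int(item["count"]) for item in items),
        Decimal("0"),
    )
    count = sum(int(item["count"]) for item in items)
    return total, count


# Hypothetical sample data in the same shape as data["items"].
sample = {"items": [{"price": "10.22", "count": 2},
                    {"price": 0.05, "count": 3}]}

total, count = summarize(sample["items"])
print(f"{total:.2f}€ ({count} items)")  # 20.59€ (5 items)
```

In the loop above, `summarize(data["items"])` could stand in for the `float`-based sum, assuming the response has the same shape.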