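I just received an email with a nice explanation of Python web scraping using the lxml library, so I wanted to test the results. I tried to print something inside "for info in zip(titles, prices, tags, total_platforms):", but nothing prints inside this loop, not even type(object). The resp dictionary never holds any data, and the debugger in PyCharm simply skips this block. I tried printing output, but it shows as empty. Can someone explain this behavior to me?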
import requests
import lxml.html
html = requests.get('https://store.steampowered.com/explore/new/')
doc = lxml.html.fromstring(html.content)
new_releases = doc.xpath('//div[@id="tab_newreleases_content"]')[0]
titles = new_releases.xpath('.//div[@class="tab_item_name"]/text()')
prices = new_releases.xpath('.//div[@class="discount_final_price"]/text()')
tags = [tag.text_content() for tag in new_releases.xpath('.//div[@class="tab_item_top_tags"]')]
tags = [tag.split(', ') for tag in tags]
platforms_div = new_releases.xpath('.//div[@class="tab_item_details"]')
total_platforms = []
for game in platforms_div:
    temp = game.xpath('.//span[contains(@class, "platform_img")]')
    platforms = [t.get('class').split(' ')[-1] for t in temp]
    if 'hmd_separator' in platforms:
        platforms.remove('hmd_separator')
    total_platforms.append(platforms)
output = []
for info in zip(titles, prices, tags, total_platforms):
    resp = {}
    resp['title'] = info[0]
    resp['price'] = info[1]
    resp['tags'] = info[2]
    resp['platforms'] = info[3]
    output.append(resp)
src:https://medium.freecodecamp.org/an-intro-to-web-scraping-with-lxml-and-python-b02b7a3f3098
Answer 0 (score: 0)
This is because the zip object is an iterator.
Once you have looped over all the elements of an iterator, it is exhausted and appears empty. So replace the assignments to resp with print(info):
for info in zip(titles, prices, tags, total_platforms):
    print(info)
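To make the iterator behavior concrete, here is a minimal sketch that does not touch the network. The sample lists are hypothetical stand-ins for the scraped data, not values from the Steam page. It shows that a zip object can be consumed only once in Python 3. It also shows a related point that the answer above does not state and that is offered here only as something worth checking: zip stops at its shortest input, so a single empty list means the loop body never runs.

```python
# Hypothetical sample data standing in for the scraped lists.
titles = ["Game A", "Game B"]
prices = ["$9.99", "$19.99"]

pairs = zip(titles, prices)   # zip returns a one-shot iterator in Python 3

first_pass = list(pairs)      # consumes every element
second_pass = list(pairs)     # the iterator is already exhausted

print(first_pass)             # [('Game A', '$9.99'), ('Game B', '$19.99')]
print(second_pass)            # []

# zip stops at its shortest input, so an empty list yields zero iterations.
no_pairs = list(zip(titles, []))
print(no_pairs)               # []
```

If the loop in the question never executes, printing len(titles), len(prices), len(tags) and len(total_platforms) before the loop may help show whether one of the XPath queries returned nothing.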