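XPath to Excel fails with a value-unpacking error

Time: 2018-09-21 21:13:29

Tags: python excel

I am trying to write a Python script that scrapes text from a website and writes it to an Excel file. I can request the data, but getting it into Excel is giving me some trouble.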

from lxml import html
import requests
import xlsxwriter
import datetime

now = datetime.datetime.today().strftime('%Y-%m-%d')

page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
tree = html.fromstring(page.content)

#This will create a list of buyers
buyers = tree.xpath('//div[@title="buyer-name"]/text()')

#This will create a list of prices
prices = tree.xpath('//span[@class="item-price"]/text()')

print( 'Buyers: ', buyers)
print( 'Prices: ', prices)

expenses = (buyers, prices)

#creating excel sheet
workbook = xlsxwriter.Workbook('sales' + str(now) + '.xlsx')
worksheet = workbook.add_worksheet()

# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0


#write data to excel
for item, cost in (expenses):
    worksheet.write(row, col,     item)
    worksheet.write(row, col + 1, cost)
    row += 1


workbook.close()

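This returns:

Traceback (most recent call last):
  File "wRequests.py", line 32, in <module>
    for item, cost in (expenses):
ValueError: too many values to unpack (expected 2)

How do I unpack these values and load them into Excel correctly?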

1 Answer:

Answer 0: (score: 0)

Try the following; it should do what you need:

from lxml import html
import requests
import xlsxwriter
import datetime


now = datetime.datetime.today().strftime('%Y-%m-%d')

page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
tree = html.fromstring(page.content)

#This will create a list of buyers
buyers = tree.xpath('//div[@title="buyer-name"]/text()')

#This will create a list of prices
prices = tree.xpath('//span[@class="item-price"]/text()')

print( 'Buyers: ', buyers)
print( 'Prices: ', prices)

#creating excel sheet
workbook = xlsxwriter.Workbook('sales' + str(now) + '.xlsx')
worksheet = workbook.add_worksheet()

# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0

#write data to excel
for index, buyer in enumerate(buyers):
    worksheet.write(row, col,     buyer)
    worksheet.write(row, col + 1, prices[index])
    row += 1

workbook.close()

Explanation

Right now, your expenses variable is a tuple containing two lists. Its structure is

(
    ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes'],
    ['$29.95', '$8.37', '$15.26']
)

A for loop does not work the way you are using it on this data. Its basic syntax is:

for item in collection:
    print(item)

But the way you are using it is

for item, cost in (expenses):
    print(item, cost)

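When you run the loop this way, you do not get item and cost as separate values. Each iteration yields a single element of the expenses tuple, which is a whole list, and Python then tries to unpack that list into item and cost. Because the list contains more than two values, the unpacking fails with the ValueError you saw. The syntax itself is valid; the problem is the shape of the data.

Now, if you run it like this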

for single_item in (expenses):
    print(single_item)

the output will be:

['Carson Busses', 'Earl E. Byrd', 'Patty Cakes'] # output for first iteration
['$29.95', '$8.37', '$15.26'] # output for second iteration

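As you can see, item and cost never arrive together. The first iteration yields the entire first list (your items), and the second iteration yields the entire second list (your costs).

After inspecting the buyers and prices variables, I found that they contain the same number of items, and that each item in one list corresponds to the item at the same index in the other list. So I can pick either list, iterate over it with an index, and use that index to look up the matching item in the other list. It is that simple. Like this: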
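To make the source of the error concrete, here is a minimal sketch that repeats the failing unpacking by hand. It assumes the three sample values shown above rather than the real page contents, and the errors list is only there for illustration.

```python
# Sample data copied from the structure shown above (not the live page).
buyers = ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes']
prices = ['$29.95', '$8.37', '$15.26']
expenses = (buyers, prices)

errors = []  # illustrative: collects the message raised on each iteration
for element in expenses:
    try:
        # This is the same unpacking that "for item, cost in expenses" performs:
        # each element is a whole list, which does not fit into two names.
        item, cost = element
    except ValueError as exc:
        errors.append(str(exc))

print(len(errors))  # one failure per element of the tuple
print(errors[0])
```

Both iterations fail, because each element of the tuple is a list with more than two entries.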

for index, buyer in enumerate(buyers):
    print(buyer, prices[index])

It will then print

Carson Busses $29.95
Earl E. Byrd $8.37
Patty Cakes $15.26

I could also write the for loop as

for index, buyer in enumerate(buyers):
    print(buyers[index], prices[index])

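This gives the same output. Note that Python's built-in enumerate function yields an (index, value) pair for each item in the list.

If you also have other data, such as customers, you can look up the matching customer by the same index, as shown here: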
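As an alternative to indexing, which the answer itself does not use, Python's built-in zip pairs the two lists up directly. The sketch below assumes the same three sample values and adds an explicit length check, since zip silently stops at the shorter list.

```python
# Sample data copied from the structure shown above (not the live page).
buyers = ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes']
prices = ['$29.95', '$8.37', '$15.26']

# zip() truncates to the shortest input, so verify the lists line up first.
if len(buyers) != len(prices):
    raise ValueError('buyers and prices have different lengths')

# Each element of pairs is a (buyer, price) tuple, so two-name unpacking works.
pairs = list(zip(buyers, prices))
for buyer, price in pairs:
    print(buyer, price)
```

Here the loop header unpacks correctly because every element really does contain exactly two values.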

for index, buyer in enumerate(buyers):
    print(buyer, prices[index], customers[index])
    # or you can also do this
    print(buyers[index], prices[index], customers[index])
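The same idea can be wrapped in a small helper that writes any number of parallel lists side by side. This is only a sketch: write_columns and RecordingSheet are hypothetical names introduced here, the customers values are made up, and RecordingSheet merely stands in for a real worksheet so the example runs without creating a file.

```python
class RecordingSheet:
    """Hypothetical stand-in for a worksheet: it only records write() calls."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


def write_columns(worksheet, *columns, start_row=0, start_col=0):
    """Write parallel lists as adjacent columns; return the number of rows written."""
    if len({len(column) for column in columns}) > 1:
        raise ValueError('all columns must have the same length')
    rows = list(zip(*columns))  # one tuple per spreadsheet row
    for row_offset, values in enumerate(rows):
        for col_offset, value in enumerate(values):
            worksheet.write(start_row + row_offset, start_col + col_offset, value)
    return len(rows)


buyers = ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes']
prices = ['$29.95', '$8.37', '$15.26']
customers = ['customer A', 'customer B', 'customer C']  # made-up sample data

sheet = RecordingSheet()
written = write_columns(sheet, buyers, prices, customers)
print(written, sheet.cells[(1, 1)])
```

Because the helper only calls write(row, col, value), passing the worksheet returned by workbook.add_worksheet() in place of RecordingSheet should work the same way with xlsxwriter.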

Hope this helps.