Question

I wrote a function that takes a text file containing this data:

2012-01-01  09:00   Angel   Men's Clothing  214.05  Amex
2012-01-01  09:00   Ben Women's Clothing    153.57  Visa
2012-01-01  09:00   Charlie Music   66.08   Cash

and converts it into a list of tuples:

Code: myList = [tuple(j.split("\t")) for j in stringX.split("\n")]
Result:
[('2012-01-01', '09:00', 'Angel', "Men's Clothing", '214.05', 'Amex'), 
('2012-01-01', '09:00', 'Ben', "Women's Clothing", '153.57', 'Visa'), 
('2012-01-01', '09:00', 'Charlie', 'Music', '66.08', 'Cash')]

which is then further converted into:

Code: nameList = [(float(item[4]),item[2])for item in myList]
Result: [(214.05, 'Angel'), (153.57, 'Ben'), (66.08, 'Charlie')]

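With that small text file, it works fine. But I have to convert a large text file of over 200 MB, with more than 1 million lines. It manages to convert the file into the list of tuples, but it fails to convert that further into the list of smaller tuples shown above.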

When I run the program with the big file, it gives this error:

File "C:\Users\Charlie\Desktop\PYC\PYTHON ASSIGNMENT\test3.py", line 34, in <listcomp>
nameList = [(float(item[4]),item[2])for item in myList]
IndexError: tuple index out of range

Answer 1

Your list of tuples has an empty entry, which is why you get "IndexError: tuple index out of range".
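A minimal sketch of how such an entry can arise, assuming the file ends with a newline or contains a blank line. The sample string is made up for illustration, and `short_rows` is a hypothetical helper name, not part of the original code:

```python
# Hypothetical sample: two tab-separated rows followed by a trailing newline.
stringX = (
    "2012-01-01\t09:00\tAngel\tMen's Clothing\t214.05\tAmex\n"
    "2012-01-01\t09:00\tBen\tWomen's Clothing\t153.57\tVisa\n"
)

# Same conversion as in the question.
myList = [tuple(j.split("\t")) for j in stringX.split("\n")]

# The trailing newline makes split("\n") yield a final empty string,
# which becomes a one-element tuple that has no index 4.
short_rows = [(i, row) for i, row in enumerate(myList) if len(row) < 5]
print(short_rows)  # [(2, ('',))]
```

Running the same check on the big file's list should point to the row indexes that trigger the error.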

You can add an if condition to check that the tuple contains any values.

Example:

myList = [('2012-01-01', '09:00', 'Angel', "Men's Clothing", '214.05', 'Amex'), 
('2012-01-01', '09:00', 'Ben', "Women's Clothing", '153.57', 'Visa'),
(), 
('2012-01-01', '09:00', 'Charlie', 'Music', '66.08', 'Cash')]


nameList = [(float(item[4]),item[2])for item in myList if item]
print(nameList)
[(214.05, 'Angel'), (153.57, 'Ben'), (66.08, 'Charlie')]
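One caveat: `if item` skips an empty tuple `()`, but a blank line passed through `split("\t")` becomes `('',)`, which is non-empty and therefore still reaches `item[4]`. A sketch that checks the number of fields instead may be more robust. The function name `extract_amounts` and the `min_fields` parameter are assumptions made for illustration:

```python
def extract_amounts(rows, min_fields=5):
    """Return (amount, name) pairs, skipping rows with too few fields."""
    return [(float(row[4]), row[2]) for row in rows if len(row) >= min_fields]


myList = [
    ('2012-01-01', '09:00', 'Angel', "Men's Clothing", '214.05', 'Amex'),
    ('',),  # what a blank line becomes after split("\t")
    (),     # an empty tuple, as in the example above
    ('2012-01-01', '09:00', 'Charlie', 'Music', '66.08', 'Cash'),
]

nameList = extract_amounts(myList)
print(nameList)  # [(214.05, 'Angel'), (66.08, 'Charlie')]
```

This only guards against short rows. A row whose fifth field is not a number would still raise a `ValueError` in `float()`.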

What is wrong with my tuple?
