I have a Python list that looks like this:
['', '2015-10-21 00:00:03', 'jp/ja/fedex/inet/label/international' ]
[398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic' ]
[878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home' ]
['87878', '', 'cn/zhs/fedex/inet/label/international']
['', '2015-10-21 00:00:18', '' ]
[5454, '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking' ]
['', '2015-10-21 00:00:21', 'sg/en/fedex/inet/label/international' ]
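This 2D list has 3 columns and more than 10,000 rows.
As you can see, some rows are missing the element at [0], some are missing the element at [1], and some are missing the element at [2].
Some rows have all three elements.
I need to remove every row that does not have all three elements.
In other words, if a row is missing even one element, it has to be removed.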
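So, for the list above, rows [0], [3], [4], and [6] need to be removed.
After the deletion, the list should look like this:
[398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic' ]
[878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home' ]
[5454, '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking' ]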
I was thinking of something like this:
for i in range(len(D)):  # D is the name of my list
    if D[i][0] == '' or D[i][1] == '' or D[i][2] == '':
        del D[i]
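But this doesn't work: range(len(D)) is computed once, while every deletion shrinks the list, so the loop skips rows and eventually indexes past the end instead of visiting the whole list.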
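To make the failure concrete, here is a minimal sketch on a small made-up list (the data and the names `rows`, `broken`, and `fixed` are illustrative, not taken from the real data). The first loop reproduces the problem; the second shows one common workaround, walking the indices backwards so that a deletion never shifts a row that still has to be visited.

```python
rows = [['', 'a', 'b'], [1, 'c', 'd'], [2, '', 'e'], [3, 'f', 'g']]

# Forward deletion: range(len(...)) is evaluated once, so the indices
# eventually run past the end of the shrinking list.
broken = [list(r) for r in rows]
try:
    for i in range(len(broken)):
        if '' in broken[i]:
            del broken[i]
    failed = False
except IndexError:
    failed = True

# Backward deletion: removing index i only shifts the rows after i,
# and those have already been checked.
fixed = [list(r) for r in rows]
for i in range(len(fixed) - 1, -1, -1):
    if '' in fixed[i]:
        del fixed[i]

print(failed)
print(fixed)
```

Building a new list is usually simpler than either loop; the backward loop is mainly useful when the list really has to be modified in place.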
I also thought about this:
for item in D:
    if item[0] == '' or item[1] == '' or item[2] == '':
        del item
This doesn't work at all.
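The reason is that `del item` only unbinds the loop variable; it never touches the list the row came from. A small sketch on made-up data (the names `rows` and `alias` are illustrative) shows this, along with slice assignment as one possible way to filter the same list object in place:

```python
rows = [['', 'a', 'b'], [1, 'c', 'd'], [2, '', 'e']]

for item in rows:
    if '' in item:
        del item  # removes the name `item`, not the element inside `rows`

unchanged_length = len(rows)  # the list still holds every row

# Slice assignment replaces the contents of the existing list object,
# so other references to the same list see the filtered rows too.
alias = rows
rows[:] = [item for item in rows if '' not in item]

print(unchanged_length)
print(rows)
print(alias is rows)
```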
I would really appreciate any suggestions.
Answer 0 (score: 3)
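I would use D = list(filter(all, D)) or D = list(filter(lambda x: '' not in x, D)), depending on your exact definition of "empty".
The first drops any row that contains a falsy value (such as '', 0, or None); the second drops only rows that contain an empty string. In Python 3, filter returns an iterator, so the list() call is needed to get a list back.
Consider this program: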
from pprint import pprint

D = [
    ['', '2015-10-21 00:00:03', 'jp/ja/fedex/inet/label/international' ],
    [398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic' ],
    [878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home' ],
    ['87878', '', 'cn/zhs/fedex/inet/label/international'],
    ['', '2015-10-21 00:00:18', '' ],
    [5454, '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking' ],
    ['', '2015-10-21 00:00:21', 'sg/en/fedex/inet/label/international' ],
]

# Wrap filter() in list(): in Python 3 it returns a lazy iterator,
# and two filter objects never compare equal.
D2 = list(filter(all, D))                    # keep rows whose fields are all truthy
D3 = list(filter(lambda x: '' not in x, D))  # keep rows with no empty string

assert D2 == D3  # both definitions agree on this data
pprint(D2)
pprint(D3)
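The two filters agree on the sample data, but they can diverge on other inputs. The sketch below uses made-up rows (not from the original list) that contain 0 and None to show where the definitions of "empty" differ:

```python
rows = [
    [0, '2015-10-21 00:00:10', 'us/en'],    # 0 is falsy but is not ''
    [7, None, 'us/en'],                     # None is falsy but is not ''
    ['', '2015-10-21 00:00:03', 'jp/ja'],   # contains an empty string
    [5, '2015-10-21 00:00:19', 'sg/en'],    # complete row
]

# Truthiness test: any falsy field disqualifies the row.
by_truthiness = list(filter(all, rows))

# Empty-string test: only a literal '' disqualifies the row.
by_empty_string = [r for r in rows if '' not in r]

print(by_truthiness)
print(by_empty_string)
```

If 0 could ever be a legitimate ID in the data, the empty-string test is probably the safer of the two.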
Answer 1 (score: 1)
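For the record, it would have been helpful if you had posted your sample data as an actual list that I could copy and paste.
The all function returns True only when every element of its argument is truthy. For example: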
>>> all([1, 2, 3])
True
>>> all(['', 2, 3])
False
>>> all([1, 2, 0])
False
By iterating over the list of lists in a list comprehension, you can produce what you want fairly easily.
tlist = [
    ['', '2015-10-21 00:00:03', 'jp/ja/fedex/inet/label/international' ],
    [398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic' ],
    [878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home' ],
    ['87878', '', 'cn/zhs/fedex/inet/label/international'],
    ['', '2015-10-21 00:00:18', '' ],
    [5454, '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking' ],
    ['', '2015-10-21 00:00:21', 'sg/en/fedex/inet/label/international' ]]

# Keep only the rows in which every field is truthy
result = [r for r in tlist if all(r)]
result will now contain
[[398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic'],
 [878787, '2015-10-21 00:00:16', 'us/en/fedex/fedexcares/home'],
 [5454, '2015-10-21 00:00:19', 'us/en/fedex/sameday/main tracking']]
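If "missing" should also cover rows that are too short or fields that hold only whitespace, the condition can be moved into a small helper. This is only a sketch under those assumptions: `is_complete` and its `width` parameter are hypothetical names, and the whitespace-only, short, and zero-ID rows below are made up for illustration.

```python
def is_complete(row, width=3):
    """Return True if row has exactly `width` fields and none is blank."""
    if len(row) != width:
        return False
    for field in row:
        if field is None:
            return False
        # Assumption: a string of only whitespace counts as missing
        if isinstance(field, str) and field.strip() == '':
            return False
    return True


tlist = [
    ['', '2015-10-21 00:00:03', 'jp/ja/fedex/inet/label/international'],
    [398798, '2015-10-21 00:00:10', 'us/en/fedex/inet/label/domestic'],
    ['87878', '   ', 'cn/zhs/fedex/inet/label/international'],  # whitespace only (made up)
    [5454, '2015-10-21 00:00:19'],                               # short row (made up)
    [0, '2015-10-21 00:00:25', 'us/en/fedex/fedexcares/home'],   # ID 0 is kept (made up)
]

result = [row for row in tlist if is_complete(row)]
print(result)
```

Unlike all(r), this helper keeps a row whose ID is 0, which may or may not be what you want.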