Question

I have a Python data structure like this:

dl= [{'plat': 'unix',   'val':['', '',   '1ju', '', '',   '202', '',   '']},
     {'plat': 'Ios',    'val':['', '',   '',    '', 'Ty', '',    'Jk', '']},
     {'plat': 'NT',     'val':['', '',   1,     '', '' ,  '202', '',   '']},
     {'plat': 'centOs', 'val':['', '',   '',    '', '',   '202', '',   '']},
     {'plat': 'ubuntu', 'val':['', 'KL', '1',   '', '',   '',    '',   '9i0']}]
                                ^                ^
                                |                |
                                \                /
                                   Delete these

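I am trying to remove the positions in the 'val' lists where the value in that same column is empty in every list. In the example above (dl), those are positions 0 and 3. I want to get output like this: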

Output= [{'plat': 'unix',   'val':['',   '1ju', '',   '202', '',   '']},
         {'plat': 'Ios',    'val':['',   '',    'Ty', '',    'Jk', '']},
         {'plat': 'NT',     'val':['',   1,     '' ,  '202', '',   '']},
         {'plat': 'centOs', 'val':['',   '',    '',   '202', '',   '']},
         {'plat': 'ubuntu', 'val':['KL', '1',   '',   '',    '',   '9i0']}]

Answer 1

Let's do it in two steps. First, find the indices to remove:

lists = [e['val'] for e in dl]
idx_to_remove = [i for i, elem in enumerate(map(any, zip(*lists))) if not elem]

Second, filter the original lists in place:

for l in lists:
    l[:] = [elem for i, elem in enumerate(l) if i not in idx_to_remove]

Result:

>>> import pprint
>>> pprint.pprint(dl)
[{'plat': 'unix', 'val': ['', '1ju', '', '202', '', '']},
 {'plat': 'Ios', 'val': ['', '', 'Ty', '', 'Jk', '']},
 {'plat': 'NT', 'val': ['', 1, '', '202', '', '']},
 {'plat': 'centOs', 'val': ['', '', '', '202', '', '']},
 {'plat': 'ubuntu', 'val': ['KL', '1', '', '', '', '9i0']}]
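
The two steps above can be wrapped into a single function. This is a minimal sketch, not the answerer's code: the name drop_empty_columns and the key parameter are assumptions made for illustration, and a set replaces the list so the membership test stays cheap.

```python
def drop_empty_columns(records, key='val'):
    """Remove, in place, every position that is falsy in all records[key] lists.

    The function name and the `key` parameter are illustrative assumptions.
    """
    lists = [rec[key] for rec in records]
    # A column is removable when no row holds a truthy value at that position.
    idx_to_remove = {i for i, col in enumerate(zip(*lists)) if not any(col)}
    for row in lists:
        # Slice assignment mutates the list object stored in each dict.
        row[:] = [v for i, v in enumerate(row) if i not in idx_to_remove]
    return idx_to_remove


dl = [
    {'plat': 'unix',   'val': ['', '',   '1ju', '', '',   '202', '',   '']},
    {'plat': 'Ios',    'val': ['', '',   '',    '', 'Ty', '',    'Jk', '']},
    {'plat': 'NT',     'val': ['', '',   1,     '', '',   '202', '',   '']},
    {'plat': 'centOs', 'val': ['', '',   '',    '', '',   '202', '',   '']},
    {'plat': 'ubuntu', 'val': ['', 'KL', '1',   '', '',   '',    '',   '9i0']},
]
removed = drop_empty_columns(dl)
print(removed)  # {0, 3}
```

Two caveats apply to this approach: zip() stops at the shortest list, so rows of unequal length are only compared up to that length, and any() treats every falsy value (0, None, '') as empty, not just the empty string.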

Answer 2

dl= [{'plat': 'unix',   'val':['', '', '1ju', '', '', '202',  '',   '']},
     {'plat': 'Ios',    'val':['', '',  '',   '', 'Ty', '',     'Jk', '']},
     {'plat': 'NT',     'val':['', '',   1,   '', '' , '202', '',   '']},
     {'plat': 'centOs', 'val':['', '',  '',   '', '',  '202', '',   '']},
     {'plat': 'ubuntu', 'val':['', 'KL','1',  '', '',   '',   '',   '9i0']}]

def empty_indices(lst):
  return {i for i,v in enumerate(lst) if not v}

# Need to special-case the first one to initialize the set of "empty" indices.
remove_idx = empty_indices(dl[0]['val'])
# Here we process the first item twice.  We could use itertools.islice to
# skip it, but it's probably not worth the minuscule speedup.
for item in dl:
  remove_idx &= empty_indices(item['val'])

for item in dl:
    item['val'] = [k for i,k in enumerate(item['val']) if i not in remove_idx]

# print the results.
import pprint
pprint.pprint(dl)
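
If the special-casing of the first item feels awkward, the same intersection can probably be written in one call. A small sketch, assuming a hypothetical helper named common_empty_indices (not part of the answer); it relies on set.intersection accepting any number of sets:

```python
def empty_indices(lst):
    # Indices whose value is falsy ('' in the question, but also 0 or None).
    return {i for i, v in enumerate(lst) if not v}


def common_empty_indices(records, key='val'):
    """Indices that are empty in every record. Hypothetical helper name."""
    index_sets = [empty_indices(rec[key]) for rec in records]
    # set.intersection needs at least one set, so guard the empty case.
    return set.intersection(*index_sets) if index_sets else set()


dl = [
    {'plat': 'unix',   'val': ['', '',   '1ju', '', '',   '202', '',   '']},
    {'plat': 'Ios',    'val': ['', '',   '',    '', 'Ty', '',    'Jk', '']},
    {'plat': 'NT',     'val': ['', '',   1,     '', '',   '202', '',   '']},
    {'plat': 'centOs', 'val': ['', '',   '',    '', '',   '202', '',   '']},
    {'plat': 'ubuntu', 'val': ['', 'KL', '1',   '', '',   '',    '',   '9i0']},
]
remove_idx = common_empty_indices(dl)
print(remove_idx)  # {0, 3}
```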

Answer 3

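from operator import itemgetter

# Create an iterator over columns. The built-in zip works on Python 2 and 3;
# itertools.izip exists only on Python 2.
columns = zip(*(d['val'] for d in dl))

# Build a function that keeps only the non-empty columns.
# Note: itemgetter returns a tuple only when given two or more indices.
keepfunc = itemgetter(*(i for i, c in enumerate(columns) if any(c)))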

# apply function to each list
for d in dl:
    d['val'] = list(keepfunc(d['val']))
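
One edge case worth knowing: itemgetter returns a bare value rather than a tuple when exactly one column survives, and raises TypeError when none do. The sketch below swaps in a different technique, itertools.compress with a boolean mask, which handles both cases; the name keep_mask and the two-row sample data are assumptions for illustration.

```python
from itertools import compress

# Sample data in which only one column (index 1) is non-empty.
dl = [
    {'plat': 'a', 'val': ['', 'x', '']},
    {'plat': 'b', 'val': ['', '',  '']},
]

# One boolean per column: True when at least one row has a truthy value there.
keep_mask = [any(col) for col in zip(*(d['val'] for d in dl))]

# compress() yields only the items whose mask entry is true.
for d in dl:
    d['val'] = list(compress(d['val'], keep_mask))

print(dl)  # [{'plat': 'a', 'val': ['x']}, {'plat': 'b', 'val': ['']}]
```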

Answer 4

Another possible solution (not very efficient, but nice...). zip() is really underrated...

# extract the values as a list of lists
vals = [item["val"] for item in dl]
# transpose lines to columns
cols = [list(c) for c in zip(*vals)]
# filter out empty columns (any() also works on Python 3, where
# filter() returns an iterator that is always truthy)
cols = [c for c in cols if any(c)]
# transpose columns back to lines
lines = [list(l) for l in zip(*cols)]
# build the new list of dicts
output = [
    dict(plat=item["plat"], val=line) for item, line in zip(dl, lines)
]

Remove empty elements from lists
