I want to split a list of dictionaries after every dictionary whose value is empty, and collect the pieces into a new list of lists.
Input:
[{'k':'a'},{'k':'b'},{'k':''},{'k':'d'},{'k':''},{'k':'f'},{'k':'g'}]
Output:
[[{'k': 'a'}, {'k': 'b'}, {'k': ''}], [{'k': 'd'}, {'k': ''}], [{'k': 'f'}, {'k': 'g'}]]
I tried a for loop with an if, and it works fine:
sub_header_list = [{'k':'a'},{'k':'b'},{'k':''},{'k':'d'},{'k':''},{'k':'f'},{'k':'g'}]
index_data = []
data_list = []
for i in sub_header_list:
    index_data.append(i)
    if i['k'] == '':
        data_list.append(index_data)
        index_data = []
print(data_list+[index_data])
[[{'k': 'a'}, {'k': 'b'}, {'k': ''}], [{'k': 'd'}, {'k': ''}], [{'k': 'f'}, {'k': 'g'}]]
Is there a more Pythonic way to achieve the same thing, for example with built-in functions?
Answer 0 (score: 2)
You can use groupby:
from itertools import groupby, chain
l = [{'k':'a'},{'k':'b'},{'k':''},{'k':'d'},{'k':''},{'k':'f'},{'k':'g'}]
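grps = groupby(l, lambda d: d["k"] == "")
# Take each run of non-empty values (k is False) plus the empty run after it.
# list(v) must come first: advancing grps invalidates the previous group.
print([list(chain(list(v), next(grps, [[], []])[1])) for k, v in grps if not k])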
Output:
[[{'k': 'a'}, {'k': 'b'}, {'k': ''}], [{'k': 'd'}, {'k': ''}], [{'k': 'f'}, {'k': 'g'}]]
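It can help to look at what groupby actually yields for this input. The sketch below is illustrative only (the names `runs`, `is_empty` and `values` are not from the answer): it materializes every group as soon as it is produced, which shows that runs of non-empty values alternate with runs of empty ones.

```python
from itertools import groupby

l = [{'k': 'a'}, {'k': 'b'}, {'k': ''}, {'k': 'd'}, {'k': ''}, {'k': 'f'}, {'k': 'g'}]

# Materialize each group immediately: a group iterator shares the underlying
# iterator with groupby and is no longer usable once groupby advances.
runs = [(is_empty, [d['k'] for d in group])
        for is_empty, group in groupby(l, lambda d: d['k'] == '')]

for is_empty, values in runs:
    print(is_empty, values)
# False ['a', 'b']
# True ['']
# False ['d']
# True ['']
# False ['f', 'g']
```

Joining each False run with the True run that follows it gives the chunks asked for in the question. Note that two empty values in a row would land in the same True run, so this approach would not split between them.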
Or use a generator function:
def grp(lst):
    temp = []
    for dct in lst:
        temp.append(dct)
        # `not dct["k"]` also catches None and 0; to match only empty
        # strings, use `if dct["k"] == "":` instead.
        if not dct["k"]:
            yield temp
            temp = []
    yield temp
It gives you the same output:
In [9]: list(grp(l))
Out[9]:
[[{'k': 'a'}, {'k': 'b'}, {'k': ''}],
[{'k': 'd'}, {'k': ''}],
[{'k': 'f'}, {'k': 'g'}]]
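As a possible variation, the same generator idea can take the boundary test as a parameter and skip the final chunk when it is empty (which happens when the input ends on an empty value). This is only a sketch: the names `split_after` and `is_boundary` are made up for illustration and are not part of the answer.

```python
def split_after(items, is_boundary):
    """Yield chunks of `items`, closing a chunk after each boundary item."""
    chunk = []
    for item in items:
        chunk.append(item)
        if is_boundary(item):
            yield chunk
            chunk = []
    if chunk:  # skip the trailing chunk when the input ends on a boundary
        yield chunk


l = [{'k': 'a'}, {'k': 'b'}, {'k': ''}, {'k': 'd'}, {'k': ''}, {'k': 'f'}, {'k': 'g'}]
result = list(split_after(l, lambda d: d['k'] == ''))
print(result)
```

Passing the predicate in keeps the choice between `d['k'] == ''` and `not d['k']` at the call site instead of inside the generator.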
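In the timings below, the generator function is the fastest of the three approaches (the session assumes choice has been imported from random, and groupby and chain from itertools):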
In [8]: l = [{'k':'a'}, {'k':'b'}, {'k':''}, {'k':'d'}, {'k':''}, {'k':'f'}, {'k':'g'}]
In [9]: l = [dict(choice(l)) for _ in range(100000)]
In [10]: timeit list(grp(l))
10 loops, best of 3: 19.5 ms per loop
In [11]: %%timeit
index_list = [i + 1 for i, x in enumerate(l) if x == {'k': ''}]
[l[i:j] for i, j in zip([0] + index_list, index_list + [len(l)])]
....:
10 loops, best of 3: 31.6 ms per loop
In [12]: %%timeit grps = groupby(l, lambda d: d["k"] == "")
[list(chain(*(v, next(grps, [[], []])[1]))) for k, v in grps if k]
....:
10 loops, best of 3: 40 ms per loop
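To reproduce a comparison like this outside IPython, a standalone script using the timeit module might look like the sketch below. The helper name `by_slicing`, the fixed seed and the smaller input size are assumptions made for the example, and the numbers it prints will differ from machine to machine.

```python
import random
import timeit


def grp(lst):
    """Generator approach: close a chunk after each falsy 'k' value."""
    temp = []
    for dct in lst:
        temp.append(dct)
        if not dct["k"]:
            yield temp
            temp = []
    yield temp


def by_slicing(lst):
    """Index-and-slice approach: cut right after each {'k': ''}."""
    index_list = [i + 1 for i, x in enumerate(lst) if x == {'k': ''}]
    return [lst[i:j] for i, j in zip([0] + index_list, index_list + [len(lst)])]


random.seed(0)  # fixed seed so every run times the same data
base = [{'k': 'a'}, {'k': 'b'}, {'k': ''}, {'k': 'd'}, {'k': ''}, {'k': 'f'}, {'k': 'g'}]
data = [dict(random.choice(base)) for _ in range(10_000)]

# Both approaches should agree before their timings are compared.
assert list(grp(data)) == by_slicing(data)

candidates = [
    ("generator", lambda: list(grp(data))),
    ("slicing", lambda: by_slicing(data)),
]
for name, fn in candidates:
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{name}: {best * 1e3:.2f} ms per call")
```

Checking that the approaches return the same result first guards against timing two functions that do different work.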
Answer 1 (score: 2)
Here is another Pythonic way:
>>> d = [{'k':'a'}, {'k':'b'}, {'k':''}, {'k':'d'}, {'k':''}, {'k':'f'}, {'k':'g'}]
>>> index_list = [i + 1 for i, x in enumerate(d) if x == {'k': ''}]
>>> [d[i:j] for i, j in zip([0] + index_list, index_list + [len(d)])]
[[{'k': 'a'}, {'k': 'b'}, {'k': ''}], [{'k': 'd'}, {'k': ''}], [{'k': 'f'}, {'k': 'g'}]]
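The intermediate values make the slicing approach easier to follow. In the sketch below, `starts` and `stops` are names introduced only for illustration: `index_list` holds the position just past each empty dictionary, and pairing `[0] + index_list` with `index_list + [len(d)]` gives the start and stop of every slice.

```python
d = [{'k': 'a'}, {'k': 'b'}, {'k': ''}, {'k': 'd'}, {'k': ''}, {'k': 'f'}, {'k': 'g'}]

# Position just past each empty dictionary, i.e. where the next chunk starts.
index_list = [i + 1 for i, x in enumerate(d) if x == {'k': ''}]
print(index_list)  # [3, 5]

starts = [0] + index_list
stops = index_list + [len(d)]
print(list(zip(starts, stops)))  # [(0, 3), (3, 5), (5, 7)]

chunks = [d[i:j] for i, j in zip(starts, stops)]
print(chunks)
```

Because `x == {'k': ''}` compares the whole dictionary, a dictionary with additional keys would not count as a split point; testing `x['k'] == ''` instead would cover that case.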