Ignoring certain elements in nested lists

Time: 2018-03-13 18:33:18

Tags: python web-scraping list-comprehension nested-lists

I have the following list:

[["abc","cdf","efgh","x","hijk","y","z"],["xyz","qwerty","uiop","x","asdf","y","z"]]

I want the following output:

[["abc","cdf","efgh","hijk"],["xyz","qwerty","uiop","asdf"]]

How can I perform the split operation here? PS: The original data is very large. Original data: http://pasted.co/20f85ce5

1 answer:

Answer 0: (score: 0)

I would use a nested list comprehension for your task.

old = [["abc","cdf","efgh","x","hijk","y","z"],["xyz","qwerty","uiop","x","asdf","y","z"]]

droplist = ["x", "y", "z"]
new = [[item for item in sublist if item not in droplist] for sublist in old]
print(new)

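Note how the new list is created. The outer list comprehension iterates over each sublist. The inner list comprehension iterates over the individual strings in that sublist.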

The filtering happens in the condition if item not in droplist: only strings that are not in droplist are kept.
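Since the question mentions that the original data is very large, the following is a minimal sketch of the same filter using a set instead of a list for the values to drop. The name dropset is an assumption introduced here for illustration. Membership tests against a set take constant time on average, while a list is scanned element by element, so this may help when there are many values to drop.

```python
old = [["abc", "cdf", "efgh", "x", "hijk", "y", "z"],
       ["xyz", "qwerty", "uiop", "x", "asdf", "y", "z"]]

# Hypothetical name: a set of the values to drop.
# "item not in dropset" is an average constant-time lookup,
# whereas the same test against a list scans the whole list.
dropset = {"x", "y", "z"}

new = [[item for item in sublist if item not in dropset] for sublist in old]
print(new)
```

With only three values to drop the difference is negligible; it becomes more relevant as the number of values to drop grows.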

The condition if item not in droplist can be replaced with any condition you can code. For example:

new = [[item for item in sublist if len(item) >= 3] for sublist in old]

Or even:

def do_I_like_it(s):
    # Arbitrary code to decide if `s` is worth keeping
    return True
new = [[item for item in sublist if do_I_like_it(item)] for sublist in old]

If you want to remove items by their position in the sublist, use slicing:

# Remove last 2 elements of each sublist
new = [sublist[:-2] for sublist in old]
assert new == [['abc', 'cdf', 'efgh', 'x', 'hijk'], ['xyz', 'qwerty', 'uiop', 'x', 'asdf']]
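
Slicing only removes a contiguous run of positions. If the positions to drop are scattered, one possible approach is to combine enumerate with the same comprehension pattern. This is a sketch that assumes the unwanted items always sit at indices 3, 5 and 6, as they do in the sample data; the name drop_positions is hypothetical.

```python
old = [["abc", "cdf", "efgh", "x", "hijk", "y", "z"],
       ["xyz", "qwerty", "uiop", "x", "asdf", "y", "z"]]

# Hypothetical name: indices to remove from every sublist.
# Assumes each sublist has the same layout as the sample data.
drop_positions = {3, 5, 6}

# enumerate yields (index, item) pairs, so the filter can test the index
# instead of the value.
new = [[item for i, item in enumerate(sublist) if i not in drop_positions]
       for sublist in old]
print(new)
```

This keeps a string such as "x" when it appears at a position that is not listed, which the value-based filter would drop.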