Question

Given that clusterLists is a list of lists of tuples:

clusterLists = [[(1.182, "monthly_1"), (1.181, '0_Retrace_3_H1')]
                , [(1.1502, '1_Retrace_5_M15'), (1.1493, '1_Retrace_5_M15')]]

I can filter the list as follows:

for clust in clusterLists:
    if not sum([x[1].endswith("_M15") for x in clust]) >= 2:
        if not sum([x[1].endswith("_H1") for x in clust]) >= 2:
            print(clust)

Output:

[(1.182, 'monthly_1'), (1.181, '0_Retrace_3_H1')]

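How can I perform this conditional check in a more Pythonic way and still get the same output? That is, I would like to add further checks such as if not sum([x[1].endswith("_H6") and so on, without adding a new line for each one.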

Answer 1

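First, it may help to know that, although it will not eliminate any lines, you can save a lot of unnecessary indentation by using and instead of many nested ifs:

for cluster in clusters:
    if (
        not sum([x[1].endswith("_M15") for x in cluster]) >= 2
        and not sum([x[1].endswith("_H1") for x in cluster]) >= 2
    ):
        print(cluster)

That way, your inner code does not have to be indented further and further as you add conditions.

But to actually answer your question, you could define a function that reads its checks from a conditions dictionary, so that adding a condition only requires adding an entry to that dictionary, with no needless repetition.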
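One way such a function could look, as a minimal sketch rather than the answerer's exact code. The names passes and kept, and the layout of conditions (suffix mapped to the match count at which a cluster is rejected), are assumptions made for illustration; the sample data is copied from the question.

```python
# Sample data copied from the question.
clusters = [
    [(1.182, "monthly_1"), (1.181, "0_Retrace_3_H1")],
    [(1.1502, "1_Retrace_5_M15"), (1.1493, "1_Retrace_5_M15")],
]

# Assumed layout: suffix -> number of matches at which a cluster is rejected.
conditions = {"_M15": 2, "_H1": 2}


def passes(cluster, conditions):
    """Return True if no suffix reaches its rejection count in this cluster."""
    return all(
        # Booleans sum as 0/1, so this counts the names ending with `suffix`.
        sum(name.endswith(suffix) for _, name in cluster) < limit
        for suffix, limit in conditions.items()
    )


kept = [cluster for cluster in clusters if passes(cluster, conditions)]
for cluster in kept:
    print(cluster)
```

With this layout, supporting "_H6" would only mean adding one more entry to conditions.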

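Note that, as in your original code, this iterates over each cluster twice (or more, as you add conditions). If you end up with a very long list of clusters and/or a lot of conditions, this could become a performance problem.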

Answer 2

You can add a function and reuse your code:

def check_endings(cluster, ending, limit):
    # True when fewer than `limit` names in the cluster end with `ending`.
    return not sum([x[1].endswith(ending) for x in cluster]) >= limit

clusters_to_print = [
    cluster for cluster in clusterLists
    if check_endings(cluster, ending="_M15", limit=2)
    and check_endings(cluster, ending="_H1", limit=2)
]

for cluster in clusters_to_print:
    print(cluster)

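Technically this uses more lines, even if you do not count the function, but that is only because the list comprehension is split across several lines for readability.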
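As a small variation on the helper above, the check can also be written with a generator expression and a direct < comparison, which avoids building a temporary list and the double negative. This is only a sketch: the name has_fewer_than is hypothetical, and the sample data is copied from the question.

```python
# Sample data copied from the question.
clusterLists = [
    [(1.182, "monthly_1"), (1.181, "0_Retrace_3_H1")],
    [(1.1502, "1_Retrace_5_M15"), (1.1493, "1_Retrace_5_M15")],
]


def has_fewer_than(cluster, ending, limit):
    # Booleans sum as 0/1, so this counts the names that end with `ending`.
    return sum(name.endswith(ending) for _, name in cluster) < limit


clusters_to_print = [
    cluster for cluster in clusterLists
    if has_fewer_than(cluster, "_M15", 2) and has_fewer_than(cluster, "_H1", 2)
]

for cluster in clusters_to_print:
    print(cluster)
```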

Answer 3

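If you always use the last _<value> as the condition, I would probably just use collections.Counter to collect all the sums keyed by that suffix, and then you can check them all in a single pass: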

In [187]: from collections import Counter
     ...: conditions = ["M15", "H1"]
     ...: for clust in clusterLists:
     ...:     c = Counter()
     ...:     for val, name in clust:
     ...:         c[name.split("_")[-1]] += val
     ...:     if all(c[cond] < 2 for cond in conditions):
     ...:         print(clust)

[(1.182, 'monthly_1'), (1.181, '0_Retrace_3_H1')]

If the logic differs per field, you can also extend this easily by specifying it in the list:

In [188]: from collections import Counter
     ...: conditions = [("M15", lambda x: x < 2), ("H1", lambda x: x < 2)]
     ...: for clust in clusterLists:
     ...:     c = Counter()
     ...:     for val, name in clust:
     ...:         c[name.split("_")[-1]] += val
     ...:     if all(cond(c[col]) for col, cond in conditions):
     ...:         print(clust)

[(1.182, 'monthly_1'), (1.181, '0_Retrace_3_H1')]

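Edit: I now see that you are actually trying to count occurrences rather than sum the associated values. That is easy to fix here: just swap += val for += 1, or use Counter's ability to take a list as input: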

In [5]: conditions = ["M15", "H1"]
   ...: for clust in clusterLists:
   ...:    c = Counter([name.split("_")[-1] for _, name in clust])
   ...:    if all(c[cond] < 2 for cond in conditions):
   ...:        print(clust)
   ...:
[(1.182, 'monthly_1'), (1.181, '0_Retrace_3_H1')]
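To make it clearer what the Counter holds for each cluster, here is a minimal sketch that prints the per-suffix counts for the question's sample data. The helper name suffix_counts is hypothetical, and rsplit("_", 1)[-1] is used here as an equivalent of split("_")[-1].

```python
from collections import Counter

# Sample data copied from the question.
clusterLists = [
    [(1.182, "monthly_1"), (1.181, "0_Retrace_3_H1")],
    [(1.1502, "1_Retrace_5_M15"), (1.1493, "1_Retrace_5_M15")],
]


def suffix_counts(cluster):
    # Count the text after the last underscore of each name.
    return Counter(name.rsplit("_", 1)[-1] for _, name in cluster)


for clust in clusterLists:
    print(dict(suffix_counts(clust)))
# {'1': 1, 'H1': 1}
# {'M15': 2}
```

Note that "monthly_1" contributes the key "1", which is harmless as long as no condition is named "1", and that a Counter returns 0 for missing keys, which is why c[cond] works for suffixes that never appear.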

Answer 4

Perhaps create an array of conditions.

cluster_list = [
    [(1.182, "monthly_1"), (1.181, '0_Retrace_3_H1')],
    [(1.1502, '1_Retrace_5_M15'), (1.1493, '1_Retrace_5_M15')]
]
conditions = ["_M15", "_H1", "_H6"]

for cluster in cluster_list:
    matches = [[x[1].endswith(condition) for x in cluster] for condition in conditions]
    if all([not sum(i) >= 2 for i in matches]):
        print(cluster)

# [(1.182, 'monthly_1'), (1.181, '0_Retrace_3_H1')]

A Pythonic way to implement this conditional filter
