Comparing distributions of values to specific values

Time: 2020-06-11 19:09:48

Tags: python

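I have 1000 distributions, each containing 459 floats between 0.0 and 1.0, stored in the variable prop_list_test2.

I also have 1000 values, stored as p_95_null, one to compare against each distribution. For each distribution, I am trying to find the proportion of its values that are >= the corresponding value in p_95_null. So the first distribution in prop_list_test2 should be compared with the first value in p_95_null, and so on, until I have an array pv of 1000 proportions.

Here is my attempt, though it is a very messy and non-Pythonic way of doing it: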

pv = []
index = 0

comp = p_95_null[index] #What we're comparing it to
truth_list = []

while index<len(p_95_null):
    test_list = [] #Which distribution from prop_list_test2 we are using
    truth_list = []    
    for i in prop_list_test2[index]:
        test_list.append(i)

    for i in test_list:
        if i >= comp:
            truth_list.append(True)
            test_list = []
            index+=1
        elif i < comp:
            truth_list.append(False)
            test_list = []
            index+=1

    pv.append((sum(truth_list)/len(truth_list)))


print(pv)

My output is [0.06318082788671024, 0.058823529411764705, 0.058823529411764705]. Something is not working correctly: I expect 1000 values in pv but only get 3. I can't figure out which part of my code is causing this.

1 answer:

Answer 0: (score: 1)

Here is the Pythonic way to do this:

pv = [sum(v >= p_95 for v in values)/len(values) 
      for values, p_95 in zip(prop_list_test2, p_95_null)]

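Explanation:

  • Overall, this (pv = [... for ... in ...]) is a list comprehension, a Python syntax that helps with mapping sequences
  • zip(...) pairs each list of float values with its p95 threshold, so it is easier to iterate without juggling indexes
  • The left part is almost the same as the last line of your code. The only difference is that the inner for loop is replaced with a generator expression, which is then passed to sum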
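
To see the comprehension at work, here is a minimal sketch on made-up data. The small prop_list_test2 and p_95_null below are hypothetical stand-ins for the 1000 distributions and thresholds in the question, and the comparison uses >= as the question describes.

```python
# Hypothetical stand-in data: 3 distributions of 4 floats each,
# with one threshold per distribution.
prop_list_test2 = [
    [0.1, 0.5, 0.9, 0.7],
    [0.2, 0.3, 0.4, 0.6],
    [0.8, 0.8, 0.1, 0.0],
]
p_95_null = [0.7, 0.5, 0.8]

# zip pairs each distribution with its own threshold.
pairs = list(zip(prop_list_test2, p_95_null))

# Each comparison yields True or False; sum counts the True values
# because bool is a subclass of int (True == 1, False == 0).
pv = [sum(v >= p_95 for v in values) / len(values)
      for values, p_95 in pairs]

print(pv)  # [0.5, 0.25, 0.5]
```

The result has one proportion per distribution, so with the real data it would contain 1000 entries.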

Code review:

pv = []
index = 0

# Note: comp is assigned once, before the loop, so it always stays p_95_null[0]
comp = p_95_null[index] #What we're comparing it to
truth_list = []

# nothing is wrong with this line, but it would be more appropriate to:
# for index, test_list in enumerate(prop_list_test2):
while index<len(p_95_null):
    test_list = [] #Which distribution from prop_list_test2 we are using
    truth_list = []

    for i in prop_list_test2[index]:
        test_list.append(i)

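    # This is why it fails: the while loop uses index to pick a sublist of
    # prop_list_test2, but here it is incremented once per value in that sublist.
    # With 459 values per sublist, index goes 0 -> 459 -> 918 -> 1377,
    # so the while body runs only 3 times.
    # Instead, `index+=1` should be moved out of the for loop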
    for i in test_list:
        if i >= comp:
            truth_list.append(True)
            test_list = []
            index+=1
        elif i < comp:
            truth_list.append(False)
            test_list = []
            index+=1

    pv.append((sum(truth_list)/len(truth_list)))
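
One way to repair the loop while keeping its overall shape is sketched below. It assumes the data layout from the question (one threshold per distribution). The function names proportions_at_or_above and buggy_outer_iterations are hypothetical, introduced only for illustration. The second helper does nothing but replay the index arithmetic of the original loop, to show why 1000 distributions of 459 values produce exactly 3 results.

```python
def proportions_at_or_above(distributions, thresholds):
    """For each distribution, return the share of values >= its threshold."""
    pv = []
    index = 0
    while index < len(thresholds):
        comp = thresholds[index]  # refreshed per distribution, not set once
        truth_list = [value >= comp for value in distributions[index]]
        pv.append(sum(truth_list) / len(truth_list))
        index += 1  # advance once per distribution, outside the inner loop
    return pv


def buggy_outer_iterations(n_distributions, n_values):
    """Count how often the original while loop body runs.

    The original code adds 1 to index for every value in a sublist,
    so index grows by n_values on each pass of the while loop.
    """
    index = 0
    passes = 0
    while index < n_distributions:
        index += n_values
        passes += 1
    return passes


print(proportions_at_or_above([[0.1, 0.9], [0.4, 0.6]], [0.5, 0.7]))  # [0.5, 0.0]
print(buggy_outer_iterations(1000, 459))  # 3
```

Both changes matter: moving the increment restores all 1000 passes, and reading comp inside the loop makes each distribution use its own threshold rather than the first one.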