Removing the single quotes from a list of lists

Time: 2019-11-29 21:42:04

Tags: python

I am building a list of validation losses for a regression model, and they are formatted as follows:

mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]',
          '[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]',
          '[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]',
          '[89.62320741326292]']

How can I put them into a plain list so that I can compute the mean/deviation?

5 answers:

Answer 0 (score: 0)

Use this:

import ast
[ast.literal_eval(i)[0] for i in mylist]
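
As a rough illustration of why this works, assuming every entry has the bracketed format from the question (the names `raw`, `parsed` and `value` are made up for this sketch): `ast.literal_eval` parses the string as a Python literal, so each entry becomes a one-element list, and `[0]` pulls the float out of it.

```python
import ast

# One entry in the format from the question: a string, not a list.
raw = '[72.49191836014535]'

parsed = ast.literal_eval(raw)  # parses the literal into a real one-element list
value = parsed[0]               # the single float inside that list

print(type(parsed).__name__, type(value).__name__)
```

Unlike `eval`, `ast.literal_eval` only accepts literals, so it will not run arbitrary code found in the strings.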

Full example with numpy:

import numpy as np, ast

mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]', 
          '[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]', 
          '[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]', 
          '[89.62320741326292]']

clean_list = np.array([ast.literal_eval(i)[0] for i in mylist])
#array([72.49191836, 83.83327374, 72.48327326, 66.98897377, 71.13875892,
#       64.38201065, 73.28287317, 79.71193158, 79.55777844, 89.62320741])

clean_list.mean()
# approximately 75.349

Full example without numpy:

import ast

mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]',
          '[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]', 
          '[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]', 
          '[89.62320741326292]']

clean_list = [ast.literal_eval(i)[0] for i in mylist]
#[72.49191836, 83.83327374, 72.48327326, 66.98897377, 71.13875892,
#       64.38201065, 73.28287317, 79.71193158, 79.55777844, 89.62320741]

average_list = sum(clean_list) / len(clean_list)
# approximately 75.349

Answer 1 (score: 0)

Without using eval:

import numpy as np

mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]', 
 '[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]', 
 '[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]', 
 '[89.62320741326292]']

mylist = np.array([float(i[1:-1]) for i in mylist])

mylist.mean()

Output:

75.34939993083671
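
The question also asks about the deviation. A minimal sketch of the same bracket-stripping idea, assuming the format shown above and using the standard library's `statistics` module instead of numpy; the helper name `parse_loss` is hypothetical, not something from the question:

```python
import statistics

def parse_loss(text):
    """Hypothetical helper: turn a string like '[72.49]' into a float."""
    # strip() tolerates surrounding whitespace; strip('[]') removes the brackets.
    return float(text.strip().strip('[]'))

mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]',
          '[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]',
          '[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]',
          '[89.62320741326292]']

values = [parse_loss(s) for s in mylist]

mean = statistics.mean(values)     # arithmetic mean
spread = statistics.stdev(values)  # sample standard deviation (divides by n - 1)

print(mean, spread)
```

`statistics.stdev` is the sample standard deviation; `statistics.pstdev` would be the choice if the ten values are treated as the whole population.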

Answer 2 (score: 0)

You can use a regular expression to remove [ and ], then convert the values to float:

import re  # the standard-library module is enough here; no third-party regex package needed

mylist = [
    "[72.49191836014535]",
    "[83.83327374257702]",
    "[72.48327325617225]",
    "[66.98897377186994]",
    "[71.13875892170039]",
    "[64.3820106481657]",
    "[73.28287317220448]",
    "[79.7119315804787]",
    "[79.55777844179023]",
    "[89.62320741326292]",
]

data = [float(re.sub(r"[\[\]]", "", v)) for v in mylist]

Output:

[72.49191836014535,
 83.83327374257702,
 72.48327325617225,
 66.98897377186994,
 71.13875892170039,
 64.3820106481657,
 73.28287317220448,
 79.7119315804787,
 79.55777844179023,
 89.62320741326292]
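
If some entries might not follow the bracketed format, one possible variation is to match the whole string and fail loudly instead of silently dropping characters. This is only a sketch under that assumption; `BRACKETED` and `to_float` are names invented for illustration:

```python
import re

# Matches an entire string of the form "[<something>]" and captures the inside.
BRACKETED = re.compile(r"\[\s*(.+?)\s*\]")

def to_float(text):
    """Hypothetical helper: parse '[72.49]' into 72.49, rejecting other shapes."""
    match = BRACKETED.fullmatch(text)
    if match is None:
        raise ValueError(f"unexpected format: {text!r}")
    return float(match.group(1))

data = [to_float(v) for v in ["[72.49191836014535]", "[83.83327374257702]"]]
print(data)
```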

Answer 3 (score: -2)

Another solution, similar to the ones above:

import ast
from statistics import mean

mylist = ['[72.49191836014535]', '[83.83327374257702]', '[72.48327325617225]', 
          '[66.98897377186994]', '[71.13875892170039]', '[64.3820106481657]', 
          '[73.28287317220448]', '[79.7119315804787]', '[79.55777844179023]', 
          '[89.62320741326292]']

# Named values rather than list, so the built-in list type is not shadowed
values = [ast.literal_eval(i)[0] for i in mylist]
print(mean(values))

Result:

75.3493999308367

Answer 4 (score: -4)

Expanding on the answers by Makis and Eric Day above:

import statistics

data = [ eval(a)[0] for a in mylist ]

# collate stats wanted for printing
stats = [sum, statistics.mean, statistics.stdev, statistics.variance]
for stat in stats:
    print(stat.__name__, stat(data))