In Python

Time: 2018-03-07 00:57:43

Tags: python

I have a small ML model. Based on the predictions it makes, I compute its performance metrics and append them to a list, like this:

results_to_save = [] 
results_to_save.append(('Filename:', required_filename,'Accuracy:',accuracy, 'Specificity:',specificity,'Precision',precision, 'Recall:',recall,'F-Score:',f_score)) 

with open('./metrics/results.txt', 'a') as outfile: 
    json.dump(results_to_save, outfile)   
    outfile.write("\n\n")    
    logger.info("SAVED METRICS ")    

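So if the model makes predictions for three files, the code block above runs three times, and the (rather annoying) output saved to the txt file looks like this: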

[["Filename:", "ab", "Accuracy:", 0.6662763466042154, "Specificity:", 0.8047138047138047, "Precision", 0.7075630252100841, "Recall:", 0.5152998776009792, "F-Score:", 0.5963172804532577]]

[["Filename:", "abc", "Accuracy:", 0.9545746535743783, "Specificity:", 0.9743440233236151, "Precision", 0.5875, "Recall:", 0.6194398682042833, "F-Score:", 0.603047313552526]]

[["Filename:", "abcd", "Accuracy:", 0.8568113251334416, "Specificity:", 0.9985740767146728, "Precision", 0.9744245524296675, "Recall:", 0.23738317757009345, "F-Score:", 0.3817635270541082]]

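As a result, comparing two such files becomes very time-consuming.

Is there a way to save the results in a structured, tabular form, like an R data frame or something similar, but in Python? That way the results would be saved in a more readable format, like this: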

filename | param1 | param2 | param3 | param4 
...         ...       ...       ...      ...
...         ...       ...       ...      ...

That is, each row corresponds to one specific file.

Thank you.

2 answers:

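Answer 0: (score: 1)

You can use Pandas or the csv module, either of which will do this for you. In fact, I would recommend csv. The more general answer is that what you really want to do is format the strings before writing them to the file. Something like this: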

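with open(output_path, "w") as fh:
    for result in results:
        # Convert each field to a string and escape any pipe characters in it
        parts = [str(part).replace("|", "\\|") for part in result]
        line = "|".join(parts)
        # End each record with a newline so every result gets its own row
        fh.write(line + "\n")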

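This takes each result, escapes every pipe character in each of its parts (always escape your delimiter character), joins the parts into a single line, and then writes that line to the file.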
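Since this answer recommends the csv module without showing it, here is a minimal sketch of that route. The function name `append_metrics` and the `FIELDNAMES` list are assumptions made for illustration; the column names are taken from the question's metrics. It writes the header only when the file is missing or empty, so it can be called once per predicted file as in the question.

```python
import csv
import os

# Column names assumed from the metrics listed in the question.
FIELDNAMES = ["Filename", "Accuracy", "Specificity", "Precision", "Recall", "F-Score"]


def append_metrics(path, metrics):
    """Append one row of metrics to a CSV file, writing the header only once."""
    # The header is needed only if the file does not exist yet or is still empty.
    need_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" is what the csv documentation recommends for files passed to a writer.
    with open(path, "a", newline="") as outfile:
        writer = csv.DictWriter(outfile, fieldnames=FIELDNAMES)
        if need_header:
            writer.writeheader()
        # metrics is a dict keyed by the names in FIELDNAMES.
        writer.writerow(metrics)
```

Each call then takes a dict such as {"Filename": "ab", "Accuracy": 0.67, ...}, and the csv module handles quoting of any field that happens to contain the delimiter.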

Answer 1: (score: 1)

Alternatively, you can write the tab-separated values (TSV) format without any modules. For example, like this:

results_to_save = [] 
results_to_save.append(('Filename:', 'ab', 'Accuracy:', 0.6662763466042154, 'Specificity:', 0.8047138047138047, 'Precision', 0.7075630252100841, 'Recall:', 0.5152998776009792, 'F-Score:', 0.5963172804532577))
results_to_save.append(('Filename:', 'abc', 'Accuracy:', 0.9545746535743783, 'Specificity:', 0.9743440233236151, 'Precision', 0.5875, 'Recall:', 0.6194398682042833, 'F-Score:', 0.603047313552526))
results_to_save.append(('Filename:', 'abcd', 'Accuracy:', 0.8568113251334416, 'Specificity:', 0.9985740767146728, 'Precision', 0.9744245524296675, 'Recall:', 0.23738317757009345, 'F-Score:', 0.3817635270541082))

with open('./metrics/results.tsv', 'a') as outfile: 
    # Print the column headers
    first = results_to_save[0]
    keys = [first[i].rstrip(':') for i in range(0,len(first),2)]
    outfile.write('\t'.join(keys)+'\n')
    # Print data for each row
    for row in results_to_save:
        values = [row[i] for i in range(1,len(row),2)]
        outfile.write('\t'.join(map(str,values))+'\n')        
    logger.info("SAVED METRICS ")
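The question's stated goal is to compare two such result files, so the following sketch reads a TSV like the one written above back into Python and computes per-metric differences. The function names `load_metrics` and `diff_metrics` are hypothetical, and the sketch uses the standard-library csv module for reading. It assumes the file has a single header row containing a Filename column; because the code above opens the file in append mode, running it more than once would repeat the header, which this reader does not handle.

```python
import csv


def load_metrics(path):
    """Read a TSV file with one header row into {filename: {metric: value}}."""
    table = {}
    with open(path, newline="") as infile:
        reader = csv.DictReader(infile, delimiter="\t")
        for row in reader:
            name = row.pop("Filename")
            # All remaining columns are assumed to hold numeric metrics.
            table[name] = {key: float(value) for key, value in row.items()}
    return table


def diff_metrics(old, new):
    """Return new - old for every metric of every file present in both tables."""
    return {
        name: {
            key: new[name][key] - old[name][key]
            for key in old[name]
            if key in new[name]
        }
        for name in old
        if name in new
    }
```

Calling diff_metrics(load_metrics(first_path), load_metrics(second_path)) gives, for each file name found in both runs, how much each metric changed.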