I am trying to load data of the form shown below into a DataFrame.
popSize: 1000
numSurvivors: 0
tournamentSize: 10
probMutation: 0.1
probCrossover: 0.9
numIters: 100
Accuracy: 96.84
Error Rate: 3.16
Not Classified: 0.00
Total time: 5.367

popSize: 1000
numSurvivors: 0
tournamentSize: 10
probMutation: 0.1
probCrossover: 0.9
numIters: 100
Accuracy: 96.84
Error Rate: 3.16
Not Classified: 0.00
Total time: 4.472

popSize: 1000
numSurvivors: 0
tournamentSize: 10
probMutation: 0.1
probCrossover: 0.9
numIters: 100
Accuracy: 92.11
Error Rate: 7.89
Not Classified: 0.00
Total time: 4.46
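The data represents multiple executions of an algorithm. Is there a way to load it as a single row, using the average of the last four values (Accuracy, Error Rate, Not Classified, Total time)?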
Answer 0 (score: 1)
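Here is one way to use itertools.groupby() and pandas to get the data into a DataFrame. It assumes the runs in test.txt are separated by blank lines: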
from itertools import groupby
import pandas as pd

# Split the file into chunks of consecutive non-blank lines (one chunk per run)
with open('test.txt', 'r') as f:
    chunks = [list(group) for k, group in groupby(f.readlines(), lambda x: x == '\n') if not k]

# Turn each chunk of "key: value" lines into a dict, then build the frame
chunks = [dict(i.strip().split(': ') for i in chunk) for chunk in chunks]
df = pd.DataFrame(chunks).astype(float)
Returns:
   Accuracy  Error Rate  Not Classified  Total time  numIters  numSurvivors  popSize  \
0     96.84        3.16            0.00       5.367       100             0     1000
1     96.84        3.16            0.00       4.472       100             0     1000
2     92.11        7.89            0.00        4.46       100             0     1000

   probCrossover  probMutation  tournamentSize
0            0.9           0.1              10
1            0.9           0.1              10
2            0.9           0.1              10
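The groupby() approach above depends on blank lines separating the runs. If the file has no separators, one alternative is to start a new record whenever a key repeats. The following is only a sketch of that idea: parse_runs is a hypothetical helper name, the sample input is a shortened stand-in for the real file, and it assumes every value is numeric.

```python
import io


def parse_runs(lines):
    """Group 'key: value' lines into one dict per run.

    A new run starts whenever a key repeats, so blank separator
    lines are optional. (Hypothetical helper, not part of pandas.)
    """
    runs, current = [], {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank separator lines if present
        key, _, value = line.partition(': ')
        if key in current:  # repeated key: the previous run is complete
            runs.append(current)
            current = {}
        current[key] = float(value)  # assumes all values are numeric
    if current:
        runs.append(current)
    return runs


# Shortened two-run sample in the question's format (illustrative only)
sample = io.StringIO(
    "popSize: 1000\nAccuracy: 96.84\nTotal time: 5.367\n"
    "popSize: 1000\nAccuracy: 92.11\nTotal time: 4.46\n"
)
runs = parse_runs(sample)
print(runs)
```

Because the result is a list of dicts, it can be passed straight to pd.DataFrame(...), for example pd.DataFrame(parse_runs(open('test.txt'))).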
You can then compute the averages like this:
df[['Accuracy','Error Rate','Not Classified','Total time']].mean()
Returns:
Accuracy          95.263333
Error Rate         4.736667
Not Classified     0.000000
Total time         4.766333
dtype: float64
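To get the single row the question asks for, with the parameters kept and the four result columns averaged, one option is to group by the parameter columns. This is a sketch, not the only way to do it: the frame is rebuilt inline from the three runs in the question so the snippet runs on its own, and the names metrics, params, and summary are illustrative.

```python
import pandas as pd

# The three runs from the question, rebuilt inline so the example is self-contained
df = pd.DataFrame({
    'popSize': [1000.0, 1000.0, 1000.0],
    'numSurvivors': [0.0, 0.0, 0.0],
    'tournamentSize': [10.0, 10.0, 10.0],
    'probMutation': [0.1, 0.1, 0.1],
    'probCrossover': [0.9, 0.9, 0.9],
    'numIters': [100.0, 100.0, 100.0],
    'Accuracy': [96.84, 96.84, 92.11],
    'Error Rate': [3.16, 3.16, 7.89],
    'Not Classified': [0.0, 0.0, 0.0],
    'Total time': [5.367, 4.472, 4.46],
})

metrics = ['Accuracy', 'Error Rate', 'Not Classified', 'Total time']
params = [c for c in df.columns if c not in metrics]

# Each distinct parameter configuration becomes one row,
# with the metric columns averaged over its runs
summary = df.groupby(params, as_index=False)[metrics].mean()
print(summary)
```

Since all three runs share the same parameters, summary has exactly one row. If the file later contains runs with different settings, the same code yields one row per configuration.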
Answer 1 (score: 0)
(Round(case when ret2 <> 0 or originalretail <> 0
            then case when ret2 > 0 then (ret2 - retone) / ret2
                      when originalretail > 0 then (originalretail - retone) / originalretail
                      else null end
       end, 2)) * 100 as [Savings %]