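I am trying to refactor some code that used to be very manual and involved setting an index for every new DataFrame I created, in order to produce essentially this output: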
   f1           precision    recall
A  0.600315956  0.72243346   0.513513514
B  0.096692112  0.826086957  0.051351351
C  0.085642317  0.62962963   0.045945946
D  0.108641975  0.628571429  0.059459459
This is my code so far:
summaryDF = pd.DataFrame().set_index(['A','B','C','D'])

def evaluation(trueLabels, evalLabels):
    precision = precision_score(trueLabels, evalLabels)
    recall = precision_score(trueLabels, evalLabels)
    f1 = precision_score(trueLabels, evalLabels)
    accuracy = accuracy_score(trueLabels, evalLabels)
    data = {'precision': precision,
            'recall': recall,
            'f1': f1}
    DF = pd.DataFrame(data)
    summaryDF.concat(DF,ignore_index=True)

results = [y_randpred,y_cat_random_to_binary,y_cat_random_to_binary_threshold,y_closed_random_to_binary]

for result in results:
    evaluation(y_true_claim, result)
This is my error traceback:
Traceback (most recent call last):
File "/Users/dhruv/Documents/bla/bla/src/main/bla.py", line 419, in <module>
summaryDF = pd.DataFrame().set_index(['A','B','C','D'])
File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 2607, in set_index
level = frame[col].values
File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 1797, in __getitem__
return self._getitem_column(key)
File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/frame.py", line 1804, in _getitem_column
return self._get_item_cache(key)
File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/generic.py", line 1084, in _get_item_cache
values = self._data.get(item)
File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/internals.py", line 2851, in get
loc = self.items.get_loc(item)
File "/Users/dhruv/anaconda/lib/python2.7/site-packages/pandas/core/index.py", line 1572, in get_loc
return self._engine.get_loc(_values_from_object(key))
File "pandas/index.pyx", line 134, in pandas.index.IndexEngine.get_loc (pandas/index.c:3824)
File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)
File "pandas/hashtable.pyx", line 686, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12280)
File "pandas/hashtable.pyx", line 694, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12231)
KeyError: 'A'
Any idea what I am doing wrong?
Answer 0 (score: 0)
I solved my problem.
Using this answer, my code became:
import pandas as pd
from sklearn.metrics import precision_score, recall_score, f1_score

# Start with an empty frame that only defines the columns.
summaryDF = pd.DataFrame(columns=('precision','recall','f1'))

def evaluation(trueLabels, evalLabels):
    global summaryDF
    precision = precision_score(trueLabels, evalLabels)
    recall = recall_score(trueLabels, evalLabels)
    f1 = f1_score(trueLabels, evalLabels)
    # Each value is wrapped in a list so that pandas builds a one-row frame.
    data = {'precision': [precision],
            'recall': [recall],
            'f1': [f1]
            }
    DF = pd.DataFrame(data)
    summaryDF = pd.concat([summaryDF,DF])

results = [y_randpred,
           y_cat_random_to_binary,
           y_cat_random_to_binary_threshold,
           y_closed_random_to_binary,
           y_closedCat_random_to_binary_threshold]

for result in results:
    evaluation(y_true_claim, result)

# Label the rows only after all results have been appended.
summaryDF.index = ['A', 'B', 'C', 'D', 'E']
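The key points were that I needed to wrap the precision, recall, and F1 values in square brackets, and that I had to set the index by assigning to summaryDF.index rather than by calling the set_index method.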
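To make those two points concrete, here is a minimal sketch that reproduces both failures and their fixes in isolation. The metric values are illustrative numbers rounded from the table in the question, not recomputed results.

```python
import pandas as pd

# Pitfall 1: a dict of bare scalars gives pandas no way to infer a row count,
# so the constructor raises ValueError.
try:
    pd.DataFrame({'precision': 0.72, 'recall': 0.51, 'f1': 0.60})
except ValueError as err:
    scalar_error = str(err)

# Wrapping each value in a list produces a one-row frame instead.
row = pd.DataFrame({'precision': [0.72], 'recall': [0.51], 'f1': [0.60]})

# Pitfall 2: set_index treats its arguments as names of existing columns.
# An empty frame has no column called 'A', hence the KeyError.
try:
    pd.DataFrame().set_index(['A', 'B', 'C', 'D'])
except KeyError as err:
    key_error = err

# Assigning to .index relabels the rows directly; the length must match.
row.index = ['A']
print(row)
```

Assigning to `.index` works here because the list has exactly as many labels as the frame has rows, which is why it has to happen after all rows are in place.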
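So I append the rows first and set the index afterwards, instead of trying to set it when I create the DataFrame I append to: a newly created DataFrame already has a default index of its own, and set_index can only build an index from columns that exist.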
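As a possible alternative (not what the answer above does), the rows can be collected in a plain list and the DataFrame built once at the end with the index passed to the constructor, which avoids both the global statement and the repeated concat. This is only a sketch: the toy labels and the `predictions` dict are hypothetical, and the hand-written metric helper stands in for the scikit-learn functions so that the example runs on its own.

```python
import pandas as pd

def evaluation(true_labels, eval_labels):
    """Return F1, precision and recall for binary 0/1 labels as a dict.

    Hand-rolled stand-in for sklearn's metric functions (Python 3 division).
    """
    pairs = list(zip(true_labels, eval_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {'f1': f1, 'precision': precision, 'recall': recall}

# Hypothetical toy data: one set of true labels, one prediction list per row label.
y_true = [1, 0, 1, 1, 0, 1]
predictions = {
    'A': [1, 0, 1, 0, 0, 1],
    'B': [0, 0, 1, 0, 1, 0],
}

# Collect one dict per result, then build the frame once with its index.
rows = [evaluation(y_true, pred) for pred in predictions.values()]
summary = pd.DataFrame(rows, index=list(predictions),
                       columns=['f1', 'precision', 'recall'])
print(summary)
```

Keeping the row labels next to the predictions in one dict also means the index cannot drift out of step with the list of results when a new entry is added.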