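While trying to compute some probabilities with a pandas DataFrame, I get KeyError: 'ENABLEchain', and searching this site for that error returns zero results.
The situation: I load several .csv files and compute a probability for each row in each file. The .csv files look like this: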
"sequence","support"
"<{10052Regainstart},{10053Regainready}>",0.994708994708995
"<{10125Programstopped},{10052Regainstart},{10053Regainready}>",0.994708994708995
"<{10125Programstopped},{10052Regainstart}>",0.994708994708995
My code is as follows:
import glob
import pandas as pd

# collect the input files with glob, in sorted order
files = sorted(glob.glob("/Users/.../threshold_value_*.csv"))
# loop over all the files, calculate the probabilities
for f in files:
    rules = pd.read_csv(f, usecols=[0, 1])
    # strip the braces and angle brackets, keep only the commas
    rules['sequence'] = rules['sequence'].str.replace(r'[{}<>]', '', regex=True)
    # count the number of events in the pattern
    rules.insert(1, 'length', rules['sequence'].str.count(',') + 1)
    # calculate the Markov chain probability of each pattern
    probability = rules['sequence'].str.split(',').apply(lambda x: get_prob(*x))

rules
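To show what the cleaning step produces, here is a minimal sketch that runs it on two of the sample rows above. It reads them from an in-memory string instead of the real files; the names `sample` and `events` are introduced only for this illustration.

```python
import io
import pandas as pd

# Two sample rows from the question, read from memory instead of from disk
sample = io.StringIO(
    '"sequence","support"\n'
    '"<{10052Regainstart},{10053Regainready}>",0.994708994708995\n'
    '"<{10125Programstopped},{10052Regainstart},{10053Regainready}>",0.994708994708995\n'
)

rules = pd.read_csv(sample, usecols=[0, 1])
# strip braces and angle brackets so only comma-separated event labels remain
rules["sequence"] = rules["sequence"].str.replace(r"[{}<>]", "", regex=True)
# number of events in each pattern
rules.insert(1, "length", rules["sequence"].str.count(",") + 1)
# list of event labels per row; these are the arguments passed to get_prob
events = rules["sequence"].str.split(",")
print(events.tolist())
```

Each row ends up as a list of event labels, and every one of those labels is later used as a row or column key in the probability DataFrame.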
The function get_prob is as follows:
def get_prob(*args):
    ret = 1
    # multiply the transition probabilities of consecutive event pairs
    for i, j in zip(args, args[1:]):
        ret *= probs.at[i, j]
    return ret
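It computes the probability of a row in the pattern .csv files by multiplying the transition probabilities of consecutive events. The probability DataFrame (probs) looks like this, so with *args and .at[i, j] we can compute the probability of every row in the pattern .csv files: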
         event23  event24  event34
event23     0.34     0.88     0.54
event24     0.87     0.45     0.33
event34     0.98     0.23     0.34
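As a worked example, here is a minimal sketch that rebuilds the small table above and calls get_prob on it. The label `ENABLEchain` is deliberately absent from this toy table; whether the same thing happens with my real data is an assumption, not something I have confirmed.

```python
import pandas as pd

labels = ["event23", "event24", "event34"]
# toy transition table copied from the example above
probs = pd.DataFrame(
    [[0.34, 0.88, 0.54],
     [0.87, 0.45, 0.33],
     [0.98, 0.23, 0.34]],
    index=labels,
    columns=labels,
)

def get_prob(*args):
    ret = 1
    # multiply the transition probabilities of consecutive event pairs
    for i, j in zip(args, args[1:]):
        ret *= probs.at[i, j]
    return ret

# event23 -> event24 -> event34: probs.at["event23", "event24"] * probs.at["event24", "event34"]
p = get_prob("event23", "event24", "event34")
print(p)

# A label that is not in the table's index/columns makes .at raise KeyError
try:
    get_prob("event23", "ENABLEchain")
    missing_label_raises = False
except KeyError:
    missing_label_raises = True
print(missing_label_raises)
```

In this toy setup the lookup succeeds only when every event label in a pattern exists as both a row and a column label of probs.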
The full traceback is as follows:
KeyError Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2656 try:
-> 2657 return self._engine.get_loc(key)
2658 except KeyError:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'ENABLEchain'
During handling of the above exception, another exception occurred:
KeyError Traceback (most recent call last)
<ipython-input-59-c914483e45d6> in <module>
12 rules['sequence'] = rules['sequence'].str.replace('{|}|<|>','') #replace some characters
13 rules.insert(1, 'length', rules['sequence'].str.count(',') + 1) # count number of events in the pattern
---> 14 probability = rules['sequence'].str.split(',').apply(lambda x: get_prob(*x)) #calculate markov chain prob
15
16 rules
~/anaconda3/lib/python3.7/site-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
3589 else:
3590 values = self.astype(object).values
-> 3591 mapped = lib.map_infer(values, f, convert=convert_dtype)
3592
3593 if len(mapped) and isinstance(mapped[0], Series):
pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-59-c914483e45d6> in <lambda>(x)
12 rules['sequence'] = rules['sequence'].str.replace('{|}|<|>','') #replace some characters
13 rules.insert(1, 'length', rules['sequence'].str.count(',') + 1) # count number of events in the pattern
---> 14 probability = rules['sequence'].str.split(',').apply(lambda x: get_prob(*x)) #calculate markov chain prob
15
16 rules
<ipython-input-57-94f2040b6b35> in get_prob(*args)
2 ret = 1
3 for i, j in zip(args, args[1:]):
----> 4 ret *= probs.at[i,j]
5
6 return ret
~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
2268
2269 key = self._convert_key(key)
-> 2270 return self.obj._get_value(*key, takeable=self._takeable)
2271
2272 def __setitem__(self, key, value):
~/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py in _get_value(self, index, col, takeable)
2765 return com.maybe_box_datetimelike(series._values[index])
2766
-> 2767 series = self._get_item_cache(col)
2768 engine = self.index._engine
2769
~/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
3059 res = cache.get(item)
3060 if res is None:
-> 3061 values = self._data.get(item)
3062 res = self._box_item_values(item, values)
3063 cache[item] = res
~/anaconda3/lib/python3.7/site-packages/pandas/core/internals/managers.py in get(self, item, fastpath)
939
940 if not isna(item):
--> 941 loc = self.items.get_loc(item)
942 else:
943 indexer = np.arange(len(self.items))[isna(self.items)]
~/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
2657 return self._engine.get_loc(key)
2658 except KeyError:
-> 2659 return self._engine.get_loc(self._maybe_cast_indexer(key))
2660 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
2661 if indexer.ndim > 1 or indexer.size > 1:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'ENABLEchain'
I have never run into this error before, and Google / Stack Overflow give me zero results. Does anyone know more?
Thanks!