Question

I want to use Counter values inside a pandas DataFrame.

What I have tried so far:

from __future__ import unicode_literals
import spacy,en_core_web_sm
from collections import Counter
import pandas as pd
nlp = en_core_web_sm.load()
c = Counter(([token.pos_ for token in nlp('The cat sat on the mat.')]))
sbase = sum(c.values())
for el, cnt in c.items():
    el, '{0:2.2f}%'.format((100.0* cnt)/sbase)
df = pd.DataFrame.from_dict(c, orient='index').reset_index()
print df

Current output:

   index  0
0   NOUN  2
1   VERB  1
2    DET  2
3    ADP  1
4  PUNCT  1

Expected output:

The following, inside the DataFrame:

(u'NOUN', u'28.57%')
(u'VERB', u'14.29%')
(u'DET', u'28.57%')
(u'ADP', u'14.29%')
(u'PUNCT', u'14.29%')

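How can I use el and cnt inside the DataFrame?

This is a follow-up to an earlier question, in which I wanted to list the POS distribution as percentages: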

Percentage Count Verb, Noun using Spacy?

I know I need to pass the el and cnt pairs in place of c in the line below:

df = pd.DataFrame.from_dict(c, orient='index').reset_index()

Answer 1

Since I don't have your original data, I can only work from the output you posted:

(df['0']/df['0'].sum()).map("{0:.2%}".format)
Out[827]: 
0    28.57%
1    14.29%
2    28.57%
3    14.29%
4    14.29%
Name: 0, dtype: object
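For completeness, here is a minimal sketch of the same idea starting from the Counter itself, so the percentages end up as a column in the DataFrame. It assumes the tag counts shown in the question and hard-codes the tags so that it runs without spaCy; the column names pos, count and percent are illustrative choices, not part of the original code. Note that when the DataFrame is built with from_dict as in the question, the count column is likely labelled with the integer 0 rather than the string '0', which is why the sketch renames the columns first.

```python
from collections import Counter

import pandas as pd

# POS tags hard-coded so the sketch runs without spaCy (assumption); in the
# question they come from [token.pos_ for token in nlp('The cat sat on the mat.')].
tags = ["DET", "NOUN", "VERB", "ADP", "DET", "NOUN", "PUNCT"]
c = Counter(tags)

# from_dict(..., orient='index') leaves the count column labelled 0,
# so give both columns explicit (illustrative) names.
df = pd.DataFrame.from_dict(c, orient="index").reset_index()
df.columns = ["pos", "count"]

# Share of each tag in the total, formatted as a percentage string.
df["percent"] = (df["count"] / df["count"].sum()).map("{0:.2%}".format)

print(df)
```

If the numeric share is needed for further calculations, it may be simpler to keep df["count"] / df["count"].sum() as a float column and format it only when displaying.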
