I want to remove categories with a count of 0 after calling pandas value_counts()
My data looks like this:
categories:
Index(['Average', 'Good', 'Poor', 'VeryGood', 'VeryPoor'],
dtype='object')
Output of value counts:
score Frequency
VG 21
G 15
A 63
P 27
VP 0
My result should be:
score Frequency
VG 21
G 15
A 63
P 27
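I want to store this in a DataFrame and plot a bar chart. I don't want VP to appear in the plot because its count is 0, so that category should be dropped.
My code: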
quality_scores=quality.SCORE.value_counts()
quality_scores=pd.Series.to_frame(quality_scores)
quality_scores=quality_scores.rename(columns={'SCORE':
'Frequency'})
quality_scores['Score']=quality_scores.index
quality_scores=quality_scores.reset_index(drop=True)
quality_scores = quality_scores[quality_scores.Frequency != 0]
quality_scores
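Edit, in response to the comments:
When I print the DataFrame, I get the correct result. However, when I check the categories with quality_scores['Score'].cat.categories, I still see the VP category, which should not be there.
Also, I don't want to see the VP category in the plot, but it still shows up on the axis.
Here is the code for the plot: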
plt.figure(figsize=(15,7))
quality_graph=sns.barplot(y=quality_scores["Frequency"],
x=quality_scores["Score"])
quality_graph.set_xlabel('Frequency')
quality_graph.set_title('Score Distribution of Quality Measure:',fontsize=25)
plt.savefig('graphs\\Quality_Measure.png')
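Answer 0 (score: 0)
Keep in mind that case matters: 'SCORE' and 'Score' are different. You created two columns, one called 'SCORE' and the other called 'Score'.
I ran the following code and it works as expected.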
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
grades = ['VG','G','A','P','VP']
counts = [21,15,63,27,0]
d = { 'Score' : grades, 'Frequency': counts }
quality_scores = pd.DataFrame(data = d)
quality_scores=quality_scores.reset_index(drop=True)
quality_scores = quality_scores[quality_scores.Frequency != 0]
plt.figure(figsize=(15,7))
quality_graph=sns.barplot(y=quality_scores['Frequency'], x=quality_scores['Score'])
quality_graph.set_xlabel('Frequency')
quality_graph.set_title('Score Distribution of Quality Measure:',fontsize=25)
plt.savefig('Quality_Measure.png')
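Note that the code above builds 'Score' from plain strings, so it has no categories to begin with. The question uses the .cat accessor, which suggests the asker's column is categorical. Under that assumption, a minimal sketch of the behaviour looks like the following: filtering rows removes the VP row, but the category stays declared until it is dropped explicitly with cat.remove_unused_categories(). The data here is a hypothetical reconstruction, not the asker's actual frame.

```python
import pandas as pd

# Hypothetical reconstruction of the asker's frame: 'Score' is categorical
# and 'VP' is a declared category whose count is 0.
grades = ["VG", "G", "A", "P", "VP"]
quality_scores = pd.DataFrame({
    "Score": pd.Categorical(grades, categories=grades),
    "Frequency": [21, 15, 63, 27, 0],
})

# Filtering rows removes the VP row, but VP is still a declared category.
filtered = quality_scores[quality_scores["Frequency"] != 0].copy()
still_declared = list(filtered["Score"].cat.categories)

# Dropping unused categories removes VP from the dtype as well.
filtered["Score"] = filtered["Score"].cat.remove_unused_categories()
remaining = list(filtered["Score"].cat.categories)
```

Passing the cleaned column to sns.barplot should then leave VP off the axis, since plotting libraries typically take the axis entries from the declared categories rather than from the rows present.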
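Answer 1 (score: 0)
This is because VP is still an attribute of the Series: it remains among its categories. Since pandas 0.23, you can pass observed=True to groupby to drop unobserved categories from the data: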
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html
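A short sketch of the observed=True approach, assuming the SCORE column is categorical with VP declared but never occurring. The sample values are made up for illustration; only the column name SCORE comes from the question.

```python
import pandas as pd

# Hypothetical data: 'VP' is a declared category but never occurs.
scores = pd.Categorical(
    ["VG", "G", "A", "A", "P"],
    categories=["VG", "G", "A", "P", "VP"],
)
quality = pd.DataFrame({"SCORE": scores})

# observed=False keeps every declared category, including empty ones.
all_counts = quality.groupby("SCORE", observed=False).size()

# observed=True keeps only the categories that actually occur in the data.
seen_counts = quality.groupby("SCORE", observed=True).size()

# Turn the result into a two-column frame suitable for plotting.
quality_scores = seen_counts.rename("Frequency").reset_index()
quality_scores = quality_scores.rename(columns={"SCORE": "Score"})
```

Here groupby(...).size() stands in for value_counts(); it produces the same counts per category while letting observed control whether empty categories are reported.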