Using Counter() in Python to build a histogram?

Time: 2013-10-05 14:36:46

Tags: python histogram

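I saw in another question that I can use Counter() to count the number of occurrences in a set of strings. So if I have ['A','B','A','C','A','A'], I get Counter({'A':4,'B':1,'C':1}). But now, how can I use that information to build a histogram?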
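
For reference, the counting step described in the question can be reproduced with the standard library alone. This is a minimal sketch; the variable names `letters` and `counts` are illustrative and do not come from the original post.

```python
from collections import Counter

letters = ['A', 'B', 'A', 'C', 'A', 'A']

# Counter is a dict subclass that maps each item to its number of occurrences
counts = Counter(letters)

print(counts)                 # Counter({'A': 4, 'B': 1, 'C': 1})
print(counts['A'])            # 4
print(counts.most_common(1))  # [('A', 4)]
```

Because a Counter behaves like a dictionary, its keys can serve as bar labels and its values as bar heights.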

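3 answers:

Answer 0 (score: 57)

For your data it is probably better to use a bar chart instead of a histogram, since the values are categorical labels rather than numbers to be binned. Check out this code: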

from collections import Counter
import numpy as np
import matplotlib.pyplot as plt


labels, values = zip(*Counter(['A','B','A','C','A','A']).items())

indexes = np.arange(len(labels))
width = 1

plt.bar(indexes, values, width)
plt.xticks(indexes, labels)  # bars are centered on indexes by default (matplotlib >= 2.0)
plt.show()

Result: [image: bar chart of the counts]
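
The `zip(*...)` line in the code above is the step that turns the Counter into plottable data. The sketch below isolates it using only the standard library; the sorted variant at the end is an optional addition, not part of the original answer.

```python
from collections import Counter

counts = Counter(['A', 'B', 'A', 'C', 'A', 'A'])

# items() yields (label, count) pairs; zip(*...) transposes them
# into one tuple of labels and one tuple of counts.
labels, values = zip(*counts.items())

print(labels)  # ('A', 'B', 'C')
print(values)  # (4, 1, 1)

# Optional: sort the pairs first so the bars appear in label order
# instead of first-seen order.
sorted_labels, sorted_values = zip(*sorted(counts.items()))
```

Note that the unpacking raises a ValueError if the Counter is empty, so an empty input list should probably be handled before plotting.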

Answer 1 (score: 14)

You can write some really concise code to do this using pandas:

In [24]: import numpy as np

In [25]: from pandas import Series

In [27]: sample = np.random.choice(['a', 'b'], size=10)

In [28]: s = Series(sample)

In [29]: s
Out[29]:
0    a
1    b
2    b
3    b
4    a
5    b
6    b
7    b
8    b
9    a
dtype: object

In [30]: vc = s.value_counts()

In [31]: vc
Out[31]:
b    7
a    3
dtype: int64

In [32]: vc = vc.sort_index()

In [33]: vc
Out[33]:
a    3
b    7
dtype: int64

In [34]: vc.plot(kind='bar')

Resulting in: [image: bar chart of the value counts]
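
For comparison with the Counter-based approach from the question, the two pandas steps above have rough standard-library counterparts. This is a sketch under the assumption that `sample` holds the ten values printed in Out[29]; it does not plot anything.

```python
from collections import Counter

# The ten values shown in Out[29] (hard-coded here instead of drawn at random)
sample = ['a', 'b', 'b', 'b', 'a', 'b', 'b', 'b', 'b', 'a']

counts = Counter(sample)

# Roughly what value_counts() returns: highest count first
by_count = counts.most_common()

# Roughly what sort_index() returns: ordered by label
by_label = sorted(counts.items())

print(by_count)  # [('b', 7), ('a', 3)]
print(by_label)  # [('a', 3), ('b', 7)]
```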

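Answer 2 (score: 0)

Based on Igonato's answer, I created a helper module "plot" with a class "Plot".

It has two functions, hist() and barchart(). barchart() shows Igonato's approach, while hist() uses the matplotlib hist function directly, which might be what the question originally intended.

The approach allows adding a title and labels with a given font size, and displays the y-axis with a tick frequency of 1. You can also change the mode to save the chart under the given title. For convenience there are close and debug options.

Python unit test test_Plot.py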

'''
Created on 2020-07-05

@author: wf
'''
import unittest


from ptp.plot import Plot

class TestPlot(unittest.TestCase):


    def setUp(self):
        pass


    def tearDown(self):
        pass


    def testPlot(self):
        ''' test a plot based on a Counter '''
        valueList=['A','B','A','C','A','A']
        plot=Plot(valueList,"barchart example",xlabel="Char",ylabel="frequency")
        plot.barchart(mode='save')
        plot.title="histogram example"
        plot.debug=True
        plot.hist(mode='save')


if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']
    unittest.main()

Result: [images: "barchart example" and "histogram example"]

Debug output:

   value  list:  ['A', 'B', 'A', 'C', 'A', 'A']
counter  items:  dict_items([('A', 4), ('B', 1), ('C', 1)])
counter values:  dict_values([4, 1, 1])
counter   keys:  dict_keys(['A', 'B', 'C'])

plot.py

'''
Created on 2020-07-05

@author: wf
'''
import matplotlib.pyplot as plt
from collections import Counter
import numpy as np
import os

class Plot(object):
    '''
    create Plot based on counters
    see https://stackoverflow.com/questions/19198920/using-counter-in-python-to-build-histogram
    '''
    def __init__(self, valueList,title,xlabel=None,ylabel=None,fontsize=12,plotdir=None,debug=False):
        '''
        Constructor
        '''
        self.counter=Counter(valueList)
        self.valueList=valueList
        self.title=title
        self.xlabel=xlabel
        self.ylabel=ylabel
        self.fontsize=fontsize
        self.debug=debug
        path=os.path.dirname(__file__)
        if plotdir is not None:
            self.plotdir=plotdir
        else:
            self.plotdir=path+"/../plots/"
            os.makedirs(self.plotdir,exist_ok=True)
            
    def titleMe(self):        
        plt.title(self.title, fontsize=self.fontsize)
        if self.xlabel is not None:
            plt.xlabel(self.xlabel)
        if self.ylabel is not None:    
            plt.ylabel(self.ylabel)
            
    def showMe(self,mode='show',close=True):
        ''' show me in the given mode '''
        if mode=="show":
            plt.show() 
        else:
            # os.path.join works whether or not plotdir ends with a separator
            plt.savefig(os.path.join(self.plotdir, self.title+".jpg"))
        if close:    
            plt.close()    
            
    def barchart(self,mode='show'):
        ''' barchart based histogram for the given counter '''
        labels, values = zip(*self.counter.items())
        indexes = np.arange(len(labels))
        width = 1
        self.titleMe()
        plt.bar(indexes, values, width)
        plt.xticks(indexes, labels)  # bars are centered on indexes by default (matplotlib >= 2.0)
        plt.yticks(np.arange(1,max(values)+1,step=1))
        self.showMe(mode)
        
    def showDebug(self):    
        print("   value  list: ",self.valueList)
        print("counter  items: ",self.counter.items())
        print("counter values: ",self.counter.values())
        print("counter   keys: ",self.counter.keys())
        
    def hist(self,mode="show"):
        ''' create histogram for the given counter '''
        if self.debug:
            self.showDebug()
        self.titleMe()
        # see https://stackoverflow.com/a/2162045/1497139
        plt.hist(self.valueList,bins=len(self.counter.keys()))
        self.showMe(mode)