Matplotlib histogram with non-numerical data

Time: 2019-07-02 10:45:42

Tags: python matplotlib histogram

I can't plot a histogram with non-numerical data in Matplotlib.

A = na,R,O,na,na,O,R ...

A is a data frame that takes 3 different values: na, R, O

I tried:

plt.hist(A, bins=3, color='#37777D')

I would expect a result like this.

It works for numerical data, but for non-numerical data I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-44-60369a6f9af4> in <module>
      1 A = dataset2.iloc[:, 2 - 1].head(30)
----> 2 plt.hist(A, bins=3, histtype='bar', color='#37777D')

C:\Anaconda\lib\site-packages\matplotlib\pyplot.py in hist(x, bins, range, density, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, normed, data, **kwargs)
   2657         align=align, orientation=orientation, rwidth=rwidth, log=log,
   2658         color=color, label=label, stacked=stacked, normed=normed,
-> 2659         **({"data": data} if data is not None else {}), **kwargs)
   2660 
   2661 

C:\Anaconda\lib\site-packages\matplotlib\__init__.py in inner(ax, data, *args, **kwargs)
   1808                         "the Matplotlib list!)" % (label_namer, func.__name__),
   1809                         RuntimeWarning, stacklevel=2)
-> 1810             return func(ax, *args, **kwargs)
   1811 
   1812         inner.__doc__ = _add_data_doc(inner.__doc__,

C:\Anaconda\lib\site-packages\matplotlib\axes\_axes.py in hist(self, x, bins, range, density, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, normed, **kwargs)
   6563                     "color kwarg must have one color per data set. %d data "
   6564                     "sets and %d colors were provided" % (nx, len(color)))
-> 6565                 raise ValueError(error_message)
   6566 
   6567         # If bins are not specified either explicitly or via range,

ValueError: color kwarg must have one color per data set. 30 data sets and 1 colors were provided

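2 answers:

Answer 0 (score: 1)

I think you need a bar chart rather than a histogram. Also, it is not clear what your values are. Assuming they are strings (judging by the figure), you first need to count their frequencies with the Counter class from the collections module. You can then plot the frequencies and assign the key names as tick labels.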

from collections import Counter
from matplotlib import pyplot as plt

A = ['na', 'R', 'O', 'na', 'na', 'R']

freqs = Counter(A)

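# One bar position per distinct value
xvals = range(len(freqs))
# Convert the dict views to lists before passing them to Matplotlib
plt.bar(xvals, list(freqs.values()), color='#37777D')
plt.xticks(xvals, list(freqs.keys()))
plt.show()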
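
Counter orders its keys by first appearance and omits values that never occur. If the bars should follow a fixed order and also show empty categories, the counting step can be sketched as below. This is only an illustration: the helper names count_in_order and plot_category_counts, and the display order in categories, are assumptions and not part of the original answer.

```python
from collections import Counter


def count_in_order(values, categories):
    """Return one count per entry of `categories`, in that order."""
    freqs = Counter(values)
    # Counter returns 0 for missing keys, so absent categories get a zero count.
    return [freqs[c] for c in categories]


def plot_category_counts(values, categories, color="#37777D"):
    """Draw a bar chart with one bar per category (requires Matplotlib)."""
    # Imported lazily so that the counting helper works without Matplotlib.
    from matplotlib import pyplot as plt

    counts = count_in_order(values, categories)
    positions = list(range(len(categories)))
    plt.bar(positions, counts, color=color)
    plt.xticks(positions, categories)
    plt.show()


A = ["na", "R", "O", "na", "na", "O", "R"]
categories = ["na", "R", "O"]  # display order chosen for illustration

counts = count_in_order(A, categories)
print(counts)

# plot_category_counts(A, categories)  # uncomment to draw the chart
```

Keeping the counting separate from the plotting makes it easy to check the numbers before drawing anything.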


Answer 1 (score: 0)

This is not reproducible. If we create a data frame and run the following code, the plot is produced without the error:

import numpy as np; np.random.seed(42)
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.choice(["na", "O", "A"], size=10))

plt.hist(df.values, histtype='bar', bins=3)

plt.show()


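That said, this is not the best choice in any case, because a histogram is by definition meant for continuous data. Instead, one can create a bar chart of the counts.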

import numpy as np; np.random.seed(42)
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame(np.random.choice(["na", "O", "A"], size=10))

counts = df[0].value_counts()
plt.bar(counts.index, counts.values)

plt.show()
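
To make the distinction concrete, here is a simplified sketch of what binning means for numeric data, next to plain counting for labels. It is not Matplotlib's implementation; the function equal_width_bin_counts and the sample values are assumptions made up for illustration.

```python
from collections import Counter


def equal_width_bin_counts(values, bins):
    """Count numeric values in `bins` equal-width intervals from min to max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    if width == 0:
        # All values are identical: put everything in the first bin (a simplification).
        counts[0] = len(values)
        return counts
    for v in values:
        # The last bin is closed on the right, so the maximum lands in it.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Numeric data: bin edges come from arithmetic on the values themselves.
numeric = [1.0, 1.5, 2.0, 3.5, 4.0]
numeric_counts = equal_width_bin_counts(numeric, bins=3)
print(numeric_counts)

# Labels: there is no min, max, or width to compute, only occurrences to count.
labels = ["na", "R", "O", "na", "na", "O", "R"]
label_counts = Counter(labels)
print(label_counts)
```

Binning relies on subtraction and division, which have no meaning for strings such as "na" or "R". That is why counting occurrences and drawing a bar chart is the natural fit for this data.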
