Why are the histogram data limits autodetected as [nan, nan] instead of the NaN values being discarded?

Date: 2019-05-19 13:14:40

Tags: pandas histogram

The following code raises an error:

print(g['resp'])
par = {'hist': True, 'kde': False, 'fit': scipy.stats.norm, 'bins': 'auto'}
sns.distplot(g['resp'], color='blue', **par)

31     23.0
32     28.0
33     29.0
34     31.0
35     32.0
36     35.0
37     35.0
38     36.0
39     37.0
40     38.0
41     38.0
42     38.0
43     41.0
44     42.0
45     42.0
46     42.0
47     42.0
48     46.0
49     48.0
50     49.0
51     50.0
52     52.0
53     55.0
54     56.0
55     60.0
56     60.0
57    100.0
58      NaN
59      NaN
60      NaN
61      NaN
Name: resp, dtype: float64
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-23-42944bf1e405> in <module>
      1 print(g['resp'])
      2 par = {'hist': True, 'kde': False, 'fit': scipy.stats.norm, 'bins': 'auto'}
----> 3 sns.distplot(g['resp'], color='blue', **par)

C:\ProgramData\Anaconda3\lib\site-packages\seaborn\distributions.py in distplot(a, bins, hist, kde, rug, fit, hist_kws, kde_kws, rug_kws, fit_kws, color, vertical, norm_hist, axlabel, label, ax)
    223         hist_color = hist_kws.pop("color", color)
    224         ax.hist(a, bins, orientation=orientation,
--> 225                 color=hist_color, **hist_kws)
    226         if hist_color != color:
    227             hist_kws["color"] = hist_color

C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\__init__.py in inner(ax, data, *args, **kwargs)
   1808                         "the Matplotlib list!)" % (label_namer, func.__name__),
   1809                         RuntimeWarning, stacklevel=2)
-> 1810             return func(ax, *args, **kwargs)
   1811 
   1812         inner.__doc__ = _add_data_doc(inner.__doc__,

C:\ProgramData\Anaconda3\lib\site-packages\matplotlib\axes\_axes.py in hist(self, x, bins, range, density, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, normed, **kwargs)
   6589             # this will automatically overwrite bins,
   6590             # so that each histogram uses the same bins
-> 6591             m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
   6592             m = m.astype(float)  # causes problems later if it's an int
   6593             if mlast is None:

C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\histograms.py in histogram(a, bins, range, normed, weights, density)
    708     a, weights = _ravel_and_check_weights(a, weights)
    709 
--> 710     bin_edges, uniform_bins = _get_bin_edges(a, bins, range, weights)
    711 
    712     # Histogram is an integer or a float array depending on the weights.

C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\histograms.py in _get_bin_edges(a, bins, range, weights)
    331                             "bins is not supported for weighted data")
    332 
--> 333         first_edge, last_edge = _get_outer_edges(a, range)
    334 
    335         # truncate the range if needed

C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\histograms.py in _get_outer_edges(a, range)
    259         if not (np.isfinite(first_edge) and np.isfinite(last_edge)):
    260             raise ValueError(
--> 261                 "autodetected range of [{}, {}] is not finite".format(first_edge, last_edge))
    262 
    263     # expand empty range to avoid divide by zero

ValueError: autodetected range of [nan, nan] is not finite

The NaN values seem to be causing the trouble. How can I discard them?

1 answer:

Answer 0 (score: 1)

I don't think the NaN values are dropped automatically, so one possible solution is to remove the missing values with Series.dropna:

sns.distplot(g['resp'].dropna(), color='blue', **par)
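
The traceback suggests why this helps: the last frame is in NumPy's _get_outer_edges, which derives the bin range from the data itself, and a range computed over data containing NaN comes out as [nan, nan]. The sketch below reproduces this without seaborn, as a rough illustration only. The Series `resp` is a made-up stand-in for g['resp'], and `try_histogram` is a hypothetical helper, not part of any library.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for g['resp']: a float Series that ends in NaN values.
resp = pd.Series([23.0, 28.0, 35.0, 42.0, 60.0, 100.0, np.nan, np.nan], name="resp")


def try_histogram(values):
    """Return (counts, edges), or the error message if NumPy rejects the data."""
    try:
        return np.histogram(values, bins="auto")
    except ValueError as exc:
        return str(exc)


# With NaN present, the autodetected range is not finite and NumPy raises.
with_nan = try_histogram(resp)

# After dropna() only the finite values remain, so the range is well defined.
without_nan = try_histogram(resp.dropna())

print(with_nan)
print(int(without_nan[0].sum()), "values binned")
```

Calling dropna() on the Series before plotting leaves the original data in g untouched, since it returns a new Series.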