Question

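Data

Below is the DataFrame I'd like to represent as a histogram, with each row as one point. This won't be interesting, since it will give me three bins of the same size. That's OK for now, so please read on!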

>>> outer_df
  patient                         cell  product
0   Pat_1               22RV1_PROSTATE       12
1   Pat_1               DU145_PROSTATE       15
2   Pat_1  LN18_CENTRAL_NERVOUS_SYSTEM        9
3   Pat_2               22RV1_PROSTATE       12
4   Pat_2               DU145_PROSTATE       15
5   Pat_2  LN18_CENTRAL_NERVOUS_SYSTEM        9
6   Pat_3               22RV1_PROSTATE       12
7   Pat_3               DU145_PROSTATE       15
8   Pat_3  LN18_CENTRAL_NERVOUS_SYSTEM        9
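For anyone who wants to reproduce the problem, the frame above can be rebuilt roughly as follows. This is only a sketch: the column names and values are copied from the printout, and the way the lists are constructed is my own choice, not necessarily how the original frame was created.

```python
import pandas as pd

# Cell lines and product values, in the order shown in the printout.
cells = ["22RV1_PROSTATE", "DU145_PROSTATE", "LN18_CENTRAL_NERVOUS_SYSTEM"]
products = [12, 15, 9]

# Three patients, each measured against the same three cell lines.
outer_df = pd.DataFrame(
    {
        "patient": [f"Pat_{i}" for i in range(1, 4) for _ in cells],
        "cell": cells * 3,
        "product": products * 3,
    }
)
print(outer_df)
```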

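Desired result

Plot every row as one point on a histogram, but also be able to pick out a specific set of data (for example, all points from all cells are purple, those belonging to DU145_PROSTATE are red, and those belonging to 22RV1_PROSTATE are blue) and plot it as a stacked histogram. I illustrate this with an image from the pandas docs (image not preserved here).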

Attempt 1

I first tried using the hist method on the DataFrame, but got an error, along with a blank 4x4 series of histograms.

>>> outer_df.hist()
Traceback (most recent call last):
  File "/usr/lib/python3.3/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/tools/plotting.py", line 1977, in hist_frame
    ax.hist(data[col].dropna().values, **kwds)
  File "/usr/lib/python3/dist-packages/matplotlib/axes.py", line 8099, in hist
    xmin = min(xmin, xi.min())
TypeError: unorderable types: str() < float()

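Attempt 2

Realizing that DataFrame.hist() "draws histograms of columns on multiple subplots", I moved away from it and tried outer_df.plot(kind='hist', stacked=True). Even though I took this directly from the docs, I still get this error: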

>>> outer_df.plot(kind='hist', stacked=True)
Traceback (most recent call last):
  File "/usr/lib/python3.3/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/tools/plotting.py", line 1612, in plot_frame
    raise ValueError('Invalid chart type given %s' % kind)
ValueError: Invalid chart type given hist

Attempt 3 - suggested by @ 816

>>> outer_df.set_index(['patient', 'cell']).unstack('cell').plot(kind='hist', stacked=True)
Traceback (most recent call last):
  File "/usr/lib/python3.3/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/tools/plotting.py", line 1612, in plot_frame
    raise ValueError('Invalid chart type given %s' % kind)
ValueError: Invalid chart type given hist

Answer 1

How about:

outer_df.set_index(['patient', 'cell']).unstack('cell').plot(kind='hist', stacked=True)
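To make the reshaping step concrete, here is a minimal sketch of what the one-liner does, assuming a pandas version whose plot method accepts kind='hist' (as far as I can tell that was added in 0.15; the "Invalid chart type given hist" error in the question suggests an older install). One deliberate difference from the line above: I select the product column before unstacking so the resulting columns are plain cell names rather than a MultiIndex. The variable name wide is my own.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for an interactive window
import pandas as pd

cells = ["22RV1_PROSTATE", "DU145_PROSTATE", "LN18_CENTRAL_NERVOUS_SYSTEM"]
outer_df = pd.DataFrame(
    {
        "patient": [f"Pat_{i}" for i in range(1, 4) for _ in cells],
        "cell": cells * 3,
        "product": [12, 15, 9] * 3,
    }
)

# Pivot to one row per patient and one column per cell line.
wide = outer_df.set_index(["patient", "cell"])["product"].unstack("cell")
print(wide)

# Each column becomes one layer of the stacked histogram.
ax = wide.plot(kind="hist", stacked=True)
ax.set_xlabel("product")
```

If the plot call still raises the ValueError, checking pd.__version__ is probably the first thing to do.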

Answer 2

How about using the groupby method:

hist_data = {cell: outer_df.loc[inds, 'product'] for cell, inds in outer_df.groupby('cell').groups.items()}

Each value in the dict is a Series corresponding to one cell group. Next, iterate over the cell groups, plotting a histogram each time:

import matplotlib.pyplot as plt

for cell in hist_data:
    hist_data[cell].hist(label=cell)
plt.legend()  # needed for the legend to show
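If the goal is a single stacked histogram with specific colours per cell line, the same grouping can be handed to matplotlib in one call. This is only a sketch under my own assumptions: the colour mapping follows my reading of the question (red for DU145_PROSTATE, blue for 22RV1_PROSTATE, purple for everything else), and the bin edges 8 to 17 are picked just to cover the sample values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

cells = ["22RV1_PROSTATE", "DU145_PROSTATE", "LN18_CENTRAL_NERVOUS_SYSTEM"]
outer_df = pd.DataFrame(
    {
        "patient": [f"Pat_{i}" for i in range(1, 4) for _ in cells],
        "cell": cells * 3,
        "product": [12, 15, 9] * 3,
    }
)

# Highlighted cell lines; any other cell line falls back to purple.
colors = {"DU145_PROSTATE": "red", "22RV1_PROSTATE": "blue"}

# One array of product values per cell line.
groups = {cell: g["product"].to_numpy() for cell, g in outer_df.groupby("cell")}
labels = list(groups)

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(
    [groups[c] for c in labels],
    bins=list(range(8, 18)),  # unit-width bins covering the sample values
    stacked=True,
    color=[colors.get(c, "purple") for c in labels],
    label=labels,
)
ax.set_xlabel("product")
ax.legend()
```

Passing all groups in a single hist call keeps the bins aligned across groups, which separate per-Series hist calls do not guarantee.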
