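Below is the data frame I would like to represent as a histogram, with each row as one point. The result will not be interesting, since it will give me three bins of equal size, but that is fine for now, so please read on!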
>>> outer_df
   patient                         cell  product
0    Pat_1               22RV1_PROSTATE       12
1    Pat_1               DU145_PROSTATE       15
2    Pat_1  LN18_CENTRAL_NERVOUS_SYSTEM        9
3    Pat_2               22RV1_PROSTATE       12
4    Pat_2               DU145_PROSTATE       15
5    Pat_2  LN18_CENTRAL_NERVOUS_SYSTEM        9
6    Pat_3               22RV1_PROSTATE       12
7    Pat_3               DU145_PROSTATE       15
8    Pat_3  LN18_CENTRAL_NERVOUS_SYSTEM        9
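I would like to depict each row as a point on a histogram, but also be able to pick out specific sets of data (e.g. all points from all cells in purple, those belonging to DU145_PROSTATE in red, and those belonging to 22RV1_PROSTATE in blue) and plot them as overlaid histograms. I used the pandas docs: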
I first tried the hist method on the DataFrame, but got an error along with a blank 4x4 grid of histograms.
>>> outer_df.hist()
Traceback (most recent call last):
  File "/usr/lib/python3.3/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/tools/plotting.py", line 1977, in hist_frame
    ax.hist(data[col].dropna().values, **kwds)
  File "/usr/lib/python3/dist-packages/matplotlib/axes.py", line 8099, in hist
    xmin = min(xmin, xi.min())
TypeError: unorderable types: str() < float()
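Realizing that DataFrame.hist() "draws histograms of the columns on multiple subplots", I moved away from it and tried outer_df.plot(kind='hist', stacked=True). Even though I took this directly from the documentation, I am still stuck with this error: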
>>> outer_df.plot(kind='hist', stacked=True)
Traceback (most recent call last):
  File "/usr/lib/python3.3/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/tools/plotting.py", line 1612, in plot_frame
    raise ValueError('Invalid chart type given %s' % kind)
ValueError: Invalid chart type given hist
>>> outer_df.set_index(['patient', 'cell']).unstack('cell').plot(kind='hist', stacked=True)
Traceback (most recent call last):
  File "/usr/lib/python3.3/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/pandas/tools/plotting.py", line 1612, in plot_frame
    raise ValueError('Invalid chart type given %s' % kind)
ValueError: Invalid chart type given hist
Answer 0 (score: 0)
How about:
outer_df.set_index(['patient', 'cell']).unstack('cell').plot(kind='hist', stacked=True)
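The question already shows this exact call raising `ValueError: Invalid chart type given hist`, which suggests the installed pandas does not know `kind='hist'` yet; as far as I know that chart type was added in pandas 0.15.0, so the one-liner probably needs an upgrade first. A minimal sketch of the same idea on a newer pandas, with the frame rebuilt from the question (the `wide` name, the explicit `bins`, and the `Agg` backend are my own assumptions, not part of the original answer):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import pandas as pd

# Rebuild the example frame from the question.
cells = ["22RV1_PROSTATE", "DU145_PROSTATE", "LN18_CENTRAL_NERVOUS_SYSTEM"]
outer_df = pd.DataFrame({
    "patient": [p for p in ("Pat_1", "Pat_2", "Pat_3") for _ in cells],
    "cell": cells * 3,
    "product": [12, 15, 9] * 3,
})

# One column per cell line, one row per patient; each column becomes one layer.
wide = outer_df.pivot(index="patient", columns="cell", values="product")

# Explicit bin edges (an assumption) so each integer value gets its own bin.
ax = wide.plot(kind="hist", stacked=True, bins=list(range(8, 18)))
ax.set_xlabel("product")
```

`pivot` gives the same reshaping as `set_index(...).unstack('cell')` here, but with flat column names, so the legend shows just the cell line.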
Answer 1 (score: 0)
How about using the groupby method:
hist_data = {cell: outer_df.loc[inds, 'product'] for cell, inds in outer_df.groupby('cell').groups.items()}
Each value in the dict is a Series corresponding to one cell group. Next, iterate over the cell groups, plotting a histogram for each one:
for cell in hist_data:
    hist_data[cell].hist(label=cell)
#pylab.legend() # need to call this to make sure the legend shows
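For reference, the groupby idea can be put together into a self-contained sketch that also covers the colouring described in the question. This is only one way to do it: the colour mapping, the shared `bins`, the `alpha` values, and the `Agg` backend are assumptions of mine, and it calls matplotlib's `Axes.hist` directly rather than `Series.hist`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Rebuild the example frame from the question.
cells = ["22RV1_PROSTATE", "DU145_PROSTATE", "LN18_CENTRAL_NERVOUS_SYSTEM"]
outer_df = pd.DataFrame({
    "patient": [p for p in ("Pat_1", "Pat_2", "Pat_3") for _ in cells],
    "cell": cells * 3,
    "product": [12, 15, 9] * 3,
})

# Colours for the highlighted cell lines, following the question's description.
colors = {"22RV1_PROSTATE": "blue", "DU145_PROSTATE": "red"}

# Shared bin edges so every layer lines up with the others.
bins = list(range(8, 18))

fig, ax = plt.subplots()
# Background layer: every row, in purple.
ax.hist(outer_df["product"], bins=bins, color="purple", alpha=0.4, label="all cells")
# One overlaid layer per highlighted cell line.
for cell, group in outer_df.groupby("cell"):
    if cell in colors:
        ax.hist(group["product"], bins=bins, color=colors[cell], alpha=0.7, label=cell)
ax.set_xlabel("product")
ax.set_ylabel("count")
ax.legend()
```

Call `fig.savefig(...)` afterwards, or drop the `Agg` line and call `plt.show()`, to actually see the figure. Using the same bin edges for every layer matters here; otherwise each `hist` call picks its own bins and the overlays do not align.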