Question

I find DataFrame.plot.hist very convenient, but I can't find a solution for this case.

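I want to plot the distributions of many columns in a dataset. The problem is that pandas keeps the same scale on all x-axes, which makes most of the plots useless. Here is the code I'm using: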

X.plot.hist(subplots=True, layout=(13, 6), figsize=(20, 45), bins=50, sharey=False, sharex=False)
plt.show()

Here is part of the result:

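The problem seems to be that pandas uses the same bins for all columns, regardless of their values. Is there a convenient solution in pandas, or am I forced to do it by hand?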
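To illustrate why shared bins hurt, here is a minimal NumPy-only sketch. It does not call pandas internals; it just assumes the bin edges are computed once over the pooled values of two hypothetical columns (`small` and `large`, both invented for this example) and counts how many bins the narrow column actually occupies.

```python
import numpy as np

rng = np.random.default_rng(0)
small = rng.random(1000)          # hypothetical column, values in [0, 1)
large = 1000 * rng.random(1000)   # hypothetical column, values in [0, 1000)

# Edges computed once over the pooled data, i.e. shared by both columns.
shared_edges = np.histogram_bin_edges(np.concatenate([small, large]), bins=50)
counts_shared, _ = np.histogram(small, bins=shared_edges)

# Edges computed from the narrow column alone.
counts_own, _ = np.histogram(small, bins=50)

# Number of non-empty bins in each case.
occupied_shared = int(np.count_nonzero(counts_shared))
occupied_own = int(np.count_nonzero(counts_own))
print(occupied_shared, occupied_own)
```

With shared edges each bin is about 20 units wide, so the whole `small` column collapses into a single bar, which matches the unreadable subplots described above.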

I standardized the data (zero mean and unit variance); the result improved slightly, but it is still not acceptable.

Answer 1

There are a couple of options; here is the code and the output:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Dummy data - value ranges differ a lot between columns
X = pd.DataFrame()
for i in range(18):
    X['COL0{0}'.format(i+38)]=(2**i)*np.random.random(1000)

# Method 1 - just using the hist function to generate each plot
X.hist(layout=(3, 6), figsize=(20, 10), sharey=False, sharex=False, bins=50)
plt.suptitle('Method 1')  # figure-level title; plt.title would only retitle the last subplot
plt.show()

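# Method 2 - generate each plot separately
# plt.cm.spectral was removed in Matplotlib 2.2; nipy_spectral is the same colormap
cols = plt.cm.nipy_spectral(np.arange(1, 255, 13))
fig, axes = plt.subplots(3, 6, figsize=(20, 10))
for index, column in enumerate(X.columns):
    ax = axes.flatten()[index]
    # each call to ax.hist computes bins from this column's own range
    ax.hist(X[column], bins=50, label=column, fc=cols[index])
    ax.legend(loc='upper right')
    # leave some headroom above the bars for the legend
    ax.set_ylim((0, 1.2*ax.get_ylim()[1]))
fig.suptitle('Method 2')
plt.show()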

First plot:

Second plot:

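I would definitely recommend the second method, because it gives you much more control over the individual plots: for example, you can change the axis ticks, labels, grid parameters, and almost anything else.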
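As a rough sketch of the per-plot control mentioned here, the snippet below customizes a single Axes. The data, label text, and tick positions are made up for illustration; only standard Matplotlib Axes methods are used.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
values = 2 ** 10 * rng.random(1000)   # hypothetical column with a wide range

fig, ax = plt.subplots(figsize=(5, 3))
ax.hist(values, bins=50)
ax.set_xlabel('value')                       # per-axes labels
ax.set_ylabel('count')
ax.set_xticks([0, 256, 512, 768, 1024])      # explicit tick positions
ax.grid(True, linestyle=':', alpha=0.5)      # per-axes grid style
ax.tick_params(axis='x', labelrotation=45)   # rotate the x tick labels
# plt.show()  # uncomment in an interactive session
```

Inside the Method 2 loop the same calls can be applied to each `ax`, optionally with different settings per column.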

I couldn't find anything that would let you make the original plot.hist accept individually computed bins.
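One possible workaround, sketched below under stated assumptions, is to compute the bin edges per column with `np.histogram_bin_edges` and pass them to `Axes.hist` yourself. The helper name `hist_per_column`, its parameters, and the demo frame are hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def hist_per_column(df, bins=50, ncols=6, figsize=(20, 10)):
    """Draw one histogram per column, each with edges from its own range."""
    nrows = int(np.ceil(len(df.columns) / ncols))
    fig, axes = plt.subplots(nrows, ncols, figsize=figsize, squeeze=False)
    flat_axes = axes.flatten()
    edges_by_column = {}
    for ax, column in zip(flat_axes, df.columns):
        values = df[column].dropna().to_numpy()
        # Edges are derived from this column only, not from the whole frame.
        edges = np.histogram_bin_edges(values, bins=bins)
        edges_by_column[column] = edges
        ax.hist(values, bins=edges)
        ax.set_title(column)
    # Hide any unused cells in the subplot grid.
    for ax in flat_axes[len(df.columns):]:
        ax.set_visible(False)
    return fig, edges_by_column


# Hypothetical demo data with very different ranges per column.
rng = np.random.default_rng(0)
demo = pd.DataFrame({'small': rng.random(1000),
                     'large': 1000 * rng.random(1000)})
fig, edges = hist_per_column(demo, bins=20, ncols=2, figsize=(8, 3))
# plt.show()  # uncomment in an interactive session
```

Returning the edges alongside the figure makes it easy to inspect or reuse them, for instance to swap in a different binning rule for a single column.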

I hope this helps!

How to use different axis scales in pandas' DataFrame.plot.hist?
