I have a simple Pandas DataFrame. The columns delta, start_hour and end_hour are all numpy.int64:
type(df.delta[0])
->numpy.int64
Whenever I try to make a scatter plot with the Pandas plotting methods, I get "IndexError: indices are out-of-bounds". For example:
sc2 = df.plot.scatter(x=df.delta, y=df.start_hour)
produces
IndexError
Traceback (most recent call last)
<ipython-input-118-4d521c29b97f> in <module>()
----> 1 sc2 = df.plot.scatter(x=df.delta, y=df.start_hour)
...
/mnt/xarfuse/uid-116535/[edit]/pandas/core/indexing.pyc in maybe_convert_indices(indices, n)
IndexError: indices are out-of-bounds
I have also tried explicitly converting to NumPy arrays, as described in this post:
df_x = np.array(df['delta'].tolist())
df_y = np.array(df['start_hour'].tolist())
sc1 = df.plot.scatter(x=df_x, y=df_y)
which produces the same error.
I'm sure I'm missing something really simple. Help appreciated!
Answer 0 (score: 4)
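When you pass df['delta'] as x, pandas uses it as a column key, as if you had written df[df['delta']], which fails with "key error: not in index" because those values are not column names. So simply pass the column names to the scatter method as the x and y values, i.e.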
sc2 = df.plot.scatter(x='delta', y='start_hour')
Example:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'delta':[162,9,9,38,691,58],'start_hour':[1,5,11,1,7,6],'last_hour':[3,5,11,2,19,7]})
sc2 = df.plot.scatter(x='delta', y='start_hour')
plt.show()
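To make the difference concrete, here is a minimal sketch that tries both call styles on the same sample data. The "Agg" backend and the broad `except Exception` are assumptions made here so the snippet runs without a display and on different pandas versions. The exact exception type may vary (IndexError in the question, a key error in this answer).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "delta": [162, 9, 9, 38, 691, 58],
    "start_hour": [1, 5, 11, 1, 7, 6],
})

# Passing Series objects: pandas tries to use their values as column labels,
# so the lookup fails. The exception type depends on the pandas version.
try:
    df.plot.scatter(x=df["delta"], y=df["start_hour"])
    series_call_failed = False
except Exception as exc:
    series_call_failed = True
    print("Series arguments failed with", type(exc).__name__)

# Passing column labels: pandas looks up the columns itself.
ax = df.plot.scatter(x="delta", y="start_hour")
print(ax.get_xlabel(), ax.get_ylabel())

plt.close("all")
```

The label-based call also names the axes after the columns, which the array-based approach does not do for you.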
If you want to pass NumPy arrays, then don't have pandas look them up in df; use plt.scatter directly.
For example:
df_x = np.array(df['delta'].tolist())
df_y = np.array(df['start_hour'].tolist())
plt.scatter(x=df_x, y=df_y)
plt.show()
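As a variation on the snippet above, the round trip through `.tolist()` is not strictly needed. The sketch below assumes a pandas version that provides `Series.to_numpy()` and uses the "Agg" backend so it runs without a display. The axis labels are set by hand because matplotlib does not know the column names.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "delta": [162, 9, 9, 38, 691, 58],
    "start_hour": [1, 5, 11, 1, 7, 6],
})

# Extract plain NumPy arrays from the columns.
x = df["delta"].to_numpy()
y = df["start_hour"].to_numpy()

# matplotlib's scatter takes the data itself, not column labels.
fig, ax = plt.subplots()
points = ax.scatter(x, y)
ax.set_xlabel("delta")
ax.set_ylabel("start_hour")

# The plotted points, one (x, y) row per DataFrame row.
offsets = points.get_offsets()
print(offsets.shape)

plt.close(fig)
```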
Hope that helps.