Plotly not showing all data

Time: 2018-12-22 16:58:18

Tags: python pandas matplotlib plotly

When I plot the same data with matplotlib and with plotly, I get different results. Plotly does not show the whole data range.

import plotly.plotly as py
import plotly.graph_objs as go

# filter the data
df3 = df[df.line_item_returned==0][['created_at', 'line_item_price']].copy()
# remove the time part from datetime
df3.created_at = df3.created_at.dt.floor('d')
# set the datetime column as index
df3 = df3.set_index('created_at')

# Create traces
trace0 = go.Scatter(
    x = df3.index,
    y = df3.line_item_price.resample('d').sum().rolling(90, center=True).mean(),
    mode = 'markers',
    name = 'markers'
)
data = [trace0]
py.iplot(data, filename='scatter-mode')

The chart shows only the range from October to December 2018.

Plotting the same data with matplotlib shows the whole data range, 2016-2018:

import matplotlib.pyplot as plt
%matplotlib inline

plt.plot(df3.line_item_price.resample('d').sum().rolling(90, center=True).mean())


The index of the resampled series contains the whole 2016-2018 range:

df3.line_item_price.resample('d').sum().rolling(31, center=True).mean().index 

DatetimeIndex(['2015-11-18', '2015-11-19', '2015-11-20', '2015-11-21',
               '2015-11-22', '2015-11-23', '2015-11-24', '2015-11-25',
               '2015-11-26', '2015-11-27',
               ...
               '2018-12-10', '2018-12-11', '2018-12-12', '2018-12-13',
               '2018-12-14', '2018-12-15', '2018-12-16', '2018-12-17',
               '2018-12-18', '2018-12-19'],
              dtype='datetime64[ns]', name='created_at', length=1128, freq='D')

Why is this happening?

1 answer:

Answer 0 (score: 2)

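I think this is an index problem. In the question, `x` is taken from `df3.index` (one entry per original row), while `y` is the resampled daily series, so the two generally differ in length and do not line up point for point.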
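To make the mismatch concrete, here is a minimal sketch on synthetic data. The frame below only stands in for the question's `df3`. The three line items per day, the date range, and the price values are assumptions for illustration, not the asker's real data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the question's df3 (assumed shape: several
# line items per day, so the raw index contains repeated dates).
n_days, items_per_day = 1000, 3
days = pd.date_range("2016-01-01", periods=n_days, freq="D")
raw_index = pd.DatetimeIndex(days.repeat(items_per_day), name="created_at")
rng = np.random.default_rng(0)
df3 = pd.DataFrame(
    {"line_item_price": rng.uniform(5, 50, len(raw_index))},
    index=raw_index,
)

# Same transformation as in the question: daily sum, then rolling mean.
daily = df3.line_item_price.resample("D").sum().rolling(90, center=True).mean()

print(len(df3.index))                 # 3000: one entry per line item
print(len(daily))                     # 1000: one entry per calendar day
print(df3.index.equals(daily.index))  # False
```

matplotlib does not hit this problem in the question because `plt.plot` receives the resampled Series itself and uses that Series' own index for the x axis.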

%matplotlib inline
import plotly.offline as py
import plotly.graph_objs as go
import pandas as pd
import numpy as np

N = 2000
df = pd.DataFrame({"value":np.random.randn(N)},
                  index=pd.date_range(start='2015-01-01', periods=N))
# you don't really need to use `plt`
df.resample('d').sum().rolling(90, center=True).mean().plot();


But if you want to use plotly, you should use the index of the resampled Series.

df_rsmpl = df.resample('d').sum().rolling(90, center=True).mean()

trace0 = go.Scatter(x = df_rsmpl.index,
                    y = df_rsmpl["value"])
data = [trace0]
py.iplot(data)
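One way to keep `x` and `y` from drifting apart is to compute the smoothed series once and read both from it. The helper below is a hypothetical sketch, not part of pandas or plotly: the name `rolling_daily_xy` and the `dropna()` step are assumptions. With the default `min_periods`, a centered 90-day rolling mean leaves `NaN` at both ends of the range, and dropping those values is optional.

```python
import numpy as np
import pandas as pd


def rolling_daily_xy(series, window=90):
    """Hypothetical helper: return matching x and y for a smoothed daily sum."""
    smoothed = series.resample("D").sum().rolling(window, center=True).mean()
    # The edges of a centered rolling window are NaN; drop them (optional).
    smoothed = smoothed.dropna()
    # Both values come from the same Series, so they are aligned.
    return smoothed.index, smoothed.values


# Synthetic daily prices, for illustration only.
idx = pd.date_range("2016-01-01", periods=1000, freq="D")
prices = pd.Series(np.random.default_rng(0).uniform(5, 50, len(idx)), index=idx)

x, y = rolling_daily_xy(prices)
print(len(x) == len(y))  # True
```

The resulting `x` and `y` can be passed to `go.Scatter(x=x, y=y)` exactly as in the answer's code above.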
