Question

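I am trying to produce some plots and statistics for our data. The code below fails when the dates at the start and end of the range have zero values. For example, if 02/10/17 has a value of zero and 03/10/17 has an actual value, the resample does not include the 02/10/17 date. This affects the averages, and I would like those dates to show up on the plot.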

import matplotlib.pyplot as plt
from matplotlib import ticker

Gamesdf = df[df['Product Group'] == 'Video Games']

fig = plt.figure()
ax2 = fig.add_subplot(111)
ax22 = ax2.twinx()

s1 = Gamesdf.resample('D', on='Created').size()
s2 = Gamesdf.groupby('Created')['Machine Count'].first()
s = s1/s2

s1.plot(kind='bar', ax=ax2, position=0, label='Sales Total', width=0.25)
s.plot(kind='bar', ax=ax22, color='red', position=1, label='Adjusted For Machine Count', width=0.25)

ax2.set_ylim(0,80)
ax22.set_ylim(0,80)
plt.legend(loc='upper left')

ticklabels = s.index.strftime('%Y-%m-%d')
ax2.xaxis.set_major_formatter(ticker.FixedFormatter(ticklabels))

plt.show()

print('Average Deposited Per Day:', s1.mean())
print('Average Deposited Per Day Per Machine:', s.mean())

How can I take the date range found in my original DataFrame, df, and force it onto the s1 and s2 series?

[![enter image description here][1]][1]

This example should run from 03/10/17 to 30/10/17.
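
To make the goal concrete, here is a minimal sketch of one possible approach: build an explicit daily index with `pd.date_range` from the full `df` and `reindex` both series against it. The toy data, the `full_range` name, and the forward/backward fill used for `Machine Count` are assumptions for illustration, not part of the original data or a confirmed solution.

```python
import pandas as pd

# Hypothetical toy data: 'Video Games' has no rows on the first or last day.
df = pd.DataFrame({
    'Created': pd.to_datetime([
        '2017-10-03', '2017-10-04', '2017-10-04',
        '2017-10-06', '2017-10-08', '2017-10-09',
    ]),
    'Product Group': ['DVDs & Blu-rays', 'Video Games', 'Video Games',
                      'Video Games', 'Video Games', 'Music'],
    'Machine Count': [3, 3, 3, 3, 6, 6],
})
Gamesdf = df[df['Product Group'] == 'Video Games']

# Daily index covering the whole of df, not just the filtered rows.
full_range = pd.date_range(df['Created'].min().normalize(),
                           df['Created'].max().normalize(), freq='D')

# Daily sales count; days without rows become 0 instead of being dropped.
s1 = Gamesdf.resample('D', on='Created').size().reindex(full_range, fill_value=0)

# Machine count per day. Carrying the nearest known value across days with
# no sales (ffill, then bfill) is an assumption about the intended meaning.
s2 = (Gamesdf.groupby('Created')['Machine Count'].first()
      .reindex(full_range).ffill().bfill())

s = s1 / s2

print('Average Deposited Per Day:', s1.mean())
print('Average Deposited Per Day Per Machine:', s.mean())
```

With the index fixed this way, the zero-sales days at both ends of the range count toward the means and get their own bars when plotted.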

Sample data:

df.sample(25).sort_values('Created')

      Created     Product Group    Machine Count
51    2017-10-09  Wireless         3
191   2017-10-14  DVDs & Blu-rays  3
87    2017-10-14  DVDs & Blu-rays  3
74    2017-10-14  DVDs & Blu-rays  3
152   2017-10-14  DVDs & Blu-rays  3
243   2017-10-17  DVDs & Blu-rays  3
255   2017-10-17  DVDs & Blu-rays  3
334   2017-10-18  DVDs & Blu-rays  6
419   2017-10-21  DVDs & Blu-rays  11
478   2017-10-21  Video Games      11
499   2017-10-21  Video Games      11
502   2017-10-21  Music            11
371   2017-10-21  DVDs & Blu-rays  11
610   2017-10-22  Video Games      11
675   2017-10-23  DVDs & Blu-rays  11
766   2017-10-23  Video Games      11
738   2017-10-23  DVDs & Blu-rays  11
866   2017-10-25  DVDs & Blu-rays  11
871   2017-10-25  DVDs & Blu-rays  11
806   2017-10-25  DVDs & Blu-rays  11
907   2017-10-25  Video Games      11
993   2017-10-26  DVDs & Blu-rays  11
938   2017-10-26  DVDs & Blu-rays  11
997   2017-10-26  Music            11
1115  2017-10-29  DVDs & Blu-rays  11

Date range lost when resampling (zero values)

0 answers: