How do I calculate a moving average of readings from data using Python 3.7 (Spyder)?

Asked: 2019-02-06 07:57:37

Tags: python regression analysis

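I have three years of monthly sales data. I need to run a regression analysis, which requires computing the moving average (MA) and the centered moving average (CMA). I have already been able to plot the sales values. Now I need to plot the MA and CMA and store those values for further analysis. Here is what I have so far.

I tried taking averages, but I could not compute the MA and CMA and store them.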

import matplotlib.pyplot as plt

def make_chart_simple_line_chart(plt):
    # all labels are strings so matplotlib treats the x-axis as categorical
    period = ['201601', '201602', '201603', '201604', '201605', '201606', '201607', '201608', '201609', '201610', '201611', '201612', '201701', '201702', '201703', '201704', '201705', '201706', '201707', '201708', '201709', '201710', '201711', '201712', '201801', '201802', '201803', '201804', '201805', '201806', '201807', '201808', '201809', '201810', '201811', '201812']
    sale = [9478, 9594, 14068, 9692, 9718, 14144, 9294, 10072, 14254, 10508, 11224, 17640, 11300, 11656, 17360, 11342, 12300, 17334, 11296, 12452, 16886, 11878, 13482, 19260, 13932, 13600, 20122, 13134, 14564, 19354, 13104, 13562, 17350, 12486, 12570, 17716]

    # create a line chart, period on x-axis, sale on y-axis
    plt.plot(period, sale, color='green', marker='o', linestyle='solid')

    # add a title
    plt.title("Sales Chart")

    # add a label to the y-axis
    plt.ylabel("number of contracts sold")
    plt.show()

if __name__ == "__main__":

    make_chart_simple_line_chart(plt)

I want to use the available data to predict the sales values for 2019.

2 Answers:

Answer 0: (score: 2)

Moving average:

df['column'].rolling(window=n).mean()

EMA:

df['column'].ewm(span=n, min_periods=n - 1).mean()
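To make this concrete for the question's data, here is a minimal sketch that computes a trailing MA, a centered MA, and an EMA, and keeps them as DataFrame columns. The 12-month window and the column names `ma`, `cma`, and `ema` are my own assumptions, not something the question specifies. Because 12 is even, the centered value is taken as the average of two adjacent 12-month means (a "2 x 12" average) so that it lines up with an actual month.

```python
import pandas as pd

sale = [9478, 9594, 14068, 9692, 9718, 14144, 9294, 10072, 14254, 10508,
        11224, 17640, 11300, 11656, 17360, 11342, 12300, 17334, 11296,
        12452, 16886, 11878, 13482, 19260, 13932, 13600, 20122, 13134,
        14564, 19354, 13104, 13562, 17350, 12486, 12570, 17716]

# Monthly index starting at January 2016, matching the question's periods
period = pd.period_range("2016-01", periods=len(sale), freq="M")
df = pd.DataFrame({"sale": sale}, index=period)

n = 12  # window length in months (an assumption; choose what suits the data)

# Trailing moving average: mean of the current month and the n - 1 before it
df["ma"] = df["sale"].rolling(window=n).mean()

# Centered moving average for an even window: average two adjacent
# n-month means, then shift the result back so it sits on the middle month
df["cma"] = df["ma"].rolling(window=2).mean().shift(-(n // 2))

# Exponentially weighted moving average with the same span
df["ema"] = df["sale"].ewm(span=n, min_periods=n - 1).mean()
```

The first and last few rows of `ma` and `cma` are NaN because the window is incomplete there. The stored values can be reused later, for example with `df["cma"].dropna()`, or plotted with `df[["sale", "ma", "cma"]].plot()`.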

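Answer 1: (score: 0)

Your data appears to contain two separate sales trend lines. Here is the code I used to convert the date format to a month count; for clarity, sales are shown in thousands: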

import matplotlib.pyplot as plt

period = [201601.0, 201602.0, 201603.0, 201604.0, 201605.0, 201606.0, 201607.0, 201608.0, 201609.0, 201610.0, 201611.0, 201612.0, 201701.0, 201702.0, 201703.0, 201704.0, 201705.0, 201706.0, 201707.0, 201708.0, 201709.0, 201710.0, 201711.0, 201712.0, 201801.0, 201802.0, 201803.0, 201804.0, 201805.0, 201806.0, 201807.0, 201808.0, 201809.0, 201810.0, 201811.0, 201812.0]
sale = [9478.0, 9594.0, 14068.0, 9692.0, 9718.0, 14144.0, 9294.0, 10072.0, 14254.0, 10508.0, 11224.0, 17640.0, 11300.0, 11656.0, 17360.0, 11342.0, 12300.0, 17334.0, 11296.0, 12452.0, 16886.0, 11878.0, 13482.0, 19260.0, 13932.0, 13600.0, 20122.0, 13134.0, 14564.0, 19354.0, 13104.0, 13562.0, 17350.0, 12486.0, 12570.0, 17716.0]

months = []
sales = []

for p, s in zip(period, sale):
    # split e.g. 201703 into year 2017 and month 3
    year, month_of_year = divmod(int(p), 100)
    # months counted from the start of 2016 (January 2016 -> 1)
    months.append((year - 2016) * 12 + month_of_year)
    sales.append(s / 1000.0)  # sales in thousands

plt.plot(months, sales, 'D')  # plot the values scaled to thousands
plt.xlabel('Month')
plt.ylabel('Sales (thousands)')
plt.show()
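If the two trend lines are treated separately, a rough sketch of fitting and extrapolating them could look like the following. It rests on my reading of the data that the upper line consists of every third month (March, June, September, December); that split is an assumption and should be checked against the plot. The straight-line fit with `numpy.polyfit` and the names `is_quarter_end`, `future`, and `forecast` are also illustrative choices, and a linear extrapolation is only a crude starting point for a 2019 forecast.

```python
import numpy as np

sale = [9478, 9594, 14068, 9692, 9718, 14144, 9294, 10072, 14254, 10508,
        11224, 17640, 11300, 11656, 17360, 11342, 12300, 17334, 11296,
        12452, 16886, 11878, 13482, 19260, 13932, 13600, 20122, 13134,
        14564, 19354, 13104, 13562, 17350, 12486, 12570, 17716]

months = np.arange(1, len(sale) + 1)   # 1 = January 2016, 36 = December 2018
sales = np.array(sale) / 1000.0        # sales in thousands

# Assumption: the upper trend line is every third month of the series
is_quarter_end = months % 3 == 0

# Fit one straight line (degree-1 polynomial) to each group
slope_hi, intercept_hi = np.polyfit(months[is_quarter_end],
                                    sales[is_quarter_end], 1)
slope_lo, intercept_lo = np.polyfit(months[~is_quarter_end],
                                    sales[~is_quarter_end], 1)

# Extrapolate both lines to 2019 (months 37..48) and pick the one
# that matches each month's group
future = np.arange(37, 49)
forecast = np.where(future % 3 == 0,
                    slope_hi * future + intercept_hi,
                    slope_lo * future + intercept_lo)
```

`forecast` then holds twelve values in thousands, one per month of 2019, which can be plotted next to the historical points with `plt.plot(future, forecast, 'x')`.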