How to skip specific dates with auto_arima

Time: 2019-12-16 17:22:34

Tags: pandas dataframe arima

I am trying to forecast with auto_arima while excluding specific dates.

Code:

import pandas as pd
from pmdarima.arima import auto_arima

df = pd.read_csv('devices_transactions_count.csv')

def remove_holidays(date, transactions):
    # Return None (stored as NaN) for holiday dates, otherwise keep the value
    if date in ['2019-06-03', '2019-06-04', '2019-06-05', '2019-06-06', '2019-06-07', '2019-06-08', '2019-06-09',
                       '2019-08-08', '2019-08-09', '2019-08-10', '2019-08-11', '2019-08-12', '2019-08-13', '2019-08-14',
                       '2019-08-15', '2019-08-16']:
        return None
    else:
        return transactions
df['transactions'] = df.index.map(lambda i: remove_holidays(df.date.iloc[i], df.transactions.iloc[i]))
df.head()

train = df[df.date < '2019-09-20']
train.to_csv('train.csv')
train = pd.read_csv('train.csv')
del train['Unnamed: 0']
train.head()

train['transactions'] = train['transactions'].astype('float32')
train['date'].replace(regex=True, inplace=True, to_replace='M', value='')
train['date'] = pd.to_datetime(train['date'], format='%Y%m', errors='ignore', infer_datetime_format=True)
train = train.set_index(['date'])


decomposition = auto_arima(train.transactions, start_p=1, start_q=1,
                           max_p=3, max_q=3, m=12,
                           start_P=0, seasonal=True,
                           d=1, D=1, trace=True,
                           error_action='ignore',  
                           suppress_warnings=True, 
                           stepwise=True)

This raises the following error: ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

1 answer:

Answer 0: (score: 1)

I would rewrite your cleaning function as a list lookup:

skip_days = ['2019-06-03', '2019-06-04', '2019-06-05', '2019-06-06', '2019-06-07', '2019-06-08', '2019-06-09',
             '2019-08-08', '2019-08-09', '2019-08-10', '2019-08-11', '2019-08-12', '2019-08-13', '2019-08-14',
             '2019-08-15', '2019-08-16']

# Exclude these days
df_filtered = df[~df['date'].isin(skip_days)]

This removes those rows from the DataFrame, so your dataset is free of NaN/null values.
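
To make the difference concrete, here is a minimal sketch comparing the two approaches on a tiny made-up DataFrame. The sample rows and the shortened skip_days list are assumptions for illustration only and are not taken from devices_transactions_count.csv. Blanking out the values leaves NaN in the column, which is what auto_arima rejects, while filtering with isin drops the rows altogether.

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV (illustration only)
df = pd.DataFrame({
    "date": ["2019-06-02", "2019-06-03", "2019-06-04", "2019-06-10"],
    "transactions": [120, 15, 9, 131],
})

# Shortened, hypothetical list of dates to skip
skip_days = ["2019-06-03", "2019-06-04"]

# Question's approach (equivalent effect): blank out the values, leaving NaN behind
masked = df["transactions"].where(~df["date"].isin(skip_days))

# Answer's approach: drop the matching rows entirely
df_filtered = df[~df["date"].isin(skip_days)]

print(masked.isna().sum())                       # 2 NaN values remain
print(df_filtered["transactions"].isna().sum())  # 0 NaN values remain
```

Note that dropping rows leaves gaps in the date sequence, so the filtered series is no longer evenly spaced in calendar time; whether that is acceptable depends on your forecasting setup.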