Combining a time-series neural network with a feed-forward neural network

Time: 2019-01-17 22:08:38

Tags: python tensorflow keras neural-network deep-learning

Consider the following example problem:

# dummy data for a SO question
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
from keras.models import Model
from keras.layers import Input, Conv1D, Dense
from keras.optimizers import Adam, SGD

time = np.array(range(100))
brk = np.array((time>40) & (time < 60)).reshape(100,1)
B = np.array([5, -5]).reshape(1,2)
np.dot(brk, B)
y = np.c_[np.sin(time), np.sin(time)] + np.random.normal(scale = .2, size=(100,2))+ np.dot(brk, B)



plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])


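You have N time series. Each has one component that follows a process common to all of them, and another component that is idiosyncratic to the series itself. For simplicity, assume you know a priori that the bump lies between 40 and 60, and that you want to model it jointly with the sinusoidal component.

A TCN does a good job on the common component, but it cannot capture the series-specific component: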

# time series model
n_filters = 10
filter_width = 3
dilation_rates = [2**i for i in range(7)] 
inp = Input(shape=(None, 1))
x = inp
for dilation_rate in dilation_rates:
    x = Conv1D(filters=n_filters,
               kernel_size=filter_width, 
               padding='causal',
               activation = "relu",
               dilation_rate=dilation_rate)(x)
x = Dense(1)(x)
model = Model(inputs = inp, outputs = x)
model.compile(optimizer = Adam(), loss='mean_squared_error')
model.summary()

X_train = np.transpose(np.c_[time, time]).reshape(2,100,1)
y_train = np.transpose(y).reshape(2,100,1)

history = model.fit(X_train, y_train,
                batch_size=2,
                epochs=1000,
                verbose = 0)
yhat = model.predict(X_train)
plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])

plt.plot(time, yhat[0,:,:])
plt.plot(time, yhat[1,:,:])


On the other hand, a basic linear regression with N outputs (implemented here in Keras) fits the idiosyncratic component very well:

inp1 = Input((1,))
x1 = inp1
x1 = Dense(2)(x1)
model1 = Model(inputs = inp1, outputs = x1)
model1.compile(optimizer = Adam(), loss='mean_squared_error')
model1.summary()

brk_train = brk
y_train = y
history = model1.fit(brk_train, y_train,
                batch_size=100,
                epochs=6000, verbose = 0)
yhat1 = model1.predict(brk_train)
plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])
plt.plot(time, yhat1[:,0])
plt.plot(time, yhat1[:,1])


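I'd like to use Keras to jointly estimate the time-series component and the idiosyncratic component. The main problem is that feed-forward networks (of which linear regression is a special case) take inputs of shape batch_size x dims, while time-series networks take inputs of shape batch_size x time_steps x dims.

Because I want to estimate the idiosyncratic part of the model (the linear regression part) jointly with the time-series part, I will only ever batch-sample whole time series. That is why I specified batch_size = time_steps for model 1.

But in the static model, what I'm really doing is modeling the data as time_steps x dims.

I tried to recast the feed-forward model as a time-series model, without success. Here is the approach that does not work:

inp3 = Input(shape = (None, 1))
x3 = inp3
x3 = Dense(2)(x3)
model3 = Model(inputs = inp3, outputs = x3)
model3.compile(optimizer = Adam(), loss='mean_squared_error')
model3.summary()

brk_train = brk.reshape(1, 100, 1)
y_train = np.transpose(y).reshape(2,100,1)
history = model3.fit(brk_train, y_train,
                batch_size=1,
                epochs=1000, verbose = 1)

ValueError: Error when checking target: expected dense_40 to have shape (None, 2) but got array with shape (100, 1)

I'm trying to fit the same model as model1, but with a different shape, so that it is compatible with the TCN model and, importantly, so that it has the same batching structure.

In this example, the output should ultimately have shape (2, 100, 1). Basically, I want the model to carry out the following algorithm:

  • Ingest X of shape (N, time_steps, dims).
  • Drop the first dimension, because the design matrix is identical for every series, yielding X1 of shape (time_steps, dims).
  • Forward step: np.dot(X1, W), where W has dimension (dims, N), yielding X2 of dimension (time_steps, N).
  • Reshape X2 to (N, time_steps, 1). It can then be added to the output of the other part of the model.
  • Backward step: since this is just a linear model, the gradient of the output with respect to W is simply X1.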
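The steps listed above can be sketched in plain NumPy to make the shapes concrete. This is only an illustration under stated assumptions, not a Keras layer: the values in W are taken from B in the question's data-generating code rather than learned, and the final "reshape" is done as a transpose plus a new trailing axis.

```python
import numpy as np

N, time_steps, dims = 2, 100, 1

time = np.arange(time_steps)
brk = ((time > 40) & (time < 60)).astype(float).reshape(time_steps, dims)

# X: the same design matrix repeated for every series, shape (N, time_steps, dims)
X = np.repeat(brk[np.newaxis, :, :], N, axis=0)

# Drop the first dimension: every slice is identical, so one copy is enough
X1 = X[0]                          # (time_steps, dims)

# One coefficient column per series (assumed values, copied from B in the question)
W = np.array([[5.0, -5.0]])        # (dims, N)

# Forward step
X2 = X1 @ W                        # (time_steps, N)

# Move the series axis to the front, then add a trailing feature axis.
# A plain X2.reshape(N, time_steps, 1) would not do this: it would mix
# values from the two series together.
out = X2.T[:, :, np.newaxis]       # (N, time_steps, 1)

print(out.shape)
```

The resulting array has the same layout as the TCN output, (N, time_steps, 1), so the two could be summed element-wise.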

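How can I accomplish this? Do I need a custom layer?

In case you're curious about the motivation behind all of this, I'm building on some ideas in this paper.

Edit: After posting, I noticed that I was using only the time variable rather than the time series itself. A TCN fit on the lagged series fits the idiosyncratic part of the series just fine (in-sample, anyway). But my basic question still stands: I want to merge the two types of networks.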
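For reference, feeding the TCN the lagged series instead of the time index might look roughly like the sketch below. The one-step lag and the zero fill for the first time step are assumptions made for illustration; the post does not say how the lagged input was built.

```python
import numpy as np

rng = np.random.default_rng(0)

time = np.arange(100)
brk = ((time > 40) & (time < 60)).astype(float).reshape(100, 1)
B = np.array([[5.0, -5.0]])
y = (np.c_[np.sin(time), np.sin(time)]
     + rng.normal(scale=0.2, size=(100, 2))
     + brk @ B)                                   # (100, 2)

# Lag each series by one step; the first step has no history,
# so it is filled with 0 here (an arbitrary choice).
y_lag = np.vstack([np.zeros((1, 2)), y[:-1]])     # (100, 2)

# Same (batch, time_steps, dims) layout as the TCN code in the question
X_train = y_lag.T.reshape(2, 100, 1)
y_train = y.T.reshape(2, 100, 1)
```

With causal padding, the prediction at step t then depends only on values of the series observed before t.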

1 answer:

Answer 0 (score: 1)

So, I solved my own problem. The answer is to create dummy interactions (and thus a really sparse design matrix) and then reshape the data.

###########################
# interaction model
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('seaborn-whitegrid')
from keras.models import Model
from keras.layers import Input, Conv1D, Dense
from keras.optimizers import Adam, SGD
from patsy import dmatrix


def shift5(arr, num, fill_value=np.nan):
    result = np.empty_like(arr)
    if num > 0:
        result[:num] = fill_value
        result[num:] = arr[:-num]
    elif num < 0:
        result[num:] = fill_value
        result[:num] = arr[-num:]
    else:
        result = arr
    return result


time = np.array(range(100))
brk = np.array((time>40) & (time < 60)).reshape(100,1)
B = np.array([5, -5]).reshape(1,2)
np.dot(brk, B)
y = np.c_[np.sin(time), np.sin(time)] + np.random.normal(scale = .2, size=(100,2))+ np.dot(brk, B)

plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])

# define interaction model
inp = Input(shape=(None, 2))
x = inp
x = Dense(1)(x)
model = Model(inputs = inp, outputs = x)
model.compile(optimizer = Adam(), loss='mean_squared_error')
model.summary()

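import pandas as pd

# Long-format frame: one row per (series, time step); fips is the series id
df = pd.DataFrame(data = {"fips": np.concatenate((np.zeros(100), np.ones(100))),
                          "brk": np.concatenate((brk.reshape(100), brk.reshape(100)))})
df.brk = df.brk.astype(int)
# Interact brk with the series id (no intercept): one brk column per series
tm = np.asarray(dmatrix("brk:C(fips)-1", data = df))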

brkint = np.concatenate(( \
                tm[:100,:].reshape(1,100,2),
                tm[100:200,:].reshape(1,100,2)
                ), axis = 0)

y_train = np.transpose(y).reshape(2,100,1)

history = model.fit(brkint, y_train,
                batch_size=2,
                epochs=1000,
                verbose = 1)

yhat = model.predict(brkint)
plt.clf()
plt.plot(time, y[:,0])
plt.plot(time, y[:,1])

plt.plot(time, yhat[0,:,:])
plt.plot(time, yhat[1,:,:])


The output shape is the same as for the TCN, and can simply be added element-wise.
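To see why the interaction trick reproduces a separate coefficient per series, here is a NumPy-only sketch. It builds an equivalent design tensor with np.einsum instead of patsy, uses hypothetical "fitted" weights (5 and -5, the true values from the data-generating code), and uses a plain sine wave as a stand-in for the TCN output. None of these stand-ins come from the answer itself.

```python
import numpy as np

N, time_steps = 2, 100
time = np.arange(time_steps)
brk = ((time > 40) & (time < 60)).astype(float)    # (time_steps,)

# brkint[i, t, j] = brk[t] if i == j else 0:
# series i only "sees" its own copy of the brk column.
brkint = np.einsum('ij,t->itj', np.eye(N), brk)    # (N, time_steps, N)

# What Dense(1) computes at every time step, without the bias:
# one weight per interaction column (hypothetical fitted values).
w = np.array([[5.0], [-5.0]])                      # (N, 1)
idio = brkint @ w                                  # (N, time_steps, 1)

# Stand-in for the TCN output: the shared sinusoidal component
common = np.broadcast_to(np.sin(time)[None, :, None], (N, time_steps, 1))

# Same shape on both sides, so the parts add element-wise
combined = common + idio
print(combined.shape)
```

In Keras, the same addition could presumably be expressed by passing the two branch outputs to an Add layer, so that both parts are trained jointly under one loss.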
