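I am trying to replicate Chevalier's LSTM Human Activity Recognition algorithm, and I ran into a problem when I realized that my approach does not match the algorithm's. As a follow-up to this question, I am able to generate results for load_X with the following approach:
In [0]: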
import csv
import numpy as np

def load_X(X_signals_paths):
    X_signals = []
    for signal_type_path in X_signals_paths:
        # The with-block closes the file, so no explicit close() is needed
        with open(signal_type_path, 'r') as csvfile:
            reader = csv.reader(csvfile)
            next(reader)  # skip the header row
            # Keep only the second column of each row, one value per row
            for serie in [row[1:2] for row in reader]:
                #X_signals.append([np.array([row[1:2] for row in reader],dtype=np.float32) for row in reader])
                X_signals.append(np.array(serie, dtype=np.int32))
    # Two transposes cancel out, so the result has shape (n_rows, 1)
    return (np.transpose(np.transpose(X_signals), (1, 0)))

X_train_signals_paths = [
    DATASET_PATH + TRAIN + signal + "_train.csv" for signal in INPUT_SIGNAL_TYPES
]
X_test_signals_paths = [
    DATASET_PATH + TEST + signal + "_test.csv" for signal in INPUT_SIGNAL_TYPES
]
X_train = load_X(X_train_signals_paths)
X_test = load_X(X_test_signals_paths)
print(X_train)
Out[0]:
[[ 6]
[ 6]
...,
[13]
[13]
[13]]
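However, looking more closely at Chevalier's approach, I noticed something interesting when I evaluated len(X_train[0]) and len(X_train[0][0]): the way I format my x values appears to be very different from the way Chevalier's x values are formatted. My original CSV file can be found here, and Chevalier's original txt file for X_train can be found here. Below is Chevalier's code, for comparison with mine: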
def load_X(X_signals_paths):
    X_signals = []

    for signal_type_path in X_signals_paths:
        file = open(signal_type_path, 'r')
        # Read dataset from disk, dealing with text files' syntax
        X_signals.append(
            [np.array(serie, dtype=np.float32) for serie in [
                row.replace('  ', ' ').strip().split(' ') for row in file
            ]]
        )
        file.close()

    return np.transpose(np.array(X_signals), (1, 2, 0))

X_train_signals_paths = [
    DATASET_PATH + TRAIN + "Inertial Signals/" + signal + "train.txt" for signal in INPUT_SIGNAL_TYPES
]
X_test_signals_paths = [
    DATASET_PATH + TEST + "Inertial Signals/" + signal + "test.txt" for signal in INPUT_SIGNAL_TYPES
]
X_train = load_X(X_train_signals_paths)
X_test = load_X(X_test_signals_paths)
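The final np.transpose call in the code above can be illustrated on a small toy array. The sizes below (3 signal files, 4 series, 5 timesteps) are made up for illustration and are not the dataset's; the point is only that the axes move from (signals, series, timesteps) to (series, timesteps, signals):

```python
import numpy as np

# Toy sizes, assumed for illustration only (not the real dataset's sizes).
n_signals, n_series, n_steps = 3, 4, 5

# One (n_series, n_steps) block per signal file, mimicking what the
# loader collects; block s is filled with the value s.
X_signals = [
    np.full((n_series, n_steps), fill_value=s, dtype=np.float32)
    for s in range(n_signals)
]

stacked = np.array(X_signals)          # axes: (signals, series, timesteps)
X = np.transpose(stacked, (1, 2, 0))   # axes: (series, timesteps, signals)

print(stacked.shape)                    # (3, 4, 5)
print(X.shape)                          # (4, 5, 3)
print(len(X), len(X[0]), len(X[0][0]))  # 4 5 3
```

After the transpose, X[0] is one whole series and X[0][0] is the vector of all signal values at its first timestep, which is why len() works at both levels.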
The following, from Chevalier's "Additional Parameters" section, is the main source of my confusion:
training_data_count = len(X_train) # 7352 training series (with 50% overlap between each serie)
test_data_count = len(X_test) # 2947 testing series
n_steps = len(X_train[0]) # 128 timesteps per series
n_input = len(X_train[0][0]) # 9 input parameters per timestep
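My understanding is that the 50% overlap means the time intervals being evaluated overlap one another, such as 0-64, 32-96, 64-128, 96, etc. One thing I know for certain is that 7352 is the number of rows in X_train.txt. As I understand it, [0] and [0][0] select the 0th column of the X_train array, and the 0th column and 0th row of X_train, respectively. What my code currently does is convert each data point individually. That is why len(X_train[0]) returns 1, and why evaluating len(X_train[0][0]) gives me this error: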
TypeError Traceback (most recent call last)
<ipython-input-255-14523e544e49> in <module>()
2 test_data_count = len(list(X_test))
3 n_steps = len(X_train[0])
----> 4 n_input = len(list(X_train)[0][0])
5 print(training_data_count, test_data_count, n_steps, n_input)
TypeError: object of type 'numpy.int32' has no len()
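The error can be reproduced with a small stand-in array. The values below are made up and only mimic the (N, 1) shape returned by my load_X:

```python
import numpy as np

# Stand-in for my X_train: one integer per row, shape (N, 1).
X_train = np.array([[6], [6], [13], [13]], dtype=np.int32)

print(X_train.shape)    # (4, 1)
print(len(X_train[0]))  # 1, because each row holds a single value

# X_train[0][0] is a NumPy scalar, not a sequence, so len() raises TypeError.
try:
    len(X_train[0][0])
except TypeError as err:
    print("TypeError:", err)
```

A third axis is needed before len(X_train[0][0]) can return a count of input parameters.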
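I would like to know how to reformat my data to match the format that Chevalier's code expects from the txt files. What do the numbers in the "Additional Parameters" section of Chevalier's git mean, and how can I adapt them to my current model?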
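One possible target layout can be sketched as follows. This is only a sketch under stated assumptions: the helper name make_windows is hypothetical, the input is a made-up single-channel signal, and the window length of 128 with a stride of 64 is one reading of "128 timesteps per series" and "50% overlap" from the comments above, not something confirmed by Chevalier's code.

```python
import numpy as np

def make_windows(signal, n_steps=128, overlap=0.5):
    """Cut a 1-D signal into overlapping windows of n_steps samples.

    Hypothetical helper: returns an array of shape (n_windows, n_steps, 1)
    and drops trailing samples that do not fill a whole window.
    """
    signal = np.asarray(signal, dtype=np.float32)
    stride = int(n_steps * (1 - overlap))  # 64 for 128 steps at 50% overlap
    if stride < 1:
        raise ValueError("overlap must leave a stride of at least 1 sample")
    if len(signal) < n_steps:
        # Not enough samples for even one window.
        return np.empty((0, n_steps, 1), dtype=np.float32)
    starts = range(0, len(signal) - n_steps + 1, stride)
    windows = [signal[s:s + n_steps] for s in starts]
    # Add a trailing axis so each timestep holds a length-1 input vector.
    return np.stack(windows)[:, :, np.newaxis]

# Made-up signal: 320 samples give windows starting at 0, 64, 128, 192.
X = make_windows(np.arange(320))
print(X.shape)                          # (4, 128, 1)
print(len(X), len(X[0]), len(X[0][0]))  # 4 128 1
```

With several signal columns, windowing each column the same way and joining the results along the last axis would make the final dimension equal to the number of input signals, matching the (series, timesteps, inputs) layout that n_steps and n_input are read from.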