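I am using tensorrec, a recommendation framework built on TensorFlow, to build a recommendation algorithm. It works when I use all of the data. It fails when I split the data into training and test sets.
This is how I split the data and train the model: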
# shuffle and split data in training and testing
TRAIN_SIZE = 0.8
msk_user = np.random.rand(user_features.shape[0]) < TRAIN_SIZE
msk_item = np.random.rand(item_features.shape[0]) < TRAIN_SIZE
msk_interactions = np.random.rand(interactions.shape[0]) < TRAIN_SIZE
user_features_train = user_features[msk_user]
user_features_test = user_features[~msk_user]
item_features_train = item_features[msk_item]
item_features_test = item_features[~msk_item]
interactions_train = interactions[msk_interactions]
interactions_test = interactions[~msk_interactions]
# Build the model with default parameters
model = tensorrec.TensorRec()
model.fit(interactions_train, user_features_train, item_features_train, epochs=3, verbose=True)
model.fit takes CSR sparse matrices as input. I have also tried train_test_split, which should work on sparse matrices, but that did not work either.
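To make the shapes involved easier to see, here is a minimal sketch that reproduces the same masking on small synthetic matrices. The sizes and the random data are made up for illustration, and it assumes the usual layout of interactions as (n_users, n_items), user features as (n_users, n_user_features) and item features as (n_items, n_item_features).

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.RandomState(0)

# Hypothetical sizes, chosen only for illustration.
n_users, n_items = 50, 130
n_user_feats, n_item_feats = 8, 6

# Synthetic stand-ins for the real matrices (assumed layout, see above).
interactions = sp.random(n_users, n_items, density=0.1, format="csr", random_state=rng)
user_features = sp.random(n_users, n_user_feats, density=0.5, format="csr", random_state=rng)
item_features = sp.random(n_items, n_item_feats, density=0.5, format="csr", random_state=rng)

# Same masking scheme as in the question: three independent row masks.
TRAIN_SIZE = 0.8
msk_user = rng.rand(user_features.shape[0]) < TRAIN_SIZE
msk_item = rng.rand(item_features.shape[0]) < TRAIN_SIZE
msk_interactions = rng.rand(interactions.shape[0]) < TRAIN_SIZE

user_features_train = user_features[msk_user]
item_features_train = item_features[msk_item]
interactions_train = interactions[msk_interactions]

# The interaction matrix loses rows but keeps every item column,
# while the item feature matrix loses item rows.
print("interactions_train: ", interactions_train.shape)
print("user_features_train:", user_features_train.shape)
print("item_features_train:", item_features_train.shape)
```

In this sketch the number of columns in interactions_train no longer equals the number of rows in item_features_train, and the user rows are dropped by two unrelated masks, so the three matrices stop describing the same users and items.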
This is the error I get (note that the file paths have been changed):
C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\ops\gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Traceback (most recent call last):
File "c:\path\to\file\main.py", line 275, in <module>
model.fit(interactions_train, user_features_train, item_features_train, epochs=3, verbose=True)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorrec\tensorrec.py", line 533, in fit
n_sampled_items=n_sampled_items)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorrec\tensorrec.py", line 623, in fit_partial
feed_dict=feed_dict
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 908, in run
run_metadata_ptr)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1143, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1324, in _do_run
run_metadata)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1343, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: indices[0] = 122 is not in [0, 107)
[[Node: GatherV2_5 = GatherV2[Taxis=DT_INT32, Tindices=DT_INT64, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Sum_3, strided_slice_1, gradients/GatherV2_1_grad/concat/axis)]]
Caused by op 'GatherV2_5', defined at:
File "C:\Users\user\.vscode\extensions\ms-python.python-2018.5.0\pythonFiles\PythonTools\visualstudio_py_launcher.py", line 91, in <module>
vspd.debug(filename, port_num, debug_id, debug_options, currentPid, run_as)
File "C:\Users\user\.vscode\extensions\ms-python.python-2018.5.0\pythonFiles\PythonTools\visualstudio_py_debugger.py", line 2625, in debug
exec_file(file, globals_obj)
File "C:\Users\user\.vscode\extensions\ms-python.python-2018.5.0\pythonFiles\PythonTools\visualstudio_py_util.py", line 119, in exec_file
exec_code(code, file, global_variables)
File "C:\Users\user\.vscode\extensions\ms-python.python-2018.5.0\pythonFiles\PythonTools\visualstudio_py_util.py", line 95, in exec_code
exec(code_obj, global_variables)
File "c:\path\to\file\main.py", line 275, in <module>
model.fit(interactions_train, user_features_train, item_features_train, epochs=3, verbose=True)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorrec\tensorrec.py", line 533, in fit
n_sampled_items=n_sampled_items)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorrec\tensorrec.py", line 601, in fit_partial
self._build_tf_graph(n_user_features=n_user_features, n_item_features=n_item_features)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorrec\tensorrec.py", line 438, in _build_tf_graph
tf_x_item=tf_x_item)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorrec\recommendation_graphs.py", line 56, in bias_prediction_serial
gathered_item_biases = tf.gather(tf_projected_item_biases, tf_x_item)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\ops\array_ops.py", line 2705, in gather
return gen_array_ops.gather_v2(params, indices, axis, name=name)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_array_ops.py", line 3530, in gather_v2
"GatherV2", params=params, indices=indices, axis=axis, name=name)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3309, in create_op
op_def=op_def)
File "C:\Users\user\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1669, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): indices[0] = 122 is not in [0, 107)
[[Node: GatherV2_5 = GatherV2[Taxis=DT_INT32, Tindices=DT_INT64, Tparams=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](Sum_3, strided_slice_1, gradients/GatherV2_1_grad/concat/axis)]]
Additional information: I am using Python 3.6.4 with Anaconda on a Windows machine. My TensorFlow version is 1.8.0-dev20180329 (without GPU support).
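The message `indices[0] = 122 is not in [0, 107)` looks consistent with an item index from the interaction matrix pointing past the end of a reduced item feature matrix, although that reading is an assumption. One alternative that keeps all shapes aligned is to split only the nonzero interaction entries and leave both feature matrices whole. The sketch below shows that idea on synthetic data; `split_interactions` is a hypothetical helper and not part of tensorrec.

```python
import numpy as np
import scipy.sparse as sp


def split_interactions(interactions, train_size=0.8, seed=0):
    """Hypothetical helper: split the nonzero entries, keep the matrix shape."""
    coo = sp.coo_matrix(interactions)
    rng = np.random.RandomState(seed)
    # One random draw per stored interaction, not per row.
    mask = rng.rand(coo.nnz) < train_size
    train = sp.csr_matrix(
        (coo.data[mask], (coo.row[mask], coo.col[mask])), shape=coo.shape
    )
    test = sp.csr_matrix(
        (coo.data[~mask], (coo.row[~mask], coo.col[~mask])), shape=coo.shape
    )
    return train, test


# Synthetic interaction matrix, made up for illustration.
interactions = sp.random(50, 130, density=0.1, format="csr", random_state=1)
interactions_train, interactions_test = split_interactions(interactions)

# Both parts keep the full (n_users, n_items) shape.
print(interactions_train.shape, interactions_test.shape)
```

With a split like this, the training call would presumably become model.fit(interactions_train, user_features, item_features, epochs=3, verbose=True), with the unsplit feature matrices, and interactions_test held back for evaluation.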