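What is the correct way to multiply a dense vector by a sparse matrix in TensorFlow?
According to the documentation, tf.matmul supports sparse matrix multiplication, so do I need tf.sparse_matmul at all? (There is also tf.sparse_tensor_dense_matmul, so in which case should each of them be used?)
Do I also need to convert the sparse matrix to a tf.SparseTensor? It is also unclear to me what tf.convert_to_tensor_or_sparse_tensor does, and how to convert a dense numpy matrix or a scipy sparse matrix into a suitable TensorFlow input.
Here is what I have tried (timings are for CPU):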
import numpy as np
import tensorflow as tf
np.random.seed(2018)
# Parameters
n = 10*1000
m = 4*1000
p = 0.1
%%time
# Data preparation
dense_vector = np.random.rand(1,n).astype(np.float32)
print('dense_vector.shape', dense_vector.shape)
#print('dense_vector:')
#print(dense_vector)
dense_matrix = np.random.rand(n*m)
idx = np.random.choice(range(n*m), int((1.0-p)*n*m), replace=False)
dense_matrix[idx] = 0.0
dense_matrix = dense_matrix.reshape(n,m).astype(np.float32)
print('dense_matrix.shape', dense_matrix.shape)
#print('dense_matrix:')
#print(dense_matrix)
dense_vector.shape (1, 10000)
dense_matrix.shape (10000, 4000)
CPU times: user 9.8 s, sys: 2.38 s, total: 12.2 s
Wall time: 12.2 s
%%time
# Dense vector on dense matrix multiplication using numpy
res = dense_vector @ dense_matrix
print('res.shape', res.shape)
#print('res:')
#print(res)
%%time
# Dense vector on dense matrix multiplication using tensorflow tf.matmul V1
dense_vector_tf = tf.convert_to_tensor(dense_vector, np.float32)
dense_matrix_tf = tf.convert_to_tensor(dense_matrix, np.float32)
res_tf = tf.matmul(dense_vector_tf, dense_matrix_tf)
with tf.Session() as sess:
    res = sess.run(res_tf)
print('res.shape', res.shape)
#print('res:')
#print(res)
res.shape (1, 4000)
CPU times: user 1.88 s, sys: 1.82 s, total: 3.7 s
Wall time: 3.54 s
%%time
# Dense vector on dense matrix multiplication using tensorflow tf.matmul V2
dense_vector_tf = tf.convert_to_tensor(dense_vector, np.float32)
dense_matrix_tf = tf.convert_to_tensor(dense_matrix, np.float32)
res_tf = tf.matmul(dense_vector_tf, dense_matrix_tf,
                   a_is_sparse=False,
                   b_is_sparse=True)
with tf.Session() as sess:
    res = sess.run(res_tf)
print('res.shape', res.shape)
#print('res:')
#print(res)
res.shape (1, 4000)
CPU times: user 4.91 s, sys: 4.28 s, total: 9.19 s
Wall time: 9.07 s
%%time
# Dense vector on sparse matrix multiplication using tensorflow tf.sparse_matmul V1
dense_vector_tf = tf.convert_to_tensor(dense_vector, np.float32)
dense_matrix_tf = tf.convert_to_tensor(dense_matrix, np.float32)
res_tf = tf.sparse_matmul(dense_vector_tf, dense_matrix_tf,
                          a_is_sparse=False,
                          b_is_sparse=True)
with tf.Session() as sess:
    res = sess.run(res_tf)
print('res.shape', res.shape)
#print('res:')
#print(res)
res.shape (1, 4000)
CPU times: user 4.82 s, sys: 4.18 s, total: 8.99 s
Wall time: 9 s
%%time
# Dense vector on sparse matrix multiplication using tensorflow tf.sparse_matmul V2
dense_vector_tf = tf.convert_to_tensor(dense_vector, np.float32)
dense_matrix_tf = tf.convert_to_tensor_or_sparse_tensor(dense_matrix, np.float32)
res_tf = tf.sparse_matmul(dense_vector_tf, dense_matrix_tf,
                          a_is_sparse=False,
                          b_is_sparse=True)
with tf.Session() as sess:
    res = sess.run(res_tf)
print('res.shape', res.shape)
#print('res:')
#print(res)
res.shape (1, 4000)
CPU times: user 5.07 s, sys: 4.53 s, total: 9.6 s
Wall time: 9.61 s
I do not see any improvement from using a sparse matrix either. What am I doing wrong?
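For reference, here is a minimal numpy-only sketch of the conversion asked about above, not a tested TensorFlow solution. It extracts the three COO-style arrays that tf.SparseTensor takes as arguments (indices, values, dense_shape) from a dense matrix and checks that they reproduce the dense product. The small sizes and the use of np.random.RandomState are assumptions made purely for illustration. Note that all the TensorFlow attempts above pass ordinary dense tensors; the a_is_sparse and b_is_sparse flags are only hints about dense inputs with many zeros, not a sparse storage format.

```python
import numpy as np

rng = np.random.RandomState(2018)
n, m, p = 50, 20, 0.1  # small illustrative sizes, not the sizes timed above

dense_vector = rng.rand(1, n).astype(np.float32)
dense_matrix = rng.rand(n, m).astype(np.float32)
dense_matrix[rng.rand(n, m) > p] = 0.0  # keep roughly a fraction p of the entries

# COO-style pieces: one [row, col] pair per non-zero entry, in row-major order.
rows, cols = np.nonzero(dense_matrix)
indices = np.stack([rows, cols], axis=1).astype(np.int64)   # shape (nnz, 2)
values = dense_matrix[rows, cols]                            # shape (nnz,)
dense_shape = np.array(dense_matrix.shape, dtype=np.int64)  # [n, m]

# Vector-matrix product computed from the sparse pieces alone:
# res[0, j] accumulates vector[0, i] * value over every non-zero at (i, j).
res_sparse = np.zeros((1, m), dtype=np.float32)
np.add.at(res_sparse[0], cols, dense_vector[0, rows] * values)

# Dense reference for comparison.
res_dense = dense_vector @ dense_matrix
print(np.allclose(res_sparse, res_dense, atol=1e-4))
```

For a scipy.sparse.coo_matrix, the row, col, data and shape attributes carry the same information. Since tf.sparse_tensor_dense_matmul expects the SparseTensor as the left operand, a dense-vector-times-sparse-matrix product would presumably have to be written in transposed form, i.e. as the transpose of (sparse_matrix transposed) times (dense_vector transposed).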