Question

What is the correct way to multiply a dense vector by a sparse matrix in TensorFlow?

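According to the documentation, tf.matmul supports sparse matrix multiplication, so do I need tf.sparse_matmul at all? (There is also tf.sparse_tensor_dense_matmul, so when should each of them be used?)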

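Do I also need to convert the sparse matrix to a tf.SparseTensor? It is also unclear to me what tf.convert_to_tensor_or_sparse_tensor does, and how to convert a dense numpy matrix or a scipy sparse matrix into input suitable for TensorFlow.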
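For concreteness, here is a sketch of the conversion I have in mind, assuming the (indices, values, dense_shape) triple is what tf.SparseTensor takes as its three arguments. It uses only numpy; the helper name dense_to_coo is my own and is not a TensorFlow or numpy function.

```python
import numpy as np

def dense_to_coo(matrix):
    """Return (indices, values, dense_shape) for the nonzero entries of a 2-D array.

    Hypothetical helper: the three arrays are laid out the way I assume
    tf.SparseTensor(indices, values, dense_shape) wants them.
    """
    rows, cols = np.nonzero(matrix)  # np.nonzero returns entries in row-major order
    indices = np.stack([rows, cols], axis=1).astype(np.int64)  # shape (nnz, 2)
    values = matrix[rows, cols]                                # shape (nnz,)
    dense_shape = np.array(matrix.shape, dtype=np.int64)       # shape (2,)
    return indices, values, dense_shape

example = np.array([[0.0, 2.0, 0.0],
                    [1.0, 0.0, 3.0]], dtype=np.float32)
indices, values, dense_shape = dense_to_coo(example)
```

If this is right, I would expect tf.SparseTensor(indices, values, dense_shape) to accept these arrays directly, and a scipy coo_matrix to supply the same pieces through its row, col, data and shape attributes, but I have not confirmed that this is the intended route.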

Here is what I have tried (timings are for CPU):

import numpy as np
import tensorflow as tf

np.random.seed(2018)

# Parameters
n = 10*1000
m = 4*1000
p = 0.1

%%time

# Data preparation
dense_vector = np.random.rand(1,n).astype(np.float32)
print('dense_vector.shape', dense_vector.shape)
#print('dense_vector:')
#print(dense_vector)

dense_matrix = np.random.rand(n*m)
idx = np.random.choice(range(n*m), int((1.0-p)*n*m), replace=False)
dense_matrix[idx] = 0.0
dense_matrix = dense_matrix.reshape(n,m).astype(np.float32)
print('dense_matrix.shape', dense_matrix.shape)
#print('dense_matrix:')
#print(dense_matrix)

dense_vector.shape (1, 10000)
dense_matrix.shape (10000, 4000)
CPU times: user 9.8 s, sys: 2.38 s, total: 12.2 s
Wall time: 12.2 s

%%time

# Dense vector on dense matrix multiplication using numpy

res = dense_vector @ dense_matrix
print('res.shape', res.shape)
#print('res:')
#print(res)

%%time

# Dense vector on dense matrix multiplication using tensorflow tf.matmul V1

dense_vector_tf = tf.convert_to_tensor(dense_vector, np.float32)
dense_matrix_tf = tf.convert_to_tensor(dense_matrix, np.float32)
res_tf = tf.matmul(dense_vector_tf, dense_matrix_tf)

with tf.Session() as sess:
    res = sess.run(res_tf)
    print('res.shape', res.shape)
    #print('res:')
    #print(res)

res.shape (1, 4000)
CPU times: user 1.88 s, sys: 1.82 s, total: 3.7 s
Wall time: 3.54 s

%%time

# Dense vector on dense matrix multiplication using tensorflow tf.matmul V2

dense_vector_tf = tf.convert_to_tensor(dense_vector, np.float32)
dense_matrix_tf = tf.convert_to_tensor(dense_matrix, np.float32)
res_tf = tf.matmul(dense_vector_tf, dense_matrix_tf,
                   a_is_sparse=False,
                   b_is_sparse=True)

with tf.Session() as sess:
    res = sess.run(res_tf)
    print('res.shape', res.shape)
    #print('res:')
    #print(res)

res.shape (1, 4000)
CPU times: user 4.91 s, sys: 4.28 s, total: 9.19 s
Wall time: 9.07 s

%%time

# Dense vector on sparse matrix multiplication using tensorflow tf.sparse_matmul V1

dense_vector_tf = tf.convert_to_tensor(dense_vector, np.float32)
dense_matrix_tf = tf.convert_to_tensor(dense_matrix, np.float32)
res_tf = tf.sparse_matmul(dense_vector_tf, dense_matrix_tf,
                         a_is_sparse=False,
                         b_is_sparse=True)

with tf.Session() as sess:
    res = sess.run(res_tf)
    print('res.shape', res.shape)
    #print('res:')
    #print(res)

res.shape (1, 4000)
CPU times: user 4.82 s, sys: 4.18 s, total: 8.99 s
Wall time: 9 s

%%time

# Dense vector on sparse matrix multiplication using tensorflow tf.sparse_matmul V2

dense_vector_tf = tf.convert_to_tensor(dense_vector, np.float32)
dense_matrix_tf = tf.convert_to_tensor_or_sparse_tensor(dense_matrix, np.float32)
res_tf = tf.sparse_matmul(dense_vector_tf, dense_matrix_tf,
                         a_is_sparse=False,
                         b_is_sparse=True)

with tf.Session() as sess:
    res = sess.run(res_tf)
    print('res.shape', res.shape)
    #print('res:')
    #print(res)

res.shape (1, 4000)
CPU times: user 5.07 s, sys: 4.53 s, total: 9.6 s
Wall time: 9.61 s

I see no improvement from using a sparse matrix either. What am I doing wrong?
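To be explicit about what I expected a sparse kernel to compute, here is a numpy-only sketch (smaller sizes than above, variable names are my own). It accumulates one term per nonzero entry, so the work is proportional to the number of nonzeros rather than to n*m. It also checks the transpose identity v M = (M^T v^T)^T, which I assume is needed because tf.sparse_tensor_dense_matmul takes the sparse operand on the left.

```python
import numpy as np

rng = np.random.RandomState(2018)
n, m, p = 1000, 400, 0.1

vector = rng.rand(1, n).astype(np.float32)
matrix = rng.rand(n, m).astype(np.float32)
matrix[rng.rand(n, m) >= p] = 0.0  # keep roughly a fraction p of the entries

# Coordinate form of the nonzero entries.
rows, cols = np.nonzero(matrix)
values = matrix[rows, cols]

# Each nonzero M[r, c] contributes vector[0, r] * M[r, c] to output column c.
contrib = vector[0, rows].astype(np.float64) * values
sparse_res = np.bincount(cols, weights=contrib, minlength=m).reshape(1, m)

# Dense reference, and the same product with the matrix as the left operand.
dense_res = vector @ matrix
transposed_res = (matrix.T @ vector.T).T
```

All three results should agree up to float32 rounding, so my question is really about which TensorFlow op performs the first variant efficiently.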


0 answers: