1-of-k (one-hot) encoding in Theano

Time: 2015-09-30 13:10:46

Tags: theano

I am doing this for Numpy. seq is a list of indices, i.e. this implements 1-of-k encoding (also known as one-hot).

import numpy as np

def one_of_k(seq, num_classes):
    num_frames = len(seq)
    m = np.zeros((num_frames, num_classes))
    m[np.arange(num_frames), seq] = 1  # row i gets a 1 in column seq[i]
    return m
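To make the expected result concrete, here is a small worked example of the same index-assignment idea. The name `one_hot_rows` and the sample input are illustrative only and are not part of the original question.

```python
import numpy as np


def one_hot_rows(seq, num_classes):
    # Illustrative name; same index-assignment idea as the function above.
    seq = np.asarray(seq)
    m = np.zeros((len(seq), num_classes))
    # Row i gets a 1 in column seq[i]; all other entries stay 0.
    m[np.arange(len(seq)), seq] = 1
    return m


example = one_hot_rows([0, 2, 1], 3)
# example is:
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```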

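How can I do the same in Theano? (I want the most efficient solution, one that is also efficient on CUDA.)

3 answers:

Answer 0: (score: 3)

There is a built-in function for this (theano.tensor.extra_ops.to_one_hot), but it is still much slower than doing it in numpy. If it is feasible for your task, you may be better off computing this outside Theano and passing the dense result as input instead of passing only the indices.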
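A minimal sketch of the "compute it outside Theano" suggestion, assuming the compiled function takes a dense float32 matrix as input. `precompute_one_hot` and `consume_dense` are hypothetical names; `consume_dense` only stands in for whatever compiled Theano function would receive the matrix.

```python
import numpy as np


def precompute_one_hot(seq, num_classes, dtype="float32"):
    # Hypothetical helper: build the dense encoding in numpy, in the dtype
    # the compiled function is assumed to expect.
    seq = np.asarray(seq, dtype="int64")
    m = np.zeros((seq.shape[0], num_classes), dtype=dtype)
    m[np.arange(seq.shape[0]), seq] = 1
    return m


def consume_dense(dense):
    # Stand-in for a compiled Theano function with a matrix input;
    # here it only counts how often each class occurs.
    return dense.sum(axis=0)


dense_batch = precompute_one_hot([0, 1, 2, 0, 1, 2], num_classes=4)
class_counts = consume_dense(dense_batch)
```

The trade-off is that a dense matrix takes more memory to hold and transfer than the index vector, so whether this pays off depends on the sizes involved.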

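Here is some code illustrating three numpy approaches and four Theano approaches. It includes the answers provided by Albert (numpy_1_of_k_3 / compile_theano_1_of_k_3) and eickenberg (numpy_1_of_k_2 / compile_theano_1_of_k_4) for comparison.

It turns out that the built-in Theano method (compile_theano_1_of_k_2) uses roughly the same code as my own attempt (numpy_1_of_k_1 / compile_theano_1_of_k_1).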

import timeit
import numpy as np
import theano
import theano.tensor as tt
import theano.tensor.extra_ops


def numpy_1_of_k_1(seq, num_classes):
    num_frames = len(seq)
    m = np.zeros((num_frames, num_classes))
    m[np.arange(num_frames), seq] = 1
    return m


def numpy_1_of_k_2(seq, num_classes):
    return seq[:, np.newaxis] == np.arange(num_classes)


def numpy_1_of_k_3(seq, num_classes):
    shape = [seq.shape[i] for i in range(seq.ndim)] + [num_classes]
    eye = np.eye(num_classes)
    return eye[seq].reshape(shape)


def compile_theano_1_of_k_1():
    seq = tt.lvector()
    num_classes = tt.lscalar()
    num_frames = seq.shape[0]
    m = tt.zeros((num_frames, num_classes))
    m = tt.set_subtensor(m[tt.arange(num_frames), seq], 1)
    return theano.function([seq, num_classes], outputs=m)


def compile_theano_1_of_k_2():
    seq = tt.lvector()
    num_classes = tt.lscalar()
    return theano.function([seq, num_classes], outputs=theano.tensor.extra_ops.to_one_hot(seq, num_classes))


def compile_theano_1_of_k_3():
    seq = tt.lvector()
    num_classes = tt.lscalar()
    shape = [seq.shape[i] for i in range(seq.ndim)] + [num_classes]
    eye = tt.eye(num_classes)
    m = eye[seq].reshape(shape)
    return theano.function([seq, num_classes], outputs=m)


def compile_theano_1_of_k_4():
    seq = tt.lvector()
    num_classes = tt.lscalar()
    one_hot = tt.eq(seq.reshape((-1, 1)), tt.arange(num_classes))
    return theano.function([seq, num_classes], outputs=one_hot)


def main(iterations):
    theano_1_of_k_1 = compile_theano_1_of_k_1()
    theano_1_of_k_2 = compile_theano_1_of_k_2()
    theano_1_of_k_3 = compile_theano_1_of_k_3()
    theano_1_of_k_4 = compile_theano_1_of_k_4()

    test_seq = np.array([0, 1, 2, 0, 1, 2])
    test_num_classes = 4
    test_functions = [numpy_1_of_k_1, numpy_1_of_k_2, numpy_1_of_k_3, theano_1_of_k_1, theano_1_of_k_2, theano_1_of_k_3,
                      theano_1_of_k_4]
    test_results = [test_function(test_seq, test_num_classes) for test_function in test_functions]

    for a, b in zip(test_results[:-1], test_results[1:]):
        assert np.all(np.equal(a, b)), (a, b)

    data = []
    for _ in range(iterations):
        num_classes = np.random.randint(100) + 1
        seq = np.random.randint(num_classes, size=(np.random.randint(100) + 1))
        data.append((seq, num_classes))

    for test_function in test_functions:
        start = timeit.default_timer()
        total = 0
        for seq, num_classes in data:
            total += test_function(seq, num_classes).sum()
        print(timeit.default_timer() - start, total)


main(100000)

Using a laptop and running the Theano code on the CPU, I get the following timings in seconds:

numpy_1_of_k_1    1.0645
numpy_1_of_k_2    1.4018
numpy_1_of_k_3    1.6131
theano_1_of_k_1   6.3542
theano_1_of_k_2   6.4628
theano_1_of_k_3   6.5637
theano_1_of_k_4   5.4588

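So in numpy, the identity-matrix approach is slower than the simple broadcast, which in turn is slower than zeros-and-set. In Theano, however, the relative order is different; there the simple broadcast approach is the fastest.

These are very small test cases, so the relative performance may differ for larger matrices or when running on a GPU.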
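To check how the ranking behaves at other sizes, the three numpy variants can be timed on larger inputs. The following is only a sketch: the function names, the sizes, and the repeat count are arbitrary illustration values, and no timings are claimed here.

```python
import timeit

import numpy as np


def zeros_and_set(seq, num_classes):
    m = np.zeros((len(seq), num_classes))
    m[np.arange(len(seq)), seq] = 1
    return m


def broadcast_compare(seq, num_classes):
    return seq[:, np.newaxis] == np.arange(num_classes)


def identity_lookup(seq, num_classes):
    return np.eye(num_classes)[seq]


def time_variants(num_frames, num_classes, repeats=5):
    # Fixed seed so every variant sees the same input.
    rng = np.random.RandomState(0)
    seq = rng.randint(num_classes, size=num_frames)
    timings = {}
    for fn in (zeros_and_set, broadcast_compare, identity_lookup):
        # Total seconds for `repeats` calls of this variant.
        timings[fn.__name__] = timeit.timeit(
            lambda: fn(seq, num_classes), number=repeats)
    return timings


for frames, classes in [(100, 100), (2000, 500)]:
    print(frames, classes, time_variants(frames, classes))
```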

Answer 1: (score: 2)

Isn't this just a simple broadcast?

one_hot = tt.eq(seq.reshape((-1, 1)), tt.arange(num_classes))
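The broadcasting idea can be traced step by step in numpy. This is a sketch with made-up sample values; the final cast to float32 is an assumption about what a downstream consumer would want, since the comparison itself yields booleans.

```python
import numpy as np

seq = np.array([0, 2, 1])
num_classes = 4

column = seq[:, np.newaxis]        # shape (3, 1)
classes = np.arange(num_classes)   # shape (4,)
# Broadcasting compares every index against every class id,
# giving a boolean matrix of shape (3, 4).
one_hot_bool = column == classes
# Cast when a numeric matrix is needed downstream.
one_hot = one_hot_bool.astype("float32")
```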

Answer 2: (score: 1)

My solution:

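import theano.tensor as T

def class_idx_seq_to_1_of_k(seq, num_classes, dtype="float32"):
    # Output shape is the shape of seq with num_classes appended.
    shape = [seq.shape[i] for i in range(seq.ndim)] + [num_classes]
    eye = T.eye(num_classes, dtype=dtype)
    # Look up one row of the identity matrix per class index.
    m = eye[T.cast(seq, 'int32')].reshape(shape)
    return m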
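Because the shape is built from every dimension of seq, this approach appears intended to work for index arrays with more than one dimension as well. A numpy counterpart shows the effect; `one_hot_nd` and the sample batch are illustrative and not part of the answer.

```python
import numpy as np


def one_hot_nd(seq, num_classes, dtype="float32"):
    # Illustrative numpy counterpart of the identity-matrix lookup.
    seq = np.asarray(seq, dtype="int32")
    eye = np.eye(num_classes, dtype=dtype)
    # Indexing eye with an integer array of shape S yields an array
    # of shape S + (num_classes,).
    return eye[seq]


batch = np.array([[0, 1, 2],
                  [2, 2, 0]])  # shape (2, 3), e.g. 2 sequences of 3 frames
encoded = one_hot_nd(batch, num_classes=4)
# encoded.shape == (2, 3, 4)
```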