Question

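I have a semantic segmentation problem where it would be very useful if I could assign multiple labels to one feature vector. Some parts of my data belong to classes 1, 2 and 3 at once (other parts belong to only one class, and some belong to no class at all).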

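I think a similar but much simpler toy problem is to build a neural network that takes a number in binary format as its input features and decides whether it is divisible by 2, by 3, by both, or by neither.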

What I tried

I built a network with nolearn that has two output neurons (one for "divisible by 2", the other for "divisible by 3").

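Please note that I know I could, for this simple example, just add two more classes, "divisible by 2 and 3" and "divisible by neither 2 nor 3". However, I only created this example as a stand-in for a more complex problem where I don't have that option.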

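The output layer cannot be a softmax layer, because I don't want the outputs to sum to 1 (but to 0, 1 or 2). The problem is that I don't know what my label vector should look like. Normally I just pass label_vector = [class_for_first, class_for_second, ...], but this time I need a list of classes per sample. How can I adjust this?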

(Using nolearn is not required. A pure Lasagne solution would be fine, too.)

#!/usr/bin/env python

"""Neural Network to decide for numbers in 0..15 if they are divisible by
   2, 3, both or none.
"""

import lasagne
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet

import numpy


def nn_example():
    feature_vectors, labels = [], []
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    feature_vectors.append([i, j, k, l])
                    sum_val = 2**0 * i + 2**1 * j + 2**2 * k + 2**3 * l
                    if sum_val % 2 == 0 and sum_val % 3 != 0:
                        # Only output node for '2' should be one
                        labels.append(0)
                    elif sum_val % 2 != 0 and sum_val % 3 == 0:
                        # Only output node for '3' should be one
                        labels.append(1)
                    elif sum_val % 2 != 0 and sum_val % 3 != 0:
                        # _ALL_ output should be zero
                        labels.append(0)  # dummy value
                    else:
                        # It is divisible by 2 and 3
                        # _BOTH_ output nodes should be 1
                        labels.append(1)  # dummy value
    feature_vectors = numpy.array(feature_vectors, dtype=numpy.float32)
    labels = numpy.array(labels, dtype=numpy.int32)
    net1 = NeuralNet(layers=[('input', layers.InputLayer),
                             ('hidden', layers.DenseLayer),
                             ('hidden2', layers.DenseLayer),
                             ('output', layers.DenseLayer),
                             ],
                     # layer parameters:
                     input_shape=(None, 4),
                     hidden_num_units=3,
                     hidden2_num_units=2,
                     output_nonlinearity=lasagne.nonlinearities.sigmoid,
                     output_num_units=2,

                     # optimization method:
                     update=nesterov_momentum,
                     update_learning_rate=0.01,
                     update_momentum=0.9,

                     max_epochs=1000,
                     verbose=1,
                     )

    # Train the network
    net1.fit(feature_vectors, labels)

    # Try the network
    print("Predicted: %s" % str(net1.predict_proba([[0, 0, 1, 0]])))

if __name__ == '__main__':
    nn_example()

Answer 1

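Your labels should be encoded as a matrix of shape (num_samples, num_classes) in which every entry is either 0 or 1. Take the activations of the sigmoid output layer (z) and compute the cross-entropy against the labels (y):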

-T.sum(y * T.log(z) + (1 - y) * T.log(1 - z))
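
To make the label layout concrete, here is a minimal sketch in plain Python (no Theano or Lasagne required) that builds the (num_samples, num_classes) matrix for the toy problem and evaluates the same summed cross-entropy on it. The helper names divisibility_labels and binary_cross_entropy, the eps clipping, and the constant 0.5 "network output" are assumptions made for illustration; they are not part of nolearn or Lasagne.

```python
import math


def divisibility_labels(n):
    """Return the multi-hot row [divisible by 2, divisible by 3] for n."""
    return [int(n % 2 == 0), int(n % 3 == 0)]


def binary_cross_entropy(y, z, eps=1e-7):
    """Summed binary cross-entropy, mirroring
    -sum(y * log(z) + (1 - y) * log(1 - z)).

    y and z are lists of rows with shape (num_samples, num_classes):
    y holds 0/1 targets, z holds sigmoid outputs in (0, 1).
    """
    total = 0.0
    for y_row, z_row in zip(y, z):
        for target, prob in zip(y_row, z_row):
            # Clip so that log() stays finite (an added safeguard).
            prob = min(max(prob, eps), 1.0 - eps)
            total += target * math.log(prob) + (1 - target) * math.log(1 - prob)
    return -total


# One row per number in 0..15, one column per label.
labels = [divisibility_labels(n) for n in range(16)]

# Hypothetical stand-in for the network: every sigmoid unit outputs 0.5.
uninformed = [[0.5, 0.5] for _ in labels]

print(labels[6], labels[7], labels[9])  # [1, 1] [0, 0] [0, 1]
print(binary_cross_entropy(labels, uninformed))
```

A row may contain zero, one, or two ones, which is exactly what a softmax target cannot express. To train on it, the matrix would be converted with numpy.array(labels, dtype=numpy.float32). Lasagne provides the elementwise form of this loss as lasagne.objectives.binary_crossentropy. With nolearn, as far as I know, the network also needs regression=True so that the targets are not interpreted as class indices.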

Can I use multiple labels for one feature vector with Lasagne?
