Word2Vec教程:Tensorflow TypeError:输入' y' of' Mul' Op的类型为float32,与参数类型int32不匹配' x'

时间:2017-10-01 19:37:54

标签: tensorflow typeerror word2vec

Tensorflow版本:1.2.1
Python版本:3.5
操作系统:Windows 10

另一张海报在StackOverflow here上询问了同样的问题,他似乎正在使用相同的Udacity Word2Vec教程中的代码。所以,也许我是密集的,但是这个例子的代码是如此繁忙和复杂,以至于我不知道是什么解决了他的问题。

当我致电tf.reduce_means时发生错误:

loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
                               train_labels, num_sampled, vocabulary_size))

在调用tf.reduce_mean之前,关键变量具有以下数据类型。

  

train_dataset.dtype
  >> tf.int32
  train_labels.dtype
  >> tf.int32
  valid_dataset.dtype
  >> tf.int32
  embeddings.dtype
  >> tf.float32_ref
  softmax_weights.dtype
  >> tf.float32_ref
  softmax_biases.dtype
  >> tf.float32_ref
  embed.dtype
  >> tf.float32

我在变量train_dataset.dtypetrain_labels.dtypevalid_dataset.dtype的定义中尝试了每种数据类型的排列:将它们全部int64,全部float32,所有float64,以及整数和浮点的组合。没有任何效果。我没有尝试更改softmax_weightsoftmax_biases的数据类型,因为我担心可能会影响优化算法。这些需要浮动来支持反向传播过程中的微积分吗? (Tensorflow通常是一个非常不透明的黑匣子,带有完全没用的文档,所以我可以怀疑但事情永远不会确定。)

错误时的程序流程:

调用reduce_mean程序控制后转移到文件sampled_softmax_loss()中的nn_impl.py,后者又调用_compute_sampled_logits()

  logits, labels = _compute_sampled_logits(
      weights=weights,
      biases=biases,
      labels=labels,
      inputs=inputs,
      num_sampled=num_sampled,
      num_classes=num_classes,
      num_true=num_true,
      sampled_values=sampled_values,
      subtract_log_q=True,
      remove_accidental_hits=remove_accidental_hits,
      partition_strategy=partition_strategy,
      name=name)

此时我检查传入参数的数据类型并获得以下内容:

  

weights.dtype
  >> tf.float32_ref
  biases.dtype
  >> tf.float32_ref
  labels.dtype
  >> tf.float32
  inputs.dtype
  >> tf.int32

在下一步发生异常时,我被抛入文件StreamWrapper中的ansitowin32.py类。运行到最后,我得到以下Traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
    489                 as_ref=input_arg.is_ref,
--> 490                 preferred_dtype=default_dtype)
    491           except TypeError as err:

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype)
    740         if ret is None:
--> 741           ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
    742 

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in _TensorTensorConversionFunction(t, dtype, name, as_ref)
    613         "Tensor conversion requested dtype %s for Tensor with dtype %s: %r"
--> 614         % (dtype.name, t.dtype.name, str(t)))
    615   return t

ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("sampled_softmax_loss/Reshape_1:0", shape=(?, 1, ?), dtype=float32, device=/device:CPU:0)'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-7-66d378b94a16> in <module>()
     34     loss = tf.reduce_mean(
     35       tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
---> 36                                train_labels, num_sampled, vocabulary_size))
     37 
     38     # Optimizer.

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in sampled_softmax_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name)
   1266       remove_accidental_hits=remove_accidental_hits,
   1267       partition_strategy=partition_strategy,
-> 1268       name=name)
   1269   sampled_losses = nn_ops.softmax_cross_entropy_with_logits(labels=labels,
   1270                                                             logits=logits)

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name)
   1005     row_wise_dots = math_ops.multiply(
   1006         array_ops.expand_dims(inputs, 1),
-> 1007         array_ops.reshape(true_w, new_true_w_shape))
   1008     # We want the row-wise dot plus biases which yields a
   1009     # [batch_size, num_true] tensor of true_logits.

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\math_ops.py in multiply(x, y, name)
    284 
    285 def multiply(x, y, name=None):
--> 286   return gen_math_ops._mul(x, y, name)
    287 
    288 

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\gen_math_ops.py in _mul(x, y, name)
   1375     A `Tensor`. Has the same type as `x`.
   1376   """
-> 1377   result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
   1378   return result
   1379 

C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
    524                   "%s type %s of argument '%s'." %
    525                   (prefix, dtypes.as_dtype(attrs[input_arg.type_attr]).name,
--> 526                    inferred_from[input_arg.type_attr]))
    527 
    528           types = [values.dtype]

TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.

这是完整的程序:

# These are all the modules we'll be using later. 
# Make sure you can import them before proceeding further.

# %matplotlib inline

from __future__ import print_function
import collections
import math
import numpy as np
import os
import random
import tensorflow as tf
import zipfile
from matplotlib import pylab
from six.moves import range
from six.moves.urllib.request import urlretrieve
from sklearn.manifold import TSNE

print("Working directory = %s\n" % os.getcwd())

def read_data(filename):
    """Extract the first file enclosed in a zip file as a list of words"""
    with zipfile.ZipFile(filename) as f:
        data = tf.compat.as_str(f.read(f.namelist()[0])).split()
    return data

filename = 'text8.zip'

words = read_data(filename)
print('Data size %d' % len(words))

vocabulary_size = 50000

def build_dataset(words):
    count = [['UNK', -1]]
    count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
    dictionary = dict()
    # Loop through the keys of the count collection dictionary
    # (apparently, zeroing out counts)
    for word, _ in count:
        dictionary[word] = len(dictionary)
    data = list()
    unk_count = 0  # count of unknown words
    for word in words:
        if word in dictionary:
            index = dictionary[word]
        else:
            index = 0  # dictionary['UNK']
            unk_count = unk_count + 1
        data.append(index)
    count[0][1] = unk_count
    reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
    return data, count, dictionary, reverse_dictionary


data, count, dictionary, reverse_dictionary = build_dataset(words)
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10])
del words  # Hint to reduce memory.

data_index = 0

def generate_batch(batch_size, num_skips, skip_window):
    global data_index
    assert batch_size % num_skips == 0
    assert num_skips <= 2 * skip_window
    batch = np.ndarray(shape=(batch_size), dtype=np.int32)
    labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
    span = 2 * skip_window + 1 # [ skip_window target skip_window ]
    buffer = collections.deque(maxlen=span)
    for _ in range(span):
        buffer.append(data[data_index])
        data_index = (data_index + 1) % len(data)
    for i in range(batch_size // num_skips):
        target = skip_window  # target label at the center of the buffer
        targets_to_avoid = [ skip_window ]
        for j in range(num_skips):
            while target in targets_to_avoid:
                target = random.randint(0, span - 1)
            targets_to_avoid.append(target)
            batch[i * num_skips + j] = buffer[skip_window]
            labels[i * num_skips + j, 0] = buffer[target]
        buffer.append(data[data_index])
        data_index = (data_index + 1) % len(data)
    return batch, labels

print('data:', [reverse_dictionary[di] for di in data[:8]])

for num_skips, skip_window in [(2, 1), (4, 2)]:
    data_index = 0
    batch, labels = generate_batch(batch_size=8, num_skips=num_skips, skip_window=skip_window)
    print('\nwith num_skips = %d and skip_window = %d:' % (num_skips, skip_window))
    print('    batch:', [reverse_dictionary[bi] for bi in batch])
    print('    labels:', [reverse_dictionary[li] for li in labels.reshape(8)])

batch_size = 128
embedding_size = 128  # Dimension of the embedding vector.
skip_window = 1  # How many words to consider left and right.
num_skips = 2  # How many times to reuse an input to generate a label.
# We pick a random validation set to sample nearest neighbors. here we limit the
# validation samples to the words that have a low numeric ID, which by
# construction are also the most frequent.
valid_size = 16  # Random set of words to evaluate similarity on.
valid_window = 100  # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(range(valid_window), valid_size))
num_sampled = 64  # Number of negative examples to sample.

graph = tf.Graph()

with graph.as_default(), tf.device('/cpu:0'):
    # Input data.
    train_dataset = tf.placeholder(tf.int32, shape=[batch_size])
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
    valid_dataset = tf.constant(valid_examples, dtype=tf.int32)

    # Variables.
    embeddings = tf.Variable(
        tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
    softmax_weights = tf.Variable(
        tf.truncated_normal([vocabulary_size, embedding_size],
                            stddev=1.0 / math.sqrt(embedding_size)))
    softmax_biases = tf.Variable(tf.zeros([vocabulary_size]))

    # Model.
    # Look up embeddings for inputs.
    embed = tf.nn.embedding_lookup(embeddings, train_dataset)
    # Compute the softmax loss, using a sample of the negative labels each time.
    loss = tf.reduce_mean(
        tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
                                   train_labels, num_sampled, vocabulary_size))

    # Optimizer.
    # Note: The optimizer will optimize the softmax_weights AND the embeddings.
    # This is because the embeddings are defined as a variable quantity and the
    # optimizer's `minimize` method will by default modify all variable quantities
    # that contribute to the tensor it is passed.
    # See docs on `tf.train.Optimizer.minimize()` for more details.
    optimizer = tf.train.AdagradOptimizer(1.0).minimize(loss)

    # Compute the similarity between minibatch examples and all embeddings.
    # We use the cosine distance:
    norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
    normalized_embeddings = embeddings / norm
    valid_embeddings = tf.nn.embedding_lookup(
        normalized_embeddings, valid_dataset)
    similarity = tf.matmul(valid_embeddings, tf.transpose(normalized_embeddings))

1 个答案:

答案 0 :(得分:1)

我有同样的问题,看起来传递给损失函数的两个参数被交换。 如果你看一下'sample_softmax_loss'(https://www.tensorflow.org/api_docs/python/tf/nn/sampled_softmax_loss)的张量流描述:

sampled_softmax_loss(
    weights,
    biases,
    labels,
    inputs,
    num_sampled,
    num_classes,
    num_true=1,
    sampled_values=None,
    remove_accidental_hits=True,
    partition_strategy='mod',
    name='sampled_softmax_loss'
)

第三个预期参数是'标签'和第四个'输入'。在提供的代码中,这两个参数似乎已被切换。我有点疑惑这是怎么回事。也许这在旧版TF中有所不同。无论如何,交换这两个参数将解决问题。