TensorFlow version: 1.2.1
Python version: 3.5
Operating system: Windows 10
Another poster asked the same question on StackOverflow here, and he appears to be working from the same Udacity Word2Vec tutorial code. Maybe I'm being dense, but the code in that example is so busy and convoluted that I can't tell what actually fixed his problem.
The error occurs when I call tf.reduce_mean:
loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
                               train_labels, num_sampled, vocabulary_size))
Just before the call to tf.reduce_mean, the key variables have the following data types:
train_dataset.dtype
>> tf.int32
train_labels.dtype
>> tf.int32
valid_dataset.dtype
>> tf.int32
embeddings.dtype
>> tf.float32_ref
softmax_weights.dtype
>> tf.float32_ref
softmax_biases.dtype
>> tf.float32_ref
embed.dtype
>> tf.float32
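A minimal sketch of how such a dump can be produced (using the names defined in the full program below; TF 1.x):

# Print each tensor's dtype; .dtype works on both Tensors and Variables.
for name, t in [('train_dataset', train_dataset), ('train_labels', train_labels),
                ('valid_dataset', valid_dataset), ('embeddings', embeddings),
                ('softmax_weights', softmax_weights), ('softmax_biases', softmax_biases),
                ('embed', embed)]:
    print('%s.dtype >> %s' % (name, t.dtype))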
I tried every permutation of data types in the definitions of train_dataset, train_labels, and valid_dataset: all int64, all float32, all float64, and mixtures of integer and float. Nothing worked. I did not try changing the data types of softmax_weights and softmax_biases, because I worried that might break the optimization algorithm. Don't those need to be floats to support the calculus during backpropagation? (TensorFlow can be a very opaque black box with unhelpful documentation, so I can suspect things but never be sure.)
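A quick experiment suggests they could not be made integer anyway: the initializer itself rejects non-float dtypes (a sketch; the exact exception type and wording vary by TF version):

import tensorflow as tf

# tf.truncated_normal only allows floating-point dtypes, so an integer
# softmax_weights would fail before backpropagation even enters the picture.
try:
    tf.truncated_normal([2, 2], dtype=tf.int32)
except (TypeError, ValueError) as e:
    print(e)  # complains that int32 is not in the list of allowed values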
Program flow at the time of the error:
After the call to reduce_mean, program control transfers to sampled_softmax_loss() in the file nn_impl.py, which in turn calls _compute_sampled_logits():
logits, labels = _compute_sampled_logits(
    weights=weights,
    biases=biases,
    labels=labels,
    inputs=inputs,
    num_sampled=num_sampled,
    num_classes=num_classes,
    num_true=num_true,
    sampled_values=sampled_values,
    subtract_log_q=True,
    remove_accidental_hits=remove_accidental_hits,
    partition_strategy=partition_strategy,
    name=name)
At this point I inspected the data types of the incoming parameters and got the following:
weights.dtype
>> tf.float32_ref
biases.dtype
>> tf.float32_ref
labels.dtype
>> tf.float32
inputs.dtype
>> tf.int32
When the exception occurs on the next step, I get thrown into the StreamWrapper class in the file ansitowin32.py. Running through to the end, I get the following traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
489 as_ref=input_arg.is_ref,
--> 490 preferred_dtype=default_dtype)
491 except TypeError as err:
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in internal_convert_to_tensor(value, dtype, name, as_ref, preferred_dtype)
740 if ret is None:
--> 741 ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
742
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\ops.py in _TensorTensorConversionFunction(t, dtype, name, as_ref)
613 "Tensor conversion requested dtype %s for Tensor with dtype %s: %r"
--> 614 % (dtype.name, t.dtype.name, str(t)))
615 return t
ValueError: Tensor conversion requested dtype int32 for Tensor with dtype float32: 'Tensor("sampled_softmax_loss/Reshape_1:0", shape=(?, 1, ?), dtype=float32, device=/device:CPU:0)'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-7-66d378b94a16> in <module>()
34 loss = tf.reduce_mean(
35 tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
---> 36 train_labels, num_sampled, vocabulary_size))
37
38 # Optimizer.
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in sampled_softmax_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name)
1266 remove_accidental_hits=remove_accidental_hits,
1267 partition_strategy=partition_strategy,
-> 1268 name=name)
1269 sampled_losses = nn_ops.softmax_cross_entropy_with_logits(labels=labels,
1270 logits=logits)
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name)
1005 row_wise_dots = math_ops.multiply(
1006 array_ops.expand_dims(inputs, 1),
-> 1007 array_ops.reshape(true_w, new_true_w_shape))
1008 # We want the row-wise dot plus biases which yields a
1009 # [batch_size, num_true] tensor of true_logits.
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\math_ops.py in multiply(x, y, name)
284
285 def multiply(x, y, name=None):
--> 286 return gen_math_ops._mul(x, y, name)
287
288
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\ops\gen_math_ops.py in _mul(x, y, name)
1375 A `Tensor`. Has the same type as `x`.
1376 """
-> 1377 result = _op_def_lib.apply_op("Mul", x=x, y=y, name=name)
1378 return result
1379
C:\Anaconda3\envs\aind-dog\lib\site-packages\tensorflow\python\framework\op_def_library.py in apply_op(self, op_type_name, name, **keywords)
524 "%s type %s of argument '%s'." %
525 (prefix, dtypes.as_dtype(attrs[input_arg.type_attr]).name,
--> 526 inferred_from[input_arg.type_attr]))
527
528 types = [values.dtype]
TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type int32 of argument 'x'.
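The final TypeError is simply TensorFlow refusing to multiply tensors of mismatched dtypes, and it reproduces in isolation, independent of word2vec (minimal sketch, TF 1.x):

import tensorflow as tf

# TF 1.x does not auto-promote dtypes: multiplying int32 by float32 fails at
# graph-construction time with the same message as in the traceback above.
a = tf.constant([1, 2, 3], dtype=tf.int32)
b = tf.constant([1.0, 2.0, 3.0], dtype=tf.float32)
try:
    tf.multiply(a, b)
except TypeError as e:
    print(e)  # Input 'y' of 'Mul' Op has type float32 that does not match type int32 ...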
Here is the complete program:
# These are all the modules we'll be using later.
# Make sure you can import them before proceeding further.
# %matplotlib inline
from __future__ import print_function
import collections
import math
import numpy as np
import os
import random
import tensorflow as tf
import zipfile
from matplotlib import pylab
from six.moves import range
from six.moves.urllib.request import urlretrieve
from sklearn.manifold import TSNE
print("Working directory = %s\n" % os.getcwd())
def read_data(filename):
    """Extract the first file enclosed in a zip file as a list of words"""
    with zipfile.ZipFile(filename) as f:
        data = tf.compat.as_str(f.read(f.namelist()[0])).split()
    return data
filename = 'text8.zip'
words = read_data(filename)
print('Data size %d' % len(words))
vocabulary_size = 50000
def build_dataset(words):
    count = [['UNK', -1]]
    count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
    dictionary = dict()
    # Assign each word an integer ID in order of decreasing frequency
    for word, _ in count:
        dictionary[word] = len(dictionary)
    data = list()
    unk_count = 0  # count of unknown words
    for word in words:
        if word in dictionary:
            index = dictionary[word]
        else:
            index = 0  # dictionary['UNK']
            unk_count = unk_count + 1
        data.append(index)
    count[0][1] = unk_count
    reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
    return data, count, dictionary, reverse_dictionary
data, count, dictionary, reverse_dictionary = build_dataset(words)
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10])
del words # Hint to reduce memory.
data_index = 0
def generate_batch(batch_size, num_skips, skip_window):
    global data_index
    assert batch_size % num_skips == 0
    assert num_skips <= 2 * skip_window
    batch = np.ndarray(shape=(batch_size), dtype=np.int32)
    labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
    span = 2 * skip_window + 1  # [ skip_window target skip_window ]
    buffer = collections.deque(maxlen=span)
    for _ in range(span):
        buffer.append(data[data_index])
        data_index = (data_index + 1) % len(data)
    for i in range(batch_size // num_skips):
        target = skip_window  # target label at the center of the buffer
        targets_to_avoid = [skip_window]
        for j in range(num_skips):
            while target in targets_to_avoid:
                target = random.randint(0, span - 1)
            targets_to_avoid.append(target)
            batch[i * num_skips + j] = buffer[skip_window]
            labels[i * num_skips + j, 0] = buffer[target]
        buffer.append(data[data_index])
        data_index = (data_index + 1) % len(data)
    return batch, labels
print('data:', [reverse_dictionary[di] for di in data[:8]])
for num_skips, skip_window in [(2, 1), (4, 2)]:
    data_index = 0
    batch, labels = generate_batch(batch_size=8, num_skips=num_skips, skip_window=skip_window)
    print('\nwith num_skips = %d and skip_window = %d:' % (num_skips, skip_window))
    print('    batch:', [reverse_dictionary[bi] for bi in batch])
    print('    labels:', [reverse_dictionary[li] for li in labels.reshape(8)])
batch_size = 128
embedding_size = 128 # Dimension of the embedding vector.
skip_window = 1 # How many words to consider left and right.
num_skips = 2 # How many times to reuse an input to generate a label.
# We pick a random validation set to sample nearest neighbors. here we limit the
# validation samples to the words that have a low numeric ID, which by
# construction are also the most frequent.
valid_size = 16 # Random set of words to evaluate similarity on.
valid_window = 100 # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(range(valid_window), valid_size))
num_sampled = 64 # Number of negative examples to sample.
graph = tf.Graph()
with graph.as_default(), tf.device('/cpu:0'):
    # Input data.
    train_dataset = tf.placeholder(tf.int32, shape=[batch_size])
    train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
    valid_dataset = tf.constant(valid_examples, dtype=tf.int32)

    # Variables.
    embeddings = tf.Variable(
        tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
    softmax_weights = tf.Variable(
        tf.truncated_normal([vocabulary_size, embedding_size],
                            stddev=1.0 / math.sqrt(embedding_size)))
    softmax_biases = tf.Variable(tf.zeros([vocabulary_size]))

    # Model.
    # Look up embeddings for inputs.
    embed = tf.nn.embedding_lookup(embeddings, train_dataset)
    # Compute the softmax loss, using a sample of the negative labels each time.
    loss = tf.reduce_mean(
        tf.nn.sampled_softmax_loss(softmax_weights, softmax_biases, embed,
                                   train_labels, num_sampled, vocabulary_size))

    # Optimizer.
    # Note: The optimizer will optimize the softmax_weights AND the embeddings.
    # This is because the embeddings are defined as a variable quantity and the
    # optimizer's `minimize` method will by default modify all variable quantities
    # that contribute to the tensor it is passed.
    # See docs on `tf.train.Optimizer.minimize()` for more details.
    optimizer = tf.train.AdagradOptimizer(1.0).minimize(loss)

    # Compute the similarity between minibatch examples and all embeddings.
    # We use the cosine distance:
    norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
    normalized_embeddings = embeddings / norm
    valid_embeddings = tf.nn.embedding_lookup(
        normalized_embeddings, valid_dataset)
    similarity = tf.matmul(valid_embeddings, tf.transpose(normalized_embeddings))
Answer 0 (score: 1)
I had the same problem, and it looks like the two arguments passed to the loss function are swapped. If you look at the TensorFlow documentation for 'sampled_softmax_loss' (https://www.tensorflow.org/api_docs/python/tf/nn/sampled_softmax_loss):
sampled_softmax_loss(
    weights,
    biases,
    labels,
    inputs,
    num_sampled,
    num_classes,
    num_true=1,
    sampled_values=None,
    remove_accidental_hits=True,
    partition_strategy='mod',
    name='sampled_softmax_loss'
)
The third expected argument is 'labels' and the fourth is 'inputs'. In the code provided, these two arguments have been switched. I'm a bit puzzled how that came about; perhaps the signature was different in an older version of TF. In any case, swapping those two arguments fixes the problem.
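Concretely, the call from the question becomes the following; keyword arguments make the intended order explicit and guard against future signature changes:

loss = tf.reduce_mean(
    tf.nn.sampled_softmax_loss(weights=softmax_weights,
                               biases=softmax_biases,
                               labels=train_labels,
                               inputs=embed,
                               num_sampled=num_sampled,
                               num_classes=vocabulary_size))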