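I am new to data science and am exploring ELMo embeddings. When calling the elmo module, I use the signature="tokens" option. However, when the input string tensor I pass to ELMo has more than two elements, I get an error about tensor shapes.
I am using TensorFlow 1.13.1. I tried changing the input, but it only works when the input has exactly 2 elements. I cannot figure out where the shapes mismatch. Can someone offer a solution to this error and also explain why the "tokens" signature behaves this way?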
import tensorflow_hub as hub
import tensorflow as tf
import numpy as np

elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)
tokens_input = [["the", "cat", "is", "on", "the", "mat"], ["hello"],
                ["dogs", "are", "in", "the", "fog", ""],
                ["happiness", "cannot", "be", "bought", "by", "us"]]
print(tokens_input, type(tokens_input))
tokens_length = len(tokens_input)
print(tokens_length)
# get the maximum length of elements in tokens_input
strlen = len(max(tokens_input, key=len))
print(strlen)
# pad rows shorter than strlen by inserting "null" at the front of the row
for i in range(tokens_length):
    if len(tokens_input[i]) < strlen:
        for j in range(strlen - len(tokens_input[i])):
            tokens_input[i].insert(j, "null")
np_arr = np.array(tokens_input)
print("np_arr details are ", np_arr, np_arr.shape, np_arr.dtype)
tokens_shp = tf.strings.length(np_arr).shape
print("Tokens shape", tokens_shp)
embeddings = elmo(inputs={
                      "tokens": tokens_input,
                      "sequence_len": tokens_shp
                  },
                  signature="tokens",
                  as_dict=True)["elmo"]
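The input to the elmo call in my code is tokens_input. By appending padding strings to the shorter rows I can get tokens_input into the correct shape ((4, 6) in this case), but when I run the code above I get the following error: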
InvalidArgumentError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py in import_graph_def(graph_def, input_map, return_elements, name, op_dict, producer_op_list)
425 results = c_api.TF_GraphImportGraphDefWithResults(
--> 426 graph._c_graph, serialized, options) # pylint: disable=protected-access
427 results = c_api_util.ScopedTFImportGraphDefResults(results)
InvalidArgumentError: Dimension 0 in both shapes must be equal, but are 2 and 4. Shapes are [2] and [4]. for 'module_apply_tokens/bilm/RNN_0/RNN/MultiRNNCell/Cell0/rnn/while/Select' (op: 'Select') with input shapes: [2], [4,512], [?,512].
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
7 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py in import_graph_def(graph_def, input_map, return_elements, name, op_dict, producer_op_list)
428 except errors.InvalidArgumentError as e:
429 # Convert to ValueError for backwards compatibility.
--> 430 raise ValueError(str(e))
431
432 # Create _DefinedFunctions for any imported functions.
ValueError: Dimension 0 in both shapes must be equal, but are 2 and 4. Shapes are [2] and [4]. for 'module_apply_tokens/bilm/RNN_0/RNN/MultiRNNCell/Cell0/rnn/while/Select' (op: 'Select') with input shapes: [2], [4,512], [?,512].
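One possible reading of the shapes in the message, offered as a sketch rather than a confirmed fix: the code passes `tokens_shp`, the shape of the padded batch, as `sequence_len`. For a (4, 6) batch that is a vector with 2 entries, which would match the `[2]` in the error, while the batch itself has 4 rows, matching the `[4]`. With exactly two sentences the shape vector also happens to have 2 entries, which may be why that case runs. Assuming `sequence_len` is meant to be a 1-D integer vector holding the true token count of each sentence (one entry per row), the padding and lengths could be built as below. `pad_tokens` is a hypothetical helper, the third sentence is written without its trailing empty string so its true length is 5, and the final `elmo` call is left commented out because it is not verified here.

```python
def pad_tokens(sentences, pad=""):
    """Right-pad each token list to the longest one.

    Returns (padded, lengths), where lengths holds the token count of
    each sentence before padding. Hypothetical helper for illustration.
    """
    lengths = [len(s) for s in sentences]
    max_len = max(lengths)
    padded = [list(s) + [pad] * (max_len - len(s)) for s in sentences]
    return padded, lengths


sentences = [
    ["the", "cat", "is", "on", "the", "mat"],
    ["hello"],
    ["dogs", "are", "in", "the", "fog"],
    ["happiness", "cannot", "be", "bought", "by", "us"],
]
padded, lengths = pad_tokens(sentences)

# What the original code passed as sequence_len: the batch shape (2 entries).
batch_shape = (len(padded), len(padded[0]))
print(batch_shape)  # (4, 6)

# What is assumed to be expected instead: one length per sentence (4 entries).
print(lengths)  # [6, 1, 5, 6]

# Assumed call, not executed here (needs TF 1.x and tensorflow_hub):
# embeddings = elmo(inputs={"tokens": padded, "sequence_len": lengths},
#                   signature="tokens", as_dict=True)["elmo"]
```

Padding on the right keeps the real tokens at the start of each row, so a per-row length is enough to tell the model which positions to ignore.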