AttributeError: 'numpy.ndarray' object has no attribute 'lower' in tockenizer_left.texts_to_sequences(x_left)

Asked: 2019-07-19 10:47:43

Tags: python tensorflow deep-learning lstm tokenize

I have tried to split the data into tokens. All of the data is lowercase. I want to build a Manhattan LSTM model.

I tried adding some parameters to Tokenizer(), for example:

num_words = max_nb_words

filters = '!"#$%&()*+,-./:;<=>?@[\]^_`{|}~'

lower = True


from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

max_nb_words = 50000
max_seq_length = max(max([len(s) for s in x_left]),max([len(s) for s in x_right]))

tockenizer_left = Tokenizer(num_words=max_nb_words, filters='!"#$%&()*+,-./:;<=>?@[\]^_`{|}~', lower=True)
tockenizer_left.fit_on_texts(data_train['Data_Name_left'].values)


x_left_tokens = tockenizer_left.texts_to_sequences(x_left)
x_left_pad = pad_sequences(x_left_tokens, maxlen=max_seq_length)

tockenizer_right = Tokenizer()
tockenizer_right.fit_on_texts(data_train['Data_Name_right'].values)

x_right_tokens = tockenizer_right.texts_to_sequences(x_right)
x_right_pad = pad_sequences(x_right_tokens, maxlen=max_seq_length)

vocab_size = max(len(tockenizer_left.word_index) +1, len(tockenizer_right.word_index) +1)
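To make the padding step concrete without requiring Keras, here is a minimal pure-NumPy sketch of what `pad_sequences(..., maxlen=...)` does by default (pre-padding with zeros and pre-truncating); the function name `pad_like_keras` is hypothetical, introduced only for illustration:

```python
import numpy as np

def pad_like_keras(sequences, maxlen, value=0):
    """Minimal sketch of pad_sequences' default behaviour:
    left-pad each sequence with `value`, left-truncate to maxlen."""
    out = np.full((len(sequences), maxlen), value, dtype=int)
    for i, seq in enumerate(sequences):
        trunc = seq[-maxlen:]                 # keep the last maxlen tokens
        out[i, maxlen - len(trunc):] = trunc  # pre-pad with `value`
    return out

padded = pad_like_keras([[1, 2, 3], [4]], maxlen=4)
# rows: [0, 1, 2, 3] and [0, 0, 0, 4]
```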

I expected sequences of the text.

1 Answer:

Answer 0 (score: 0):

The answer is to pass Tokenizer(lower=False).
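The reason this helps: with `lower=True`, the tokenizer calls `.lower()` on every item it is given, and a NumPy array row (rather than a plain string) has no `lower` attribute. A minimal sketch of the failure mode and an alternative fix, assuming `x_left` is a NumPy array of pre-split token rows (the sample data here is made up for illustration):

```python
import numpy as np

# Hypothetical data resembling x_left: a NumPy array of token rows,
# e.g. as produced by splitting text before tokenization.
x_left = np.array([["Hello", "World"], ["Foo", "Bar"]])

# Tokenizer(lower=True) calls text.lower() on each element; a row that
# is itself an ndarray has no .lower attribute, hence the AttributeError.
row = x_left[0]
print(hasattr(row, "lower"))  # False

# Alternative fix: join each row back into one string per sample,
# so every element passed to the tokenizer really is a str.
texts = [" ".join(map(str, r)) for r in x_left]
print(all(hasattr(t, "lower") for t in texts))  # True
```

Either approach works: `lower=False` skips the `.lower()` call entirely, while converting the rows to strings keeps lowercasing available.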