Fitting a Keras Sequential model raises ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray)

Time: 2020-07-04 13:14:01

Tags: python arrays numpy tensorflow keras

I have the following array of lists (the actors of each movie):

partial_x_train_actors=array([list([b'victor mclaglen', b'jon hall', b'frances farmer', b'olympe bradna', b'gene lockhart', b'douglass dumbrille', b'francis ford', b'ben welden', b'abner biberman', b'pedro de cordoba', b'rudy robles', b'bobby stone', b'nellie duran', b'james flavin', b'nina campana']),
       list([b'jessica biel', b'ben barnes', b'kristin scott thomas', b'colin firth', b'kimberley nixon', b'katherine parkinson', b'kris marshall', b'christian brassington', b'charlotte riley', b'jim mcmanus', b'pip torrens', b'jeremy hooton', b'joanna bacon', b'maggie hickey', b'georgie glen']),
       list([b'gr\xc3\xa9gori derang\xc3\xa8re', b'anouk grinberg', b'aur\xc3\xa9lien recoing', b'niels arestrup', b'yann collette', b'laure duthilleul', b'david assaraf', b'pascal demolon', b'jean-baptiste iera', b'richard sammel', b'vincent crouzet', b'fred epaud', b'pascal elso', b'nicolas giraud', b'micha\xc3\xabl abiteboul']),
       ...,
       list([b'jason schwartzman', b'mickey rourke', b'brittany murphy', b'john leguizamo', b'patrick fugit', b'mena suvari', b'chloe hunter', b'elisa bocanegra', b'julia mendoza', b'china chow', b'nicholas gonzalez', b'debbie harry', b'josh peck', b'charlotte ayanna', b'eric roberts']),
       list([b'fred kirschenmann', b'daniel salatin', b'joel salatin', b'paul willis', b'chuck wirtz']),
       list([b'jan sebastian', b'tray loren', b'paul muzzcat', b'brad koepenick', b'jerry armstrong', b'ben sebastian', b'reyn hubbard', b'levita gros', b'betty flemming', b'randolph parro', b'susan serigny', b'keith gros', b'rocky dugas', b'sid larrwiere', b'jocelyn boudreaux'])],
      dtype=object)

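Since I want to use it as input to a Keras model, I have to convert the array of lists into an array of arrays. To do so, I run the code below, taken from this SO question: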

partial_x_train_actors_array=[]

for i in range(len(partial_x_train_actors)):
    
    partial_x_train_actors_array.append(np.array(list(x for x in partial_x_train_actors[i])))

partial_x_train_actors_array = np.asarray(partial_x_train_actors_array)
type(partial_x_train_actors_array[0])

Now I get:

array([array([b'victor mclaglen', b'jon hall', b'frances farmer',
       b'olympe bradna', b'gene lockhart', b'douglass dumbrille',
       b'francis ford', b'ben welden', b'abner biberman',
       b'pedro de cordoba', b'rudy robles', b'bobby stone',
       b'nellie duran', b'james flavin', b'nina campana'], dtype='|S18'),
       array([b'jessica biel', b'ben barnes', b'kristin scott thomas',
       b'colin firth', b'kimberley nixon', b'katherine parkinson',
       b'kris marshall', b'christian brassington', b'charlotte riley',
       b'jim mcmanus', b'pip torrens', b'jeremy hooton', b'joanna bacon',
       b'maggie hickey', b'georgie glen'], dtype='|S21'),
       array([b'gr\xc3\xa9gori derang\xc3\xa8re', b'anouk grinberg',
       b'aur\xc3\xa9lien recoing', b'niels arestrup', b'yann collette',
       b'laure duthilleul', b'david assaraf', b'pascal demolon',
       b'jean-baptiste iera', b'richard sammel', b'vincent crouzet',
       b'fred epaud', b'pascal elso', b'nicolas giraud',
       b'micha\xc3\xabl abiteboul'], dtype='|S19'),
       ...,
       array([b'jason schwartzman', b'mickey rourke', b'brittany murphy',
       b'john leguizamo', b'patrick fugit', b'mena suvari',
       b'chloe hunter', b'elisa bocanegra', b'julia mendoza',
       b'china chow', b'nicholas gonzalez', b'debbie harry', b'josh peck',
       b'charlotte ayanna', b'eric roberts'], dtype='|S17'),
       array([b'fred kirschenmann', b'daniel salatin', b'joel salatin',
       b'paul willis', b'chuck wirtz'], dtype='|S17'),
       array([b'jan sebastian', b'tray loren', b'paul muzzcat',
       b'brad koepenick', b'jerry armstrong', b'ben sebastian',
       b'reyn hubbard', b'levita gros', b'betty flemming',
       b'randolph parro', b'susan serigny', b'keith gros', b'rocky dugas',
       b'sid larrwiere', b'jocelyn boudreaux'], dtype='|S17')],
      dtype=object)

But this is still not enough to fix the type of the input Tensor, because I get this error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

My model fitting procedure:

# import the pre-trained model
model = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(model, output_shape=[20], input_shape=[], dtype=tf.string, trainable=True)

# create the neural network structure
model = tf.keras.Sequential(name="English_Google_News_130GB_witout_OOV_tokens")
model.add(hub_layer)
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(i, kernel_regularizer=regularizers.l2(neural_network_parameters['l2_regularization']),
                                        activation=neural_network_parameters['dense_activation']))
model.add(tf.keras.layers.Dropout(neural_network_parameters['dropout_rate']))
model.add(tf.keras.layers.Dense(y_val.shape[1],
                                activation=neural_network_parameters['output_activation']))
        
#model.name("English Google News 130GB witout OOV tokens")
print(model.summary())
        
#instantiate Optimizer
optimizer = optimizer_adam_v2(len(partial_x_train_actors_array), validation_split_ratio, i)

model.compile(optimizer=optimizer,
              loss=neural_network_parameters['model_loss'],
              metrics=[neural_network_parameters['model_metric']])

plot_model(model, to_file=os.path.join(os.getcwd(), 'model_three\\network_structure_english_google_news_without_OOV_model_{0}.png'.format(version_data_control)))

history = model.fit([partial_x_train_features, partial_x_train_plot, partial_x_train_actors_array, partial_x_train_reviews],
                        partial_y_train,
                        steps_per_epoch=int(np.ceil((len(partial_x_train_actors_array)*0.8)//16)),
                        epochs=100,
                        batch_size=16,
                        validation_split=0.2,
                        verbose=0,
                        callbacks=callback("english_google_news_without_oovtokens", model))

[EDIT]-04.07.2020

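I would like to add that, for another experiment, I had already padded the sequences, so the list of actors shown above was converted to the array below: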

partial_x_train_actors=array([[ 2024,  3228,   451, ..., 18119,     0,     0],
       [ 3230,  7889, 12357, ...,     0,     0,     0],
       [20001, 20001, 20001, ...,     0,     0,     0],
       ...,
       [ 6887, 20001, 15352, ..., 20001, 20001, 20001],
       [10206, 20001,  3426, ..., 20001,     0,     0],
       [ 2969,  5903,   447, ...,     0,     0,     0]])

However, when I pass this array to the .fit() of the neural network, I get the following error:

ValueError: Error when checking input: expected keras_layer_4_input to have 1 dimensions, but got array with shape (39192, 17)

(39192, 17) is the shape of the actors array.

[EDIT 2]-05.07.2020

Trial 1 (failed)

Based on some suggestions from the provided answer, I tried changing the input shape of hub.KerasLayer:

hub_layer = hub.KerasLayer(model, output_shape=[20], input_shape=[len(y_train)], dtype=tf.string, trainable=True)

I set it equal to the length of my training input for each of actors, plot, features, and reviews, i.e. 39192 samples.

This produced an error (shown as a screenshot in the original post). From the error, I guess that input_shape should be []?

Trial 2 (failed)

#list of actors (training data) tensors
actors_training_tensors=np.array([tf.convert_to_tensor(partial_x_train_actors[i]) for i in range(len(partial_x_train_actors))])
actors_testing_tensors=np.array([tf.convert_to_tensor(x_val_actors[i]) for i in range(len(x_val_actors))])

Again an error:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type tensorflow.python.framework.ops.EagerTensor).

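I converted the actors input lists to tensors. Note that only the actors lists are a problem, because they are stored as names inside a list of lists, [[name1, name2, name3]]. I have no problem with the plot, features, or reviews inputs, because they are stored as lists of corpus strings.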

Trial 3 (failed)

Based on the comments, I also tried the tf.data API:

data_tf=tf.data.Dataset.from_tensor_slices([partial_x_train_features, partial_x_train_plot, partial_x_train_actors_array, partial_x_train_reviews])

Again an error:

ValueError: Can't convert Python sequence with mixed types to Tensor.

So I searched, found this question and the documentation, and made the following change (adding tf.constant):

data_tf=tf.data.Dataset.from_tensor_slices([tf.constant(partial_x_train_features), tf.constant(partial_x_train_plot), tf.constant(partial_x_train_actors_array), tf.constant(partial_x_train_reviews)])

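Moreover, it seems that I cannot convert a NumPy array of strings to a tensor of floats. Perhaps padding the sequences plays an important role here. However, if you follow this link to the TensorFlow article I based my code on, you will notice that the user provides only byte strings as input, not padded sequences.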

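Note that all of these solutions simply flatten the actors list with the " ".join() command. The actors then become one text of names rather than individual names. Even if that works, I believe the actors should be kept as individual names for better results, because the neural network cannot tell the names apart by itself.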

[Input data for debugging - problem reproduction]

In case someone wants to reproduce and debug the problem, below I present my 4 input layers (data samples) and the article from TensorFlow that I followed.

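Here is the GitHub link to the issue I posted. It seems that when I run the code locally, everything looks fine apart from the EarlyStopping error that appears in the linked GitHub issue. I will re-check the data I use, because the data provided in the GitHub link is the correct data to use.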

1 answer:

Answer 0: (score: 3)

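You get this error when trying to convert a numpy.ndarray to a Tensor. In short, your arrays have different lengths, so the outer array cannot be converted to a Tensor.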

What you need to do is make every x the same length and every y the same length.

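There are several ways to achieve this. Based on the code you provided, you could use something like the following. It is only a sketch, meant to illustrate that you need arrays of equal length.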

partial_x_train_actors_array = []
for i in range(len(partial_x_train_actors)):
    # For example, keep only the first 5 actors of each movie; change the cutoff as needed.
    partial_x_train_actors_array.append(np.array(partial_x_train_actors[i][:5]))
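Slicing only shortens long rows; a movie with fewer actors than the cutoff would still leave the array ragged. A minimal sketch of truncating and padding in plain NumPy follows. The helper name pad_or_truncate, the cutoff of 4, the toy data, and the empty byte string used as padding are all illustrative assumptions, not part of the original code.

```python
import numpy as np

def pad_or_truncate(rows, length, pad_value=b""):
    """Return a 2-D array in which every row has exactly `length` items."""
    fixed = []
    for row in rows:
        row = list(row)[:length]                        # drop items beyond the cutoff
        row = row + [pad_value] * (length - len(row))   # fill short rows with padding
        fixed.append(row)
    return np.array(fixed)

# Toy data standing in for partial_x_train_actors (ragged on purpose).
actors = [
    [b"victor mclaglen", b"jon hall", b"frances farmer"],
    [b"fred kirschenmann", b"daniel salatin", b"joel salatin", b"paul willis", b"chuck wirtz"],
    [b"jessica biel"],
]

actors_fixed = pad_or_truncate(actors, length=4)
print(actors_fixed.shape)   # (3, 4)
print(actors_fixed.dtype)   # a fixed-width bytes dtype instead of object
```

Because every row now has the same length, NumPy builds a regular 2-D array rather than an object array of arrays, which is the shape of data that can be converted to a Tensor.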

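Another approach is to use the tf.data API: convert your data to a tf.data.Dataset with a generator, then use tf.data.Dataset.padded_batch to pad each batch so that its elements have equal length. Here is the API link.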
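The generator-plus-padded_batch route can be sketched as follows. This is a minimal illustration, assuming TensorFlow 2.4 or later (for the output_signature argument); the toy actors list and the name actor_generator are hypothetical stand-ins for the real data.

```python
import tensorflow as tf

# Toy ragged data standing in for the per-movie actor lists.
actors = [
    [b"victor mclaglen", b"jon hall", b"frances farmer"],
    [b"jessica biel"],
    [b"fred kirschenmann", b"daniel salatin"],
]

def actor_generator():
    # Yield one variable-length list of names per movie.
    for names in actors:
        yield names

dataset = tf.data.Dataset.from_generator(
    actor_generator,
    output_signature=tf.TensorSpec(shape=(None,), dtype=tf.string),
)

# Pad every element of a batch to the length of the longest element in that batch.
# String tensors are padded with the empty string by default.
batched = dataset.padded_batch(3, padded_shapes=[None])

first_batch = next(iter(batched))
print(first_batch.shape)   # (3, 3)
```

Padding happens per batch, so different batches may end up with different widths; a fixed width can be requested by passing a concrete size in padded_shapes instead of None.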

[After the question was edited] The second problem, with the array shape, occurs because you have hard-coded the input shape as [].

hub_layer = hub.KerasLayer(model, output_shape=[20], input_shape=[], dtype=tf.string, trainable=True)

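Because of this, you get the error saying that the input layer expected 1 dimension but received an array of shape (39192, 17). In model.fit() you pass x as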

[partial_x_train_features, partial_x_train_plot, partial_x_train_actors_array, partial_x_train_reviews]

I suggest that you set input_shape according to your dataset instead of using [].
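For context, the error message says the layer expects a one-dimensional batch, that is, one string per sample. An alternative that leaves input_shape=[] untouched is to collapse each row of names into a single byte string, which is roughly what the generator in the updated code further down does. A minimal NumPy sketch, with toy data and the names actors and actors_joined as assumptions:

```python
import numpy as np

# Toy data standing in for the per-movie actor lists.
actors = [
    [b"victor mclaglen", b"jon hall"],
    [b"jessica biel", b"ben barnes", b"colin firth"],
]

# One byte string per movie: the result is 1-D no matter how many actors each movie has.
actors_joined = np.array([b" ".join(names) for names in actors])

print(actors_joined.ndim)    # 1
print(actors_joined.shape)   # (2,)
```

The trade-off is the one raised in the question: after joining, the boundaries between individual names are no longer marked in the text.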

If you still run into problems, please post a GitHub link so that I can debug it and see the actual problem.

[05/07/2020]-Update

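I have debugged your code, made some changes to your input data, and got it working. I used the tf.data.Dataset.from_generator API to combine your data. I changed the loss function and the optimizer so that I could debug; you can change them back as needed. Also, make sure that the inputs partial_x_train_reviews, partial_x_train_plot, and partial_x_train_features look like the ones in the code below. If you want to keep your old format, change the def generator(): method accordingly. Let me know how it goes. If your problem is solved, I would suggest that next time you provide code that can be debugged easily and does not need many changes to get working. I hope the answer helps.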

import tensorflow as tf
import tensorflow_hub as hub

# Train variables
partial_x_train_features = [
    [b'south pago pago victor mclaglen jon hall frances farmer olympe bradna gene lockhart douglass dumbrille francis ford ben welden abner biberman pedro cordoba rudy robles bobby stone nellie duran james flavin nina campana alfred e green treasure hunt adventure adventure'],
    [b'easy virtue jessica biel ben barnes kristin scott thomas colin firth kimberley nixon katherine parkinson kris marshall christian brassington charlotte riley jim mcmanus pip torrens jeremy hooton joanna bacon maggie hickey georgie glen stephan elliott young englishman marry glamorous american brings home meet parent arrive like blast future blow entrenched british stuffiness window comedy romance'],
    [b'fragments antonin gregori derangere anouk grinberg aurelien recoing niels arestrup yann collette laure duthilleul david assaraf pascal demolon jean baptiste iera richard sammel vincent crouzet fred epaud pascal elso nicolas giraud michael abiteboul gabriel le bomin psychiatrist probe mind traumatized soldier attempt unlock secret drove gentle deeply disturbed world war veteran edge insanity drama war'],
    [b'milka film taboos milka elokuva tabuista irma huntus leena suomu matti turunen eikka lehtonen esa niemela sirkka metsasaari tauno lehtihalmes ulla tapaninen toivo tuomainen hellin auvinen salmi rauni mollberg small finnish lapland community milka innocent year old girl live mother miss dead father prays god love haymaking employ drama'],
    [b'sleeping car david naughton judie aronson kevin mccarthy jeff conaway dani minnick ernestine mercer john carl buechler gary brockette steve lundquist billy stevenson michael scott bicknell david coburn nicole hansen tiffany million robert ruth douglas curtis jason david naughton move abandon train car resurrect vicious ghost landlady dead husband mister near fatal encounter comedy horror']]

partial_x_train_plot = [[b'treasure hunt adventure'],
                        [b'young englishman marry glamorous american brings home meet parent arrive like blast future blow entrenched british stuffiness window'],
                        [b'psychiatrist probe mind traumatized soldier attempt unlock secret drove gentle deeply disturbed world war veteran edge insanity'],
                        [b'small finnish lapland community milka innocent year old girl live mother miss dead father prays god love haymaking employ'],
                        [b'jason david naughton move abandon train car resurrect vicious ghost landlady dead husband mister near fatal encounter']]

partial_x_train_actors_array = [[b'victor mclaglen', b'jon hall', b'frances farmer',
                                 b'olympe bradna', b'gene lockhart', b'douglass dumbrille',
                                 b'francis ford', b'ben welden', b'abner biberman',
                                 b'pedro de cordoba', b'rudy robles', b'bobby stone',
                                 b'nellie duran', b'james flavin', b'nina campana'],
                                [b'jessica biel', b'ben barnes', b'kristin scott thomas',
                                 b'colin firth', b'kimberley nixon', b'katherine parkinson',
                                 b'kris marshall', b'christian brassington', b'charlotte riley',
                                 b'jim mcmanus', b'pip torrens', b'jeremy hooton', b'joanna bacon',
                                 b'maggie hickey', b'georgie glen'],
                                [b'gregori derangere', b'anouk grinberg', b'aurelien recoing',
                                 b'niels arestrup', b'yann collette', b'laure duthilleul',
                                 b'david assaraf', b'pascal demolon', b'jean-baptiste iera',
                                 b'richard sammel', b'vincent crouzet', b'fred epaud',
                                 b'pascal elso', b'nicolas giraud', b'michael abiteboul'],
                                [b'irma huntus', b'leena suomu', b'matti turunen',
                                 b'eikka lehtonen', b'esa niemela', b'sirkka metsasaari',
                                 b'tauno lehtihalmes', b'ulla tapaninen', b'toivo tuomainen',
                                 b'hellin auvinen-salmi'],
                                [b'david naughton', b'judie aronson', b'kevin mccarthy',
                                 b'jeff conaway', b'dani minnick', b'ernestine mercer',
                                 b'john carl buechler', b'gary brockette', b'steve lundquist',
                                 b'billy stevenson', b'michael scott-bicknell', b'david coburn',
                                 b'nicole hansen', b'tiffany million', b'robert ruth']]

partial_x_train_reviews = [
    [b'edward small take director alfred e green cast crew uncommonly attractive brilliant assemblage south sea majority curiously undersung piece location far stylize date goldwyn hurricane admittedly riddle cliche formula package visual technical excellence scarcely matter scene stop heart chiseled adonis jon hall porcelain idol frances farmer outline profile s steam background volcano romantic closeup level defies comparison edward small film typically string frame individual work art say outdid do workhorse composer edward ward song score year prior work universal stun phantom opera'],
    [b'jessica biel probably best know virtuous good girl preacher kid mary camden heaven get tackle classic noel coward role early play easy virtue american interloper english aristocratic family unsettle family matriarch kristin scott thomas noel coward write upper class twit pretension wit keep come kind adopt way adopt oscar wilde george bernard shaw kid grow poverty way talent entertain upper class take coward heart felt modern progressive generally term social trend whittakers easy virtue kind aristocrat anybody like hang party invite noel entertain amelia earhart aviation jessica biel character auto race young widow detroit area course area motor car auto race fresh win monte carlo win young ben barnes heir whittaker estates lot land debt barnes bring biel home family mortify classless american way sense recognize class distinction thing get rid title nobility aristocrats story scott thomas dominate family try desperately estate husband colin firth serve world war horror do probably horror trench war slaughter fact class distinction tend melt combat biel kind like wife rule whittaker roost scandal past threatens disrupt barnes biel marriage form crux story turn fact end really viewer figure eventually happen second film adaption easy virtue silent film direct young alfred hitchcock easy virtue actually premier america london star great american stage actress jane cowl guess coward figure american heroine best american theatergoer british one version easy virtue direct flawlessly stephen elliot fine use period music noel coward cole porter end credit really mock upper class coward tradition play going gets tough tough going believe elliott try say class especially one right stuff course obligatory fox hunt upper class indulge oscar wilde say unspeakable uneatable chance younger generation expose noel coward worth see'],
    [b'saw night eurocine event movie european country show day european city hear le bomin barely hear derangere la chambre des officiers fortunately surprise discover great talent unknown large audience derangere absolutely astonish play character antonin verset victim post wwi trauma live trouble scene endure month war cast excellent great work cinematography offer really nice shot great landscape stun face edit really subtile bit memory make sense story minute movie show real chill ww archive action flick like sensitive psychologic movie really think absolutely recommend les fragments d antonin let le bomin'],
    [b'rauni mollberg earth sinful song favorite foreign film establish director major talent film festival circuit get amazing followup milka base work novelist timo mukka till worthy major dvd exposure unlike kaurismaki bros follow double handedly create tongue cheek deadpan finnish film style fan world mollberg commit naturalistic approach film overflow nature life lust earthiness find scandi cinema mainly work famous talent swede vilgot sjoman curious yellow fame director film tabu title imply mollberg effort quite effective sidestep fully treat screen theme incest making adult character father figure real blood relate daddy applies usual merely step father gimmick use countless time american movie incest work matti turunen kristus perkele translate christ devil really common law step dad underage milka beautiful offbeat fashion young girl portray shot irma huntus bring screen sexiness bergman harriet andersson decade earlier create international success summer monika sawdust tinsel imagine actress milka role shame do pursue act career afterward completing strong line leena suomu earth mother type confines act narrow emotional range prove solid rock crucial role bookended spectacularly beautiful shot birch wood winter virtually black white visually color presence milka film quickly develop nature theme presence strange click beak bird talisman early scene milka handyman turunen frolicking naked lake emerge oh natural sex play year old milka man result tastefully shoot intimacy imply ejaculation set trouble come religious aspect remote farm community heavily stress especially enjoy motif spiritual guidance cantor malmstrom quality anti stereotypical play eikka lehtonen instead rigid cruel turn care milka illegitimate baby bear strong romance turunen stud continue service mom woman neighborhood present utterly natural viewer position watch ethnographic exercise moralistic tale powerful technique milka frequently speak directly camera viewer forceful monologue bear crisp sound record sound nature include rain constant motif make milka engross experience view film subtitle knowledge finnish lapp recall best silent era classic direction strong convey dramatic content theme way transcend language kudos mollberg talented cinematographer job work remain obscurity ripe rediscovery'],
    [b'wonder horror film write woody allen wannabe come like check imaginatively direct typical enjoyable haunt place premise solid makeup effect good job major flaw dialogue overload cheeky wisecrack witticisms sample want scary shopping ex wife hit mark deliver inappropriate moment hero battle evil ghost']]

partial_y_train = [[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
                   [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0],
                   [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   [0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]]  # multilabel classification



# Using generator for creating the dataset.
def generator():
    for i in range(0, len(partial_y_train)):
        # creates x's and y's for the dataset.
        # Join the four text inputs with spaces so that words from adjacent fields do not run together.
        yield b' '.join((partial_x_train_features[i] + partial_x_train_plot[i] + partial_x_train_actors_array[i] +
             partial_x_train_reviews[i])), partial_y_train[i]



dataset = tf.data.Dataset.from_generator(generator, (tf.string, tf.int64),
                                         (tf.TensorShape(None), tf.TensorShape([17])))

dataset = dataset.batch(1)

for i, j in dataset.take(5):
    print(i)
    print(j)

# import the pre-trained model
model = "https://tfhub.dev/google/tf2-preview/gnews-swivel-20dim/1"
hub_layer = hub.KerasLayer(model, output_shape=[20], input_shape=[], dtype=tf.string, trainable=True)

# create the neural network structure
model = tf.keras.Sequential(name="English_Google_News_130GB_witout_OOV_tokens")
model.add(hub_layer)
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(20, activation='relu'))
model.add(tf.keras.layers.Dense(17, activation='softmax'))

# model.name("English Google News 130GB witout OOV tokens")
print(model.summary())

model.compile(optimizer='adam',
              loss=tf.keras.losses.CategoricalCrossentropy(),  # the output layer already applies softmax, so from_logits stays False
              metrics=['accuracy'])

history = model.fit(
    dataset,
    epochs=10,
    batch_size=1)