K-fold cross-validation error with TensorFlow and sklearn

Asked: 2020-07-29 17:55:38

Tags: tensorflow keras cross-validation sklearn-pandas

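I am using the following code for semantic segmentation (images and masks). It works fine with a simple train/test split, but when I try to implement k-fold cross-validation, the code raises an error. Please check my code and let me know what is wrong and how to fix it.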

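import numpy as np
import tensorflow as tf  # needed for tf.data and tf.keras below
from sklearn.model_selection import KFold
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau  # used in the fold loop
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.losses import sparse_categorical_crossentropy
from tensorflow.keras.optimizers import Adam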

def tf_dataset(x, y, batch=8):
    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    dataset = dataset.map(tf_parse)
    dataset = dataset.batch(batch)
    dataset = dataset.repeat()
    return dataset

train_dataset = tf_dataset(train_x, train_y, batch=batch_size)
valid_dataset = tf_dataset(valid_x, valid_y, batch=batch_size)

num_folds = 10

# Define per-fold score containers
acc_per_fold = []
loss_per_fold = []

# Define the K-fold Cross Validator
kfold = KFold(n_splits=num_folds, shuffle=True)

# K-fold Cross Validation model evaluation
fold_no = 1

for train, valid in kfold.split(train_dataset, valid_dataset):
  
  optimizer = tf.keras.optimizers.Adam(lr)
  metrics = ['accuracy']
  model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=metrics)

  callbacks = [ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=4),
              EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=False)]

  train_steps = len(train_x) // batch_size
  valid_steps = len(valid_x) // batch_size

  if len(train_x) % batch_size != 0:
    train_steps += 1
  if len(valid_x) % batch_size != 0:
    valid_steps += 1


  # Generate a print
  print('------------------------------------------------------------------------')
  print(f'Training for fold {fold_no} ...')

  model.fit(train_dataset[train], valid_dataset[train],
            epochs=epochs,
            steps_per_epoch=train_steps,
            validation_steps=valid_steps,
            callbacks=callbacks)

  # Generate generalization metrics
  scores = model.evaluate(train_dataset[valid], valid_dataset[valid], verbose=0)
  print(f'Score for fold {fold_no}: {model.metrics_names[0]} of {scores[0]}; {model.metrics_names[1]} of {scores[1]*100}%')
  acc_per_fold.append(scores[1] * 100)
  loss_per_fold.append(scores[0])

  # Increase fold number
  fold_no = fold_no + 1

# == Provide average scores ==
print('------------------------------------------------------------------------')
print('Score per fold')
for i in range(0, len(acc_per_fold)):
  print('------------------------------------------------------------------------')
  print(f'> Fold {i+1} - Loss: {loss_per_fold[i]} - Accuracy: {acc_per_fold[i]}%')
print('------------------------------------------------------------------------')
print('Average scores for all folds:')
print(f'> Accuracy: {np.mean(acc_per_fold)} (+- {np.std(acc_per_fold)})')
print(f'> Loss: {np.mean(loss_per_fold)}')
print('------------------------------------------------------------------------')

Error

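---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in ()
     12 # K-fold Cross Validation model evaluation
     13 fold_no = 1
---> 14 for train, valid in kfold.split(train_dataset, valid_dataset):
     15 
     16   optimizer = tf.keras.optimizers.Adam(lr)

4 frames
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in _num_samples(x)
    150         if len(x.shape) == 0:
    151             raise TypeError("Singleton array %r cannot be considered"
--> 152                             " a valid collection." % x)
    153         # Check that shape is returning an integer or default to len
    154         # Dask dataframes may not return numeric shape[0] value

TypeError: Singleton array array(, dtype=object) cannot be considered a valid collection.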
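The traceback points at the `kfold.split(train_dataset, valid_dataset)` call. `KFold.split` expects array-like inputs with a length. A `tf.data.Dataset` has none, so sklearn ends up treating it as a single object, which is the "singleton array" in the message. The second argument of `split` is also meant to be the labels `y`, not a validation set, and a `tf.data.Dataset` cannot be indexed with `train_dataset[train]`. A minimal sketch of one possible fix follows, assuming `train_x` and `train_y` are lists of image and mask file paths (the question does not show them). The names `all_x`, `all_y` and `fold_sizes` are hypothetical. The idea is to split the underlying arrays by index and build the datasets inside the loop.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical stand-ins for the image and mask path lists (train_x / train_y).
all_x = np.array([f"images/{i}.png" for i in range(20)])
all_y = np.array([f"masks/{i}.png" for i in range(20)])

kfold = KFold(n_splits=5, shuffle=True, random_state=42)

fold_sizes = []
# split() only needs something with a length; it yields integer index arrays.
for fold_no, (train_idx, valid_idx) in enumerate(kfold.split(all_x), start=1):
    # Index the NumPy arrays, not the tf.data.Dataset objects.
    fold_train_x, fold_train_y = all_x[train_idx], all_y[train_idx]
    fold_valid_x, fold_valid_y = all_x[valid_idx], all_y[valid_idx]
    fold_sizes.append((len(fold_train_x), len(fold_valid_x)))

    # With the question's helper, the per-fold datasets would be built here:
    #   train_dataset = tf_dataset(fold_train_x, fold_train_y, batch=batch_size)
    #   valid_dataset = tf_dataset(fold_valid_x, fold_valid_y, batch=batch_size)
    # and then passed as
    #   model.fit(train_dataset, validation_data=valid_dataset, ...)

print(fold_sizes)
```

Under the same assumptions, `train_steps` and `valid_steps` would be computed from the per-fold lengths rather than from the full `train_x` and `valid_x`. Calling `model.compile` again does not reset the weights, so the model would probably also need to be rebuilt at the start of each fold to keep the folds independent.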

0 answers:

No answers