keras: converting a numpy ndarray of strings to a numpy ndarray of floats :: ValueError: could not convert string to float: 'Y'

Asked: 2018-09-15 18:33:06

Tags: python-3.x pandas numpy machine-learning keras

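I am trying to run a neural-network-based model with keras, which takes numpy arrays as input for the training data and labels. The data was originally stored in a text file as long 0/1 sequences that are not comma-separated, and I have read it into a numpy array with 65 rows and 7116 columns.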
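For illustration, here is a minimal sketch of reading such a file, assuming one sequence per line and nothing but the characters 0 and 1. The in-memory `sample_file` and the name `X_demo` are hypothetical stand-ins, not the asker's actual file or loading code.

```python
import io
import numpy as np

# Hypothetical stand-in for the text file: one 0/1 sequence per line, no separators.
sample_file = io.StringIO("0100\n1000\n0010\n")

# Split every non-empty line into single characters.
rows = [list(line.strip()) for line in sample_file if line.strip()]

# Build a string array first, then cast it to float.
X_demo = np.array(rows).astype(float)

print(X_demo.shape)
print(X_demo.dtype)
```

If any line holds a character other than 0 or 1, the final cast fails with the same kind of ValueError described in this question.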

print('X.shape ', X.shape)
X.shape  (65, 7116)

print('X \n ', X)
X 
 [['0' '1' '0' ... '0' '0' '0']
 ['1' '0' '0' ... '0' '0' '0']
 ['1' '0' '0' ... '0' '0' '0']
 ...
 ['0' '0' '1' ... '0' '0' '0']
 ['1' '0' '0' ... '0' '0' '0']
 ['0' '0' '1' ... '0' '0' '0']]

X is my current input data.

print('type(X) ', type(X))
type(X)  <class 'numpy.ndarray'>

print('type(X[0]) ', type(X[0]))
type(X[0])  <class 'numpy.ndarray'>

print('type(X[0][0]) ', type(X[0][0]))
type(X[0][0])  <class 'numpy.str_'>

Since type(X[0][0]) is <class 'numpy.str_'> rather than a float, I cannot use X as input to the NN.

I tried the following approaches, but each one raises an error:

X1 = X.astype(float) # ValueError: could not convert string to float: 'Y'

x1 = np.asarray(X, dtype=float) # ValueError: could not convert string to float: 'Y'

X1 = np.array(X)
np.float_(X1) # ValueError: could not convert string to float: 'Y'

print('X1 ', X1)

How can I convert it so that it can be used as input to the NN?

The output Y has already been converted:

Y = Y.reshape((65,1))  # Y was in pandas data frame originally
print(type(Y)) # <class 'numpy.ndarray'>
print(Y.shape) # (65, 1)

model = Sequential()
model.add(Dense(4000, input_dim=7116, activation='relu'))
model.add(Dense(1000, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1, activation='linear'))

model.compile(loss='mean_squared_error', optimizer='adam')

model.fit(X, Y, verbose=2) 

Currently the fit function says:
Epoch 1/10
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-45-344e05f836de> in <module>()
     10 model.compile(loss='mean_squared_error', optimizer='adam')
     11 
---> 12 model.fit(X, Y, verbose=2)

D:\Installed_Programs\anaconda3\lib\site-packages\keras\models.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
    861                               class_weight=class_weight,
    862                               sample_weight=sample_weight,
--> 863                               initial_epoch=initial_epoch)
    864 
    865     def evaluate(self, x, y, batch_size=32, verbose=1,

D:\Installed_Programs\anaconda3\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, **kwargs)
   1428                               val_f=val_f, val_ins=val_ins, shuffle=shuffle,
   1429                               callback_metrics=callback_metrics,
-> 1430                               initial_epoch=initial_epoch)
   1431 
   1432     def evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None):

D:\Installed_Programs\anaconda3\lib\site-packages\keras\engine\training.py in _fit_loop(self, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch)
   1077                 batch_logs['size'] = len(batch_ids)
   1078                 callbacks.on_batch_begin(batch_index, batch_logs)
-> 1079                 outs = f(ins_batch)
   1080                 if not isinstance(outs, list):
   1081                     outs = [outs]

D:\Installed_Programs\anaconda3\lib\site-packages\keras\backend\tensorflow_backend.py in __call__(self, inputs)
   2266         updated = session.run(self.outputs + [self.updates_op],
   2267                               feed_dict=feed_dict,
-> 2268                               **self.session_kwargs)
   2269         return updated[:len(self.outputs)]
   2270 

D:\Installed_Programs\anaconda3\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata)
    898     try:
    899       result = self._run(None, fetches, feed_dict, options_ptr,
--> 900                          run_metadata_ptr)
    901       if run_metadata:
    902         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

D:\Installed_Programs\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1102             feed_handles[subfeed_t] = subfeed_val
   1103           else:
-> 1104             np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
   1105 
   1106           if (not is_tensor_handle_feed and

D:\Installed_Programs\anaconda3\lib\site-packages\numpy\core\numeric.py in asarray(a, dtype, order)
    490 
    491     """
--> 492     return array(a, dtype, copy=False, order=order)
    493 
    494 

ValueError: could not convert string to float: 'Y'

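I checked whether any elements of the numpy array contain spaces, NaN, or other stray characters, but found none that could cause this error. So I suspect the problem is that the elements of the numpy array are strings.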
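The error message names the value 'Y', so at least one element is something other than '0' or '1'. One way to confirm this is to list the distinct values and locate the unexpected ones. The sketch below uses a small hypothetical array `X_demo` in place of the real X; `np.unique`, `np.isin`, and `np.argwhere` are standard numpy functions.

```python
import numpy as np

# Hypothetical small array standing in for X, with one stray 'Y'.
X_demo = np.array([['0', '1', '0'],
                   ['1', 'Y', '0']])

# Every distinct value in the array; anything besides '0' and '1' is suspect.
distinct_values = np.unique(X_demo)
print(distinct_values)

# (row, column) positions of the entries that are neither '0' nor '1'.
bad_positions = np.argwhere(~np.isin(X_demo, ['0', '1']))
print(bad_positions)
```

Running the same two checks on the real X should show which values block the cast and where they sit.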

Thanks in advance!!

1 answer:

Answer 0 (score: 2)

Assuming a value of 'Y' is meant to be 'yes' / 1 ...

Use numpy.where to coerce the 'Y' values to 1:

X = np.where(X=='Y', 1, X)
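Because the remaining entries come from the original string array, a cast to float is still needed after the replacement. The following is a minimal sketch of both steps together, under the same assumption that 'Y' should become 1. The array `X_demo` and the names `X_clean` and `X_float` are hypothetical, and the replacement uses the string '1' so the intermediate array keeps a single string dtype.

```python
import numpy as np

# Hypothetical small array standing in for X.
X_demo = np.array([['0', '1', 'Y'],
                   ['Y', '0', '1']])

# Step 1: replace 'Y' with the string '1' (assumes 'Y' means yes / 1).
X_clean = np.where(X_demo == 'Y', '1', X_demo)

# Step 2: cast the all-numeric strings to a floating-point dtype.
X_float = X_clean.astype(np.float32)

print(X_float.dtype)
print(X_float)
```

The resulting float array can then be passed to model.fit in place of the string array.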