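How do I use Keras to create a neural network that trains on tabular data?

Time: 2018-12-28 04:00:28

Tags: python tensorflow keras neural-network

I'm fairly new to Python, so please bear with me. My tabular data looks like this: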

Type,Name,Age,Breed1,Breed2,Gender,Color1,Color2,Color3,MaturitySize,FurLength,Vaccinated,Dewormed,Sterilized,Health,Quantity,Fee,State,RescuerID,VideoAmt,Description,PetID,PhotoAmt,AdoptionSpeed
2,Nibble,3,299,0,1,1,7,0,1,1,2,2,2,1,1,100,41326,8480853f516546f6cf33aa88cd76c379,0,Nibble is a 3+ month old ball of cuteness. He is energetic and playful. I rescued a couple of cats a few months ago but could not get them neutered in time as the clinic was fully scheduled. The result was this little kitty. I do not have enough space and funds to care for more cats in my household. Looking for responsible people to take over Nibble's care.,86e1089a3,1.0,2
2,No Name Yet,1,265,0,1,1,2,0,2,2,3,3,3,1,1,0,41401,3082c7125d8fb66f7dd4bff4192c8b14,0,I just found it alone yesterday near my apartment. It was shaking so I had to bring it home to provide temporary care.,6296e909a,2.0,0
1,Brisco,1,307,0,1,2,7,0,2,2,1,1,2,1,1,0,41326,fa90fa5b1ee11c86938398b60abc32cb,0,"Their pregnant mother was dumped by her irresponsible owner at the roadside near some shops in Subang Jaya. Gave birth to them at the roadside. They are all healthy and adorable puppies. Already dewormed, vaccinated and ready to go to a home. No tying or caging for long hours as guard dogs. However, it is acceptable to cage or tie for precautionary purposes. Interested to adopt pls call me.",3422e4906,7.0,3
1,Miko,4,307,0,2,1,2,0,2,1,1,1,2,1,1,150,41401,9238e4f44c71a75282e62f7136c6b240,0,"Good guard dog, very alert, active, obedience waiting for her good master, plz call or sms for more details if you really get interested, thanks!!",5842f1ff5,8.0,2

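I have a large amount of training data, and I want to create a neural network that predicts the last value, AdoptionSpeed.

This is what I have so far, using Keras: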

from keras.models import Sequential
from keras.layers import Dense
import numpy

dataset = numpy.loadtxt("data/train.csv", delimiter=",")

X = dataset[:,0:8]
Y = dataset[:,8]

But I get an error:

ValueError: could not convert string to float: Type

What am I doing wrong?

2 answers:

Answer 0: (score: 1)

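The default dtype of numpy loadtxt is float, so it tries to convert every field to a number, including the header row and the text columns. That is why it stops at the string Type. Load the file with a dtype or a reader that accepts strings instead.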
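One way to load mixed-type columns, not necessarily the call this answer had in mind, is pandas.read_csv, which reads the header row as column names and copes with quoted text fields. The sketch below is only an illustration: the in-memory sample is a trimmed version of the question's rows (fewer columns, shortened descriptions), and the choice of Type, Age and Fee as features is an assumption.

```python
import io

import pandas as pd

# Trimmed sample modeled on the question's data (illustration only).
# In the real script this would be: pd.read_csv("data/train.csv")
SAMPLE = '''Type,Name,Age,Fee,Description,AdoptionSpeed
2,Nibble,3,100,Nibble is a 3+ month old ball of cuteness.,2
2,No Name Yet,1,0,I just found it alone yesterday near my apartment.,0
1,Brisco,1,0,"Already dewormed, vaccinated and ready to go to a home.",3
1,Miko,4,150,"Good guard dog, very alert, active.",2
'''

# The header becomes column names; text columns stay as strings.
df = pd.read_csv(io.StringIO(SAMPLE))

# Assumed feature choice: a few columns that are already numeric.
feature_columns = ["Type", "Age", "Fee"]
X = df[feature_columns].to_numpy(dtype="float32")
Y = df["AdoptionSpeed"].to_numpy(dtype="float32")

print(X.shape, Y.shape)
```

Selecting columns by name also avoids the positional slices in the question (0:8 and 8), which would pick up Name as a feature and Color2 rather than AdoptionSpeed as the target.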

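Answer 1: (score: 1)

If you look at the CSV file passed to the np.loadtxt function, some columns do not contain float data; a few of them are strings, and the first row is a header. Because np.loadtxt converts every value to float by default, loading this file fails, and that is the cause of the error. A straightforward way around it is to read the file with Python's readlines function and iterate over all the rows yourself.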
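A minimal sketch of the row-by-row approach follows. It swaps bare readlines() for the standard csv module, because the Description field is sometimes quoted and contains commas, so splitting lines on "," would break those rows. The sample text, the NUMERIC_COLUMNS list and the load_rows helper are made up for illustration.

```python
import csv
import io

# Trimmed sample with a few of the question's columns (illustration only).
SAMPLE = '''Type,Name,Age,Fee,Description,AdoptionSpeed
2,Nibble,3,100,Energetic and playful kitten.,2
1,Brisco,1,0,"Healthy, dewormed and vaccinated puppies.",3
'''

NUMERIC_COLUMNS = ["Type", "Age", "Fee"]  # assumed numeric feature columns
TARGET_COLUMN = "AdoptionSpeed"


def load_rows(handle):
    """Read CSV rows, keeping numeric features and the target as floats."""
    features, targets = [], []
    # DictReader consumes the header row and handles commas inside quotes.
    for row in csv.DictReader(handle):
        features.append([float(row[name]) for name in NUMERIC_COLUMNS])
        targets.append(float(row[TARGET_COLUMN]))
    return features, targets


# With a real file: open("data/train.csv", newline="") instead of StringIO.
X, Y = load_rows(io.StringIO(SAMPLE))
print(X)
print(Y)
```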

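Neural networks generally expect their input in numeric form. To turn string values into floats, you can use techniques such as word2vec, TF-IDF, or other embedding methods.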
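To make the TF-IDF idea concrete, here is a small standard-library sketch that turns short descriptions into numeric vectors. It uses the plain textbook weighting (term frequency times log(N / document frequency)); libraries such as scikit-learn apply smoothed variants, so their numbers will differ. The descriptions list and the tfidf_vectors helper are hypothetical.

```python
import math
from collections import Counter

# Made-up short descriptions standing in for the Description column.
descriptions = [
    "energetic and playful kitten",
    "healthy playful puppies",
    "alert guard dog",
]


def tfidf_vectors(texts):
    """Return (vocabulary, vectors) with one TF-IDF vector per text."""
    docs = [text.lower().split() for text in texts]
    vocabulary = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    # Number of documents in which each word appears at least once.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        length = max(len(doc), 1)  # avoid dividing by zero on empty text
        vectors.append([
            (counts[word] / length) * math.log(n_docs / doc_freq[word])
            for word in vocabulary
        ])
    return vocabulary, vectors


vocabulary, vectors = tfidf_vectors(descriptions)
print(len(vocabulary), len(vectors))
```

Each resulting vector has one float per vocabulary word and could be concatenated with the numeric columns before being fed to the network.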

To predict AdoptionSpeed, you can treat this as a regression problem.
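As a rough sketch of that regression setup in Keras, assuming the features have already been reduced to numeric columns: the arrays below hold the Type, Age and Fee values and the AdoptionSpeed labels from the four sample rows in the question, and the layer sizes, optimizer and epoch count are arbitrary choices rather than recommendations.

```python
import numpy as np
from keras.layers import Dense, Input
from keras.models import Sequential

# Type, Age, Fee from the four sample rows; a real run would use the
# full training set and more features.
X = np.array(
    [[2, 3, 100], [2, 1, 0], [1, 1, 0], [1, 4, 150]], dtype="float32"
)
Y = np.array([2, 0, 3, 2], dtype="float32")  # AdoptionSpeed

model = Sequential([
    Input(shape=(X.shape[1],)),
    Dense(16, activation="relu"),  # hidden layer size is an arbitrary choice
    Dense(1),                      # single linear output for regression
])
model.compile(optimizer="adam", loss="mean_squared_error")

# A few epochs only to show the API; real training needs far more data.
model.fit(X, Y, epochs=5, batch_size=2, verbose=0)

predictions = model.predict(X, verbose=0)
print(predictions.shape)
```

The output is one continuous value per row, which would have to be rounded or binned if integer AdoptionSpeed values are required.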