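I searched Stack Overflow for this error and found several posts, but none of them address this specific situation.
I have the following dataframe:
The input and output variables are defined in the following code: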
xcol=["h","o","p","d","ddlt","devdlt","sl","lt"]
ycol=["Q","r"]
x=df[xcol].values
y=df[ycol].values
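My goal is to predict the output values Q and r from the inputs (x). I tried two approaches, and both failed. For the first one, I tried a multi-output regressor.
I first split the data into training and test sets: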
import numpy as np
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
y_train = y_train.ravel()
y_test = y_test.ravel()
Then I imported the regressor class:
from sklearn.multioutput import MultiOutputRegressor
Then I tried to predict Q and r:
reg= MultiOutputRegressor(estimator=100, n_jobs=None)
reg=reg.predict(X_train, y_train)
This gave me the error:
TypeError: predict() takes 2 positional arguments but 3 were given
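What am I doing wrong, and how can I fix it?
The next thing I tried was a neural network. After assigning the x and y columns, I created the neural network: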
import numpy
import scipy.special

# neural network class definition
class neuralNetwork:
    # Step 1: initialise the network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes
        # link weight matrices, wih and who (weights into the hidden and output layers);
        # these matrices are multiplied with the signals to get an output
        # weights inside the arrays (matrices) are w_i_j, where link is from node
        # i to node j in the next layer
        # w11 w21
        # w12 w22 etc
        self.wih = numpy.random.normal(0.0, pow(self.inodes, -0.5), (self.hnodes, self.inodes))
        self.who = numpy.random.normal(0.0, pow(self.hnodes, -0.5), (self.onodes, self.hnodes))
        # setting the learning rate
        self.lr = learningrate
        # activation function is the sigmoid function
        self.activation_function = lambda x: scipy.special.expit(x)

    # Step 2: train the network
    def train(self, inputs_list, targets_list):
        # convert input lists to 2d arrays (matrices)
        inputs = numpy.array(inputs_list, ndmin=2).T
        targets = numpy.array(targets_list, ndmin=2).T
        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        # output layer error is the (target - actual)
        output_errors = targets - final_outputs
        # hidden layer error is the output_errors, split by weights, recombined
        # at hidden nodes
        hidden_errors = numpy.dot(self.who.T, output_errors)
        # update the weights for the links between the hidden and output layers
        self.who += self.lr * numpy.dot((output_errors * final_outputs * (1.0 - final_outputs)), numpy.transpose(hidden_outputs))
        # update the weights for the links between the input and hidden layers
        self.wih += self.lr * numpy.dot((hidden_errors * hidden_outputs * (1.0 - hidden_outputs)), numpy.transpose(inputs))

    # Step 3: query the network
    def query(self, inputs_list):
        # convert input lists to 2d array (matrix)
        inputs = numpy.array(inputs_list, ndmin=2).T
        # calculate signals into hidden layer
        hidden_inputs = numpy.dot(self.wih, inputs)
        # calculate signals emerging from hidden layer
        hidden_outputs = self.activation_function(hidden_inputs)
        # calculate signals into final output layer
        final_inputs = numpy.dot(self.who, hidden_outputs)
        # calculate signals emerging from final output layer
        final_outputs = self.activation_function(final_inputs)
        return final_outputs
Then I created an instance of the neural network:
#Creating instance of neural network
#number of input, hidden and output nodes
input_nodes = 8
hidden_nodes = 100
output_nodes = 2
#learning rate is 0.8
learning_rate = 0.8
#create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)
I have 8 inputs and 2 outputs to predict.
Then I trained the neural network:
# train the neural network
# go through all records in the training data set
for record in df:
    # scale and shift the inputs
    inputs = x
    # create the target output values
    targets = y
    n.train(inputs, targets)
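Then I wanted to query the predicted outputs, and this is where it goes wrong.
I want to add two columns to the dataframe with the predictions for Q (Q*) and r (r*):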
df["Q*","r*"] = n.query(x)
I really don't know how to do this. The code above gives me the error:
ValueError: Length of values does not match length of index
Any help is appreciated.
Steven
Answer 0 (score: 2)
Regarding the first part of your question (MultiOutputRegressor), there are several issues with your code.
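First, the estimator argument of MultiOutputRegressor should not be a number but, as the docs say:
estimator : estimator object
An estimator object implementing fit and predict.
So, for example, to use a random forest with default parameters, you should use
reg = MultiOutputRegressor(RandomForestRegressor())
(see this answer for more examples).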
Second, you never fit your regressor in your code; you should add
reg.fit(X_train, y_train)
after the definition.
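A minimal sketch of the first two points, assuming synthetic random data in place of the dataframe (which is not shown in the question) and an explicit import of RandomForestRegressor from sklearn.ensemble. The n_estimators and random_state values are illustrative choices only. Note that fit expects y with shape (n_samples, n_outputs), so the two ravel() calls from the question are left out here: flattening a two-column y doubles its length and it no longer lines up with X_train.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

# Synthetic stand-in for df[xcol].values and df[ycol].values (assumption).
rng = np.random.default_rng(0)
x = rng.random((50, 8))   # 8 input columns
y = rng.random((50, 2))   # 2 target columns (Q and r)

X_train, X_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

# y_train stays 2-D; y_train.ravel() would have 80 entries for 40 rows of X_train.
print(X_train.shape, y_train.shape, y_train.ravel().shape)

# Pass an estimator object, not a number, then fit before predicting.
reg = MultiOutputRegressor(RandomForestRegressor(n_estimators=10, random_state=0))
reg.fit(X_train, y_train)

# One fitted copy of the base estimator is kept per target column.
print(len(reg.estimators_))
```

Any regressor that implements fit and predict could be swapped in for the random forest here.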
Third, predict does not take the ground-truth values (here y_train) as an argument, only the features (X_train); again from the docs:
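predict(X)
Predict multi-output variable using a model trained for each target variable.
Parameters: X : (sparse) array-like, shape (n_samples, n_features)
Data.
Returns: y : (sparse) array-like, shape (n_samples, n_outputs)
Multi-output targets predicted across multiple predictors. Note: Separate models are generated for each predictor.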
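The documented shapes can be checked with a short sketch. It assumes random stand-in data and a small random forest (both illustrative, not taken from the question); the point is only that predict receives the feature matrix alone and returns one column per target, in the same order as the columns of y used for fitting.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor

# Random stand-in data (assumption): 30 samples, 8 features, 2 targets.
rng = np.random.default_rng(1)
X = rng.random((30, 8))
Y = rng.random((30, 2))   # column 0 plays the role of Q, column 1 of r

reg = MultiOutputRegressor(RandomForestRegressor(n_estimators=10, random_state=0))
reg.fit(X, Y)

# predict takes only the features and returns shape (n_samples, n_outputs).
pred = reg.predict(X)
q_pred = pred[:, 0]   # predictions for the first target column
r_pred = pred[:, 1]   # predictions for the second target column
print(pred.shape)
```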
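Since you also pass y_train in your code, you get the expected error about one argument too many. Simply change the call to reg.predict(X_train), and you will be fine.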
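As for the second part of the question (the neural network), which the points above do not cover, the following is only a hedged sketch of the shapes involved, not a full fix. It repeats the matrix steps of query() with plain NumPy, using a hand-written sigmoid in place of scipy.special.expit and random stand-in inputs (both assumptions). Because query() transposes its input, its result has shape (output_nodes, n_samples), and df["Q*","r*"] refers to a single column keyed by a tuple rather than to two columns. Transposing the result and assigning each column separately avoids both problems.

```python
import numpy as np
import pandas as pd

def sigmoid(z):
    # Plain NumPy stand-in for scipy.special.expit (assumption).
    return 1.0 / (1.0 + np.exp(-z))

xcol = ["h", "o", "p", "d", "ddlt", "devdlt", "sl", "lt"]
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((5, 8)), columns=xcol)   # random stand-in data
x = df[xcol].values

# Weight matrices with the same shapes as wih and who in the question.
wih = rng.normal(0.0, 8 ** -0.5, (100, 8))
who = rng.normal(0.0, 100 ** -0.5, (2, 100))

# Same steps as query(): note the transpose of the inputs.
inputs = np.array(x, ndmin=2).T
final_outputs = sigmoid(who @ sigmoid(wih @ inputs))
print(final_outputs.shape)   # (output_nodes, n_samples)

# Transpose back to one row per sample, then assign one column at a time.
preds = final_outputs.T
df["Q*"] = preds[:, 0]
df["r*"] = preds[:, 1]
```

Since the output layer uses a sigmoid, every prediction lies strictly between 0 and 1, so Q and r would need to be scaled into that range before training for the network to be able to match them.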