Passing a feature vector to sklearn's LogisticRegression

Time: 2016-05-12 18:47:10

Tags: python numpy machine-learning scikit-learn

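I am using LogisticRegression to model the Titanic problem from kaggle.com. I want to use several variables, such as Age and Sex, to fit my sigmoid function. The same approach works fine with a single variable such as Sex, but with multiple variables it throws the following error: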

  

TypeError: float() argument must be a string or a number, not 'method'

My guess is that I am not using the reshape method correctly. PS: I am a beginner with Python and the sklearn library, so please go easy on me.

import pandas as pd
from sklearn.linear_model import LogisticRegression
import numpy as np


df = pd.read_csv(r'C:\Users\abhi\Downloads\train.csv')

df.Age = df.Age.fillna(df.Age.mean)
df.Embarked = df.Embarked.fillna(df.Embarked.median)
x1 = df.Pclass
x2 = df.Sex
for i in range(len(x2)):
    if x2[i]=='male':
        x2[i]=1
    else: 
        x2[i]=0
#female,male 0,1

x3 = df.Age
x4 = df.SibSp
x5 = df.Parch
x6 = df.Ticket
x7 = df.Fare
x9 = df.Embarked
for i in range(len(x9)):
  if x9[i]=='C':
      x9[i]=0
  elif x9[i]=='Q': 
      x9[i]=1
  else :x9[i]=2

# C,Q,S = 0,1,2
# Creating a feature vector of multiple vectors

i2 = pd.DataFrame()
i2['Pclass'] = x1
i2['Sex'] = x2
i2['Age'] = x3
i2['SibSp'] = x4
i2['Parch'] = x5
i2['Fare'] = x7
i2['Embarked'] = x9
i2 = np.array(i2)
i2 = i2.reshape(-1,1)

ytrain = df.Survived
ytrain = np.array(ytrain)
ytrain = ytrain.reshape(-1,1)
c1 = LogisticRegression(penalty='l2',solver='liblinear')
c1.fit(i2,ytrain,sample_weight=None)
c1.score(i2,ytrain,sample_weight=None)

1 Answer:

Answer 0: (score: 0)

Can you try running your code with this line removed?

i2 = i2.reshape(-1,1)

Reshaping i2 to (-1,1) flattens it into a single column whose length is the total number of elements in i2, so it no longer has one row per passenger. That is probably not what you intended.
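To illustrate the point about shapes, here is a minimal sketch that keeps the feature matrix two-dimensional, with one row per passenger and one column per feature. It uses a small made-up DataFrame as a hypothetical stand-in for train.csv, and names such as features, X, y and clf are chosen for illustration only. It also assumes the TypeError in the question comes from passing df.Age.mean (the method itself) rather than df.Age.mean() to fillna, which seems likely from the message but is not confirmed in the thread. Filling Embarked with its most frequent value is a swapped-in choice, since a median is not defined for string labels.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in for train.csv; the values are made up for illustration.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
    "Pclass":   [3, 1, 3, 2, 1, 3, 2, 1],
    "Sex":      ["male", "female", "female", "male",
                 "female", "male", "male", "female"],
    "Age":      [22.0, 38.0, np.nan, 35.0, 54.0, np.nan, 27.0, 14.0],
    "Fare":     [7.25, 71.28, 7.92, 8.05, 51.86, 8.46, 21.07, 30.07],
    "Embarked": ["S", "C", "S", None, "S", "Q", "S", "C"],
})

# Call mean() so that a number is passed to fillna, not the bound method.
df["Age"] = df["Age"].fillna(df["Age"].mean())
# Embarked holds strings, so fill gaps with the most frequent value (the mode).
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Encode the categorical columns in one step instead of element-wise loops.
df["Sex"] = df["Sex"].map({"female": 0, "male": 1})
df["Embarked"] = df["Embarked"].map({"C": 0, "Q": 1, "S": 2})

features = ["Pclass", "Sex", "Age", "Fare", "Embarked"]
X = df[features].to_numpy(dtype=float)  # shape (n_samples, n_features)
y = df["Survived"].to_numpy()           # 1-D labels, shape (n_samples,)

print(X.shape, y.shape)
# Flattening mixes all features into one column of 8 * 5 = 40 rows,
# so the row count no longer matches the 8 labels in y.
print(X.reshape(-1, 1).shape)

clf = LogisticRegression(penalty="l2", solver="liblinear")
clf.fit(X, y)
print(clf.score(X, y))
```

With the real data, the same steps would apply to the DataFrame returned by pd.read_csv; no reshape call is needed because selecting several columns already yields a two-dimensional array.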