statsmodels raises TypeError: ufunc 'isfinite' not supported for the input types

Asked: 2019-10-19 14:02:01

Tags: python machine-learning statsmodels sklearn-pandas

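I am applying backward elimination with statsmodels.api, and the code raises this error: `TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''`

I don't know how to fix it.

Here is the code: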

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import  train_test_split
from sklearn.preprocessing import  LabelEncoder, OneHotEncoder
from sklearn.compose import  ColumnTransformer
import statsmodels.api as smf

data = pd.read_csv('F:/Py Projects/ML_Dataset/50_Startups.csv')
dataSlice = data.head(10)

#get data column
readX = data.iloc[:,:4].values
readY = data.iloc[:,4].values

#encoding c3
transformer = ColumnTransformer(
    transformers=[("OneHot",OneHotEncoder(),[3])],
    remainder='passthrough' )
readX = transformer.fit_transform(readX.tolist())
readX = readX[:,1:]

trainX, testX, trainY, testY = train_test_split(readX,readY,test_size=0.2,random_state=0)

lreg = LinearRegression()
lreg.fit(trainX, trainY)
predY = lreg.predict(testX)

readX = np.append(arr=np.ones((50,1),dtype=np.int),values=readX,axis=1)

optimisedX = readX[:,[0,1,2,3,4,5]]
ols = smf.OLS(endog=readX, exog=optimisedX).fit()
print(ols.summary())

Here is the error message:

Traceback (most recent call last):
  File "F:/Py Projects/ml/BackwardElimination.py", line 33, in <module>
    ols = smf.OLS(endog=readX, exog=optimisedX).fit()
  File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\regression\linear_model.py", line 838, in __init__
    hasconst=hasconst, **kwargs)
  File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\regression\linear_model.py", line 684, in __init__
    weights=weights, hasconst=hasconst, **kwargs)
  File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\regression\linear_model.py", line 196, in __init__
    super(RegressionModel, self).__init__(endog, exog, **kwargs)
  File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\model.py", line 216, in __init__
    super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
  File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\model.py", line 68, in __init__
    **kwargs)
  File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\model.py", line 91, in _handle_data
    data = handle_data(endog, exog, missing, hasconst, **kwargs)
  File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\data.py", line 635, in handle_data
    **kwargs)
  File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\data.py", line 80, in __init__
    self._handle_constant(hasconst)
  File "C:\Users\udit\AppData\Local\Programs\Python\Python37\lib\site-packages\statsmodels\base\data.py", line 125, in _handle_constant
    if not np.isfinite(ptp_).all():
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

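3 answers:

Answer 0 (score: 1)

I got the same error today.
The root cause is that the array has NumPy dtype object. Convert it to float64, assign the result to a new variable, and pass that variable to the function.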
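A minimal sketch of that failure mode using NumPy alone, so it does not depend on the asker's CSV file. The array contents are made-up values; the point is only that a column sliced from a mixed string/number array keeps dtype object, which `np.isfinite` rejects until the data is converted.

```python
import numpy as np

# Hypothetical rows mixing numbers and strings, similar in spirit to a
# spend column next to a state column. NumPy stores these as dtype=object.
mixed = np.array([[1.0, "New York"], [2.0, "Florida"]], dtype=object)

# Slicing out the numeric column does not change the dtype.
numeric_part = mixed[:, :1]
print(numeric_part.dtype)  # object

# np.isfinite has no loop for object arrays, so it raises TypeError.
try:
    np.isfinite(numeric_part)
    raised = False
except TypeError:
    raised = True

# Converting to float64 and keeping the result in a new variable fixes it.
converted = numeric_part.astype(np.float64)
finite = bool(np.isfinite(converted).all())
print(raised, converted.dtype, finite)
```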


Answer 1 (score: 0)

You need to change the data type of readX to int or float64 with NumPy's astype() function before optimisedX is initialised. Also change endog to readY.

readX = readX.astype('float64')  # astype returns a new array, so assign it back
optimisedX = readX[:,[0,1,2,3,4,5]]
ols = smf.OLS(endog=readY, exog=optimisedX).fit()
print(ols.summary())
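One detail worth keeping in mind with the snippet above: `ndarray.astype` returns a new array and leaves the original untouched, so the conversion only takes effect if the result is assigned. A small sketch with made-up values (the name `readX` is reused here purely for illustration):

```python
import numpy as np

# Hypothetical design matrix that ended up with dtype=object.
readX = np.array([[1, 2.5], [1, 3.5]], dtype=object)

# Calling astype without assigning discards the converted copy.
readX.astype('float64')
dtype_without_assignment = readX.dtype  # still object

# Assigning the result back replaces the object array with a float one.
readX = readX.astype('float64')
dtype_with_assignment = readX.dtype  # float64

print(dtype_without_assignment, dtype_with_assignment)
```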

Answer 2 (score: 0)

Just add this line:

X_opt = X[:, [0, 1, 2, 3, 4, 5]] 
X_opt = np.array(X_opt, dtype=float) # <-- this line 

This converts it to an array and changes the data type to float.
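The same conversion can be wrapped in a small guard. This is only a sketch: `as_float_matrix` is a hypothetical helper, not part of NumPy or statsmodels, and it assumes every column has already been encoded as numbers. If a string column is still present, `np.array(..., dtype=float)` fails, and the helper reports that more clearly.

```python
import numpy as np

def as_float_matrix(X):
    """Hypothetical helper: return X as a float array or raise a clear error."""
    try:
        return np.array(X, dtype=float)
    except (TypeError, ValueError) as exc:
        raise ValueError(
            "X still contains non-numeric values; encode them first"
        ) from exc

# Object array holding only numbers: converts cleanly.
X = np.array([[1, 0.0, 165349.2], [1, 1.0, 162597.7]], dtype=object)
X_opt = as_float_matrix(X[:, [0, 1, 2]])
print(X_opt.dtype)

# Object array that still holds a string: conversion is refused.
bad = np.array([[1, "New York"], [1, "Florida"]], dtype=object)
try:
    as_float_matrix(bad)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```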