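I ran into the above error while building several models in a for loop. The first two models, built on similar datasets, were created without problems; the third raises this error. The failing line is the call to sm.Logit() from Python's statsmodels package: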
y = y_mort.convert_objects(convert_numeric=True)
#Building Logistic model_LSVC
print("Shape of y:", y.shape, " &&Shape of X_selected_lsvc:", X.shape)
print("y values:",y.head())
logit = sm.Logit(y,X,missing='drop')
The output and the error:
Shape of y: (9018,) &&Shape of X_selected_lsvc: (9018, 59)
y values: 0 0
1 1
2 0
3 0
4 0
Name: mort, dtype: int64
ValueError Traceback (most recent call last)
<ipython-input-8-fec746e2ee99> in <module>()
160 print("Shape of y:", y.shape, " &&Shape of X_selected_lsvc:", X.shape)
161 print("y values:",y.head())
--> 162 logit = sm.Logit(y,X,missing='drop')
163 # fit the model
164 est = logit.fit(method='cg')
D:\Anaconda3\lib\site-packages\statsmodels\discrete\discrete_model.py in __init__(self, endog, exog, **kwargs)
399
400 def __init__(self, endog, exog, **kwargs):
--> 401 super(BinaryModel, self).__init__(endog, exog, **kwargs)
402 if (self.__class__.__name__ != 'MNLogit' and
403 not np.all((self.endog >= 0) & (self.endog <= 1))):
D:\Anaconda3\lib\site-packages\statsmodels\discrete\discrete_model.py in __init__(self, endog, exog, **kwargs)
152 """
153 def __init__(self, endog, exog, **kwargs):
--> 154 super(DiscreteModel, self).__init__(endog, exog, **kwargs)
155 self.raise_on_perfect_prediction = True
156
D:\Anaconda3\lib\site-packages\statsmodels\base\model.py in __init__(self, endog, exog, **kwargs)
184
185 def __init__(self, endog, exog=None, **kwargs):
--> 186 super(LikelihoodModel, self).__init__(endog, exog, **kwargs)
187 self.initialize()
188
D:\Anaconda3\lib\site-packages\statsmodels\base\model.py in __init__(self, endog, exog, **kwargs)
58 hasconst = kwargs.pop('hasconst', None)
59 self.data = self._handle_data(endog, exog, missing, hasconst,
---> 60 **kwargs)
61 self.k_constant = self.data.k_constant
62 self.exog = self.data.exog
D:\Anaconda3\lib\site-packages\statsmodels\base\model.py in _handle_data(self, endog, exog, missing, hasconst, **kwargs)
82
83 def _handle_data(self, endog, exog, missing, hasconst, **kwargs):
---> 84 data = handle_data(endog, exog, missing, hasconst, **kwargs)
85 # kwargs arrays could have changed, easier to just attach here
86 for key in kwargs:
D:\Anaconda3\lib\site-packages\statsmodels\base\data.py in handle_data(endog, exog, missing, hasconst, **kwargs)
564 klass = handle_data_class_factory(endog, exog)
565 return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
--> 566 **kwargs)
D:\Anaconda3\lib\site-packages\statsmodels\base\data.py in __init__(self, endog, exog, missing, hasconst, **kwargs)
74 # this has side-effects, attaches k_constant and const_idx
75 self._handle_constant(hasconst)
---> 76 self._check_integrity()
77 self._cache = resettable_cache()
78
D:\Anaconda3\lib\site-packages\statsmodels\base\data.py in _check_integrity(self)
450 (hasattr(endog, 'index') and hasattr(exog, 'index')) and
451 not self.orig_endog.index.equals(self.orig_exog.index)):
--> 452 raise ValueError("The indices for endog and exog are not aligned")
453 super(PandasData, self)._check_integrity()
454
ValueError: The indices for endog and exog are not aligned
y and X have shapes (9018,) and (9018, 59), so the dependent and independent variables do not differ in length. Any ideas?
Answer 0 (score: 6)
Try converting y to a list before the sm.Logit() line.
y = list(y)
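Why this helps, as a small sketch: the traceback shows the check failing on `orig_endog.index.equals(orig_exog.index)`, which compares pandas index labels rather than shapes. The toy `X` and `y` below are made up for illustration. They have the same length but different labels, and converting `y` to a list (or a NumPy array) removes the labels, so there is nothing left to compare.

```python
import pandas as pd

# Hypothetical toy data: same number of rows, different index labels.
X = pd.DataFrame({"a": [0.1, 0.2, 0.3, 0.4]})                  # index 0, 1, 2, 3
y = pd.Series([0, 1, 0, 1], index=[0, 2, 5, 7], name="mort")   # gapped index

same_length = len(X) == len(y)          # True
same_index = X.index.equals(y.index)    # False: the condition tested in the traceback

# A plain list or NumPy array carries no index at all.
y_list = list(y)
y_array = y.to_numpy()

print(same_length, same_index)
```

Dropping the index only silences the check. It assumes the rows of `y` and `X` are already in the same order, because nothing is reordered.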
Answer 1 (score: 3)
The error message indicates that endog and exog do not match. This is a common error in Python and can usually be resolved by calling reshape on the dependent variable's values so that it lines up with the independent variables.
y_train.values.reshape(-1,1)
The line above means: we specify 1 column and leave the number of rows unknown, so we get a single column with the same number of rows as X.
Let's take an example:
z = np.array([[1, 2], [ 3, 4]])
print(z.shape) # (2, 2)
Now apply reshape(-1,1) to this array: the new array has 4 rows and 1 column.
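The reshape step can be checked directly. This sketch reuses the `z` array from the example, while `y_train` is a made-up Series standing in for the real target. Note that `.values` is the part that turns the Series into a plain NumPy array with no index.

```python
import numpy as np
import pandas as pd

z = np.array([[1, 2], [3, 4]])
z_col = z.reshape(-1, 1)       # -1 lets NumPy infer the row count from the size
print(z.shape, z_col.shape)    # (2, 2) (4, 1)

# Hypothetical target with a non-default index.
y_train = pd.Series([0, 1, 1], index=[10, 11, 12])
y_col = y_train.values.reshape(-1, 1)      # plain ndarray, one column, no index
print(type(y_col).__name__, y_col.shape)   # ndarray (3, 1)
```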
Answer 2 (score: 1)
This error can also be caused by using the API incorrectly, for example by unpacking the result of train_test_split in the wrong order.
Correct:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, test_size=0.3, random_state=100
)
Incorrect (for example with the middle two names swapped, so that y_train actually holds the test rows of X):
X_train, y_train, X_test, y_test = train_test_split(
    X, y, train_size=0.7, test_size=0.3, random_state=100
)
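To see why the unpacking order of `train_test_split` matters, here is a small sketch on made-up data (the frame, column names, and sizes are assumptions for illustration). The function returns the train and test pieces of the first input, followed by the train and test pieces of the second, so swapping the middle names puts the test rows of `X` into the variable meant for the training target.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: 20 rows, 2 features, binary target.
X = pd.DataFrame(np.arange(40).reshape(20, 2), columns=["a", "b"])
y = pd.Series(np.arange(20) % 2, name="target")

# Correct order: X_train, X_test, y_train, y_test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, test_size=0.3, random_state=100
)
aligned = X_train.index.equals(y_train.index)

# Wrong order: the second name receives X's test split, not y's train split.
X_tr, y_tr, X_te, y_te = train_test_split(
    X, y, train_size=0.7, test_size=0.3, random_state=100
)
misaligned = not X_tr.index.equals(y_tr.index)

print(aligned, misaligned)
print(X_tr.shape, y_tr.shape)
```

With the wrong order, `y_tr` is a two-column DataFrame holding the test rows, so its index cannot match the index of `X_tr`.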
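Answer 3 (score: 0)
Have you checked whether your data contains NaN? You can check with np.isnan(X) and np.isnan(y). I see that you have the missing='drop' option turned on, so I suspect there are NaNs in your data, which would change the shape of the inputs.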
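A minimal sketch of that check, on made-up data (the column names and values are assumptions). It counts missing values and shows that dropping rows from `X` alone leaves `y` with a different index, while dropping on the combined frame keeps the two in step.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value in X and one in y.
X = pd.DataFrame({"a": [1.0, np.nan, 3.0, 4.0], "b": [0.5, 0.1, 0.2, 0.9]})
y = pd.Series([0.0, 1.0, np.nan, 1.0], name="mort")

# np.isnan works on numeric arrays; isna() is the pandas equivalent.
n_nan_X = int(np.isnan(X.to_numpy()).sum())
n_nan_y = int(y.isna().sum())
print(n_nan_X, n_nan_y)  # 1 1

# Dropping rows on X alone leaves y with a different index.
X_dropped = X.dropna()
print(X_dropped.index.equals(y.index))  # False

# Dropping on the combined frame removes the same rows from both.
clean = pd.concat([X, y], axis=1).dropna()
X_clean, y_clean = clean[["a", "b"]], clean["mort"]
print(X_clean.index.equals(y_clean.index))  # True
```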
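Answer 4 (score: 0)
It may be caused by different indices in x and y. This can happen when some rows were removed from the dataframe at the start and some operation was then performed on x after x and y were separated. The index of y will still have the gaps left by the rows removed from the original dataframe, while the index of x will be contiguous. It is better to call dataframe.reset_index(drop = True) before separating x and y.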
Answer 5 (score: 0)
Use y_train.values.ravel().
y_train is actually a 2-D array, so you need to convert it to a 1-D array.
Hope this works for you.
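A quick illustration of the flattening step, assuming `y_train` is a single-column DataFrame (the data below is made up):

```python
import pandas as pd

# Hypothetical single-column DataFrame target: shape (n, 1).
y_train = pd.DataFrame({"mort": [0, 1, 0, 1]})

y_flat = y_train.values.ravel()     # 1-D ndarray with shape (n,), no index
print(y_train.shape, y_flat.shape)  # (4, 1) (4,)
```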