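Thanks in advance for any guidance. I'm trying to do classification with scikit-learn using logistic regression, where X consists of an intercept and a single field of heart-rate data called heartrate. After researching others who ran into this error, I made sure the heart-rate arrays all have the same shape/size.
It raises a ValueError at line 382 of sklearn/utils/validation.py, in check_array, on the line that copies the data frame: array = np.array(array, dtype=dtype, order=order, copy=copy). I suspect my arrays are not contiguous in memory and that this is what causes the problem, but I'm not sure...
Here are some code snippets that may help with this: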
def get_training_set(self):
    training_set = []
    after_date = datetime.utcnow() - timedelta(weeks=8)
    before_date = datetime.utcnow() - timedelta(hours=12)
    activities = self.strava_client.get_activities(after=after_date, before=before_date)
    for act in activities:
        if act.has_heartrate:
            streams = self.strava_client.get_activity_streams(activity_id=act.id, types=['heartrate'])
            heartrate = np.array(list(filter(lambda x: x is not None, streams['heartrate'].data)))
            fixed_heartrate = np.pad(heartrate, (0, 15000 - len(heartrate)), 'constant')
            item = {'activity_type': self.classes.index(act.type), 'heartrate': fixed_heartrate}
            training_set.append(item)
    return pd.DataFrame(training_set)

def train(self):
    df = self.get_training_set()
    df['Intercept'] = np.ones((len(df),))
    y = df[['activity_type']]
    X = df[['Intercept', 'heartrate']]
    y = np.ravel(y)
    #
    model = LogisticRegression()
    self.debug('y={}'.format(y))
    model = model.fit(X, y)
The exception occurs in fit...
Thanks in advance for any guidance.
Regards,
Mike
Copied from the comments to improve formatting:
  File "/python3.5/site-packages/sklearn/linear_model/logistic.py", line 1173, in fit
    order="C")
  File "/python3.5/site-packages/sklearn/utils/validation.py", line 521, in check_X_y
    ensure_min_features, warn_on_dtype, estimator)
  File "/lib/python3.5/site-packages/sklearn/utils/validation.py", line 382, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: setting an array element with a sequence
And from another comment:
X and y look like this:
X.shape=(29, 2)
y.shape=(29,)
X=[[1 array([74, 74, 77, ..., 0, 0, 0])]
[1 array([66, 67, 69, ..., 0, 0, 0])]
...
[1 array([92, 92, 91, ..., 0, 0, 0])]
[1 array([79, 79, 79, ..., 0, 0, 0])]]
y=[ 0 11 11 0 1 0 11 0 11 1 0 11 0 0 11 0 0 0 0 0 11 0 11 0 0 0 11 0 0]
Answer 0 (score: 0)
Do things work better if you change train() to look like this?
{"dados": [["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"], ["id", "Nome", "Sigla", "Cidades"]], "erro": null}
(a) generates sequences of the correct length
(b) uses .values to return a NumPy array rather than another DataFrame
(c) does the fit in place
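For what it's worth, here is a minimal sketch of the idea behind points (a) and (b), not the original answer's code. The printed X in the question has shape (29, 2) with a whole array stored in each heartrate cell, so NumPy cannot convert it to a numeric 2-D array, which is consistent with the "setting an array element with a sequence" error. One way around that is to stack the sequences so every sample becomes one row of numbers. The helper name build_feature_matrix and the toy data are made up for illustration, and I'm assuming zero-padding (as in the question) is acceptable.

```python
import numpy as np


def build_feature_matrix(heartrate_sequences, length=15000):
    """Pad or truncate each 1-D heart-rate sequence to `length` and stack
    the results into a 2-D float array of shape (n_samples, length).

    Hypothetical helper, not part of scikit-learn or the question's code.
    """
    rows = []
    for hr in heartrate_sequences:
        # Truncate sequences longer than `length` so the pad width is never negative.
        hr = np.asarray(hr, dtype=float)[:length]
        # Zero-pad shorter sequences on the right, as the question does.
        rows.append(np.pad(hr, (0, length - len(hr)), 'constant'))
    return np.vstack(rows)


# Toy data standing in for the Strava heart-rate streams (invented values).
samples = [[74, 74, 77], [66, 67, 69, 70, 71], [92, 92]]
X = build_feature_matrix(samples, length=4)
print(X.shape)
```

With the question's DataFrame, the equivalent would be something like X = np.vstack(df['heartrate'].values), since those arrays are already padded to the same length. The resulting 2-D X can then be passed to model.fit(X, y). The explicit Intercept column should not be needed, because LogisticRegression fits an intercept by default (fit_intercept=True).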