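I want to use logistic regression to predict and plot a curve from an Excel dataset, and to obtain its slope coefficients. However, when I run the code (see below), I get the error "ValueError: Unknown label type: 'continuous'".
I have read in similar questions that the y values should be of type 'int', but I do not want to convert them, because the y values lie between 1.66 and 0.44...
Is there a solution for this kind of case, or should I try a different regression model?
Many thanks.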
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
import seaborn as sns
from sklearn.linear_model import LogisticRegression
df = pd.read_excel('Fatigue2.xlsx',sheet_name='Sheet4')
X = df[['Strain1', 'Temperature1']]
y = df['Cycles1']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)
#poly = PolynomialFeatures(degree=2)
#X_ = poly.fit_transform(X_train)
LR = LogisticRegression()
LR.fit(X_train,y_train)
g = sns.lmplot(x='Cycles1', y='Strain1', hue = 'Temperature1', data=df, fit_reg= False)
g.set(xscale='log', yscale ='log')
g.set_axis_labels("Cycles (log N)", "Strain")
print ('Coefficients : ', LR.coef_, 'Intercept :', LR.intercept_)
Regarding the data, I have 97 values in total in the Excel sheet:
Cycles1 Strain1 Temperature1
27631 1.66 650
... ... 650
6496220 0.44 650
Answer 0 (score: 0)
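LogisticRegression in sklearn is a classifier, i.e. it expects the response variable to be categorical.
Your task is regression: the target is a continuous quantity, which is why the fit fails with "Unknown label type: 'continuous'". In addition, the plot does not appear to show the asymptotic behavior on the right-hand side that a logit curve would have. As described here, polynomial regression may give better results.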
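A minimal sketch of that suggestion follows. The asker's Excel file is not available, so the data here is synthetic and purely hypothetical: the strain range (0.44 to 1.66) and the sample count (97) come from the question, while the temperature range, the power-law-like relation, and the noise level are assumptions made only so the example runs. It also assumes that fitting log10 of the cycle count is appropriate, since the question plots cycles on a log axis. `type_of_target` is called first to show why the classifier rejects this target.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.utils.multiclass import type_of_target

# Synthetic stand-in for the Excel sheet (hypothetical values, not the asker's data).
rng = np.random.default_rng(101)
n_samples = 97
strain = rng.uniform(0.44, 1.66, size=n_samples)
temperature = rng.uniform(550.0, 650.0, size=n_samples)

# Assumed relation: log10(cycles) falls with log10(strain) and with temperature.
log_cycles = (
    6.0
    - 4.0 * np.log10(strain)
    - 0.002 * (temperature - 550.0)
    + rng.normal(0.0, 0.05, size=n_samples)
)

X = np.column_stack([np.log10(strain), temperature])
y = log_cycles

# A float target with non-integer values is classified as 'continuous',
# which is the label type that LogisticRegression refuses to fit.
target_type = type_of_target(y)
print("Target type:", target_type)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=101
)

# Degree-2 polynomial regression: expand the features, then fit ordinary least squares.
model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),
    LinearRegression(),
)
model.fit(X_train, y_train)

# The fitted regressor is the last pipeline step; its coefficients line up
# with the expanded polynomial feature columns.
linreg = model.named_steps["linearregression"]
coefficients = linreg.coef_
intercept = linreg.intercept_
r2_test = model.score(X_test, y_test)

print("Coefficients:", coefficients, "Intercept:", intercept)
print("R^2 on the test split:", r2_test)
```

With real data, `X` and `y` would come from the DataFrame columns instead, and whether a degree of 2 or a log transform is suitable would need to be checked against the actual measurements.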
Answer 1 (score: 0)
def type_of_target(y):
    """Determine the type of data indicated by the target.

    Note that this type is the most specific type that can be inferred.
    For example:

        * ``binary`` is more specific but compatible with ``multiclass``.
        * ``multiclass`` of integers is more specific but compatible with
          ``continuous``.
        * ``multilabel-indicator`` is more specific but compatible with
          ``multiclass-multioutput``.

    Parameters
    ----------
    y : array-like

    Returns
    -------
    target_type : string
        One of:

        * 'continuous': `y` is an array-like of floats that are not all
          integers, and is 1d or a column vector.