I have already found a few answers, but I am hoping someone here can explain what I am doing wrong.
import pandas as pd
import numpy as np
import os
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
mlb = MultiLabelBinarizer()
nmdc_df = pd.read_excel('ml.xlsx')
nmdc_df.drop(nmdc_df.columns[0:5],axis=1,inplace=True)
# Clean up database
nmdc_df['190dt375_std190MA'] = nmdc_df['ma_190_d_t_375'] + nmdc_df['std_190_d_t_375']
# nmdc_df.head()
nmdc_df.dropna(how='any', inplace=True)
X = nmdc_df[['d_t_375','190dt375_std190MA']]
y = nmdc_df[['Entry','buyorsell','pl']]
y_enc = mlb.fit_transform(y)
X_train,X_test,y_train,y_test= train_test_split(X, y_enc, test_size=0.3)
model = MLPClassifier(solver='lbfgs', alpha=1e-5,
                      hidden_layer_sizes=(5, 2), random_state=1)
model.fit(X_train,y_train)
predictions = model.predict(X_test)
score = accuracy_score(y_test,predictions)
print(score)
ValueError                                Traceback (most recent call last)
     29
     30 #y.head()
---> 31 X_train,X_test,y_train,y_test= train_test_split(X, y_enc, test_size=0.3)
     32 #head(5)
     33 #y_train.head()
ValueError: Found input variables with inconsistent numbers of samples: [55, 2]
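To check how I am reading the message, here is a minimal sketch on made-up arrays (X_fake and y_fake are hypothetical stand-ins, not my real data). As far as I can tell, the two numbers in the brackets are the first dimensions of the two arrays passed to train_test_split:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins: 55 feature rows, but only 2 target rows.
X_fake = np.zeros((55, 2))
y_fake = np.zeros((2, 11))

try:
    train_test_split(X_fake, y_fake, test_size=0.3)
    message = None
except ValueError as err:
    # train_test_split refuses inputs whose first dimensions differ.
    message = str(err)

print(message)
```

So the split itself looks fine; the mismatch seems to be introduced before it, somewhere between y and y_enc.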
I have just started learning machine learning and can't seem to find the right answer.
X dataframe head:
d_t_375 190dt375_std190MA
0 0.224533 0.143279
1 0.542533 0.095203
2 -0.238400 0.221700
3 0.167467 0.143120
4 -0.138533 0.076678
X.shape[0]
55
len(X)
55
y dataframe head:
Entry buyorsell pl
0 Y B -0.224533
1 Y B -0.350000
2 Y S 0.950000
3 Y B -0.167467
4 Y S 1.300000
y_enc.shape[0]
2
len(y)
55
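A minimal sketch on made-up data that seems to reproduce the row-count mismatch. It relies on MultiLabelBinarizer iterating over whatever it is given, and on the fact that iterating a DataFrame yields column labels rather than rows. The name y_demo and its values are hypothetical, and the numeric pl column is left out to keep the example small:

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Made-up stand-in for y with two string columns and five samples.
y_demo = pd.DataFrame({
    "Entry": ["Y", "Y", "Y", "Y", "Y"],
    "buyorsell": ["B", "B", "S", "B", "S"],
})

# Iterating a DataFrame yields its column labels, not its rows.
labels_seen = list(y_demo)

# DataFrame passed directly: one output row per column label,
# with the characters of each label treated as the "labels".
enc_from_frame = MultiLabelBinarizer().fit_transform(y_demo)

# Rows passed as lists: one output row per sample.
enc_from_rows = MultiLabelBinarizer().fit_transform(y_demo.values.tolist())

print(labels_seen)
print(enc_from_frame.shape[0])  # number of columns in y_demo
print(enc_from_rows.shape[0])   # number of samples in y_demo
```

Here the DataFrame version gives 2 rows (one per column), while the list-of-rows version gives 5 (one per sample). Whether multi-label encoding is the right choice for these targets at all is a separate question.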
TIA