This is why I want to use ReLU, softmax, and cross-validation to help predict trip type (y) from x (Weekday, UPC, ScanCount, DepartmentDescription, and FinelineNumber).
The data comes from Kaggle (https://www.kaggle.com/c/walmart-recruiting-trip-type-classification).
>>import pandas as pd
>>from sklearn.model_selection import train_test_split
>>csv_path = 'Documents/train.csv'
>>DataLabels = ["Trip_Type", "Visit_Number", "Weekday", "UPC", "Scan_Count",
"Department_Description", "Fine_Line_number"]
>>data = pd.read_csv(csv_path, header=None, names=DataLabels)
>>Weekday_mapping = {
    'Monday': 0,
    'Tuesday': 1,
    'Wednesday': 2,
    'Thursday': 3,
    'Friday': 4,
    'Saturday': 5,
    'Sunday': 6,
}
>>data['Weekday'] = data['Weekday'].map(Weekday_mapping)
>>data
>>x=data.Weekday, data.UPC, data.Scan_Count, data.Department_Description,
data.Fine_Line_number
>>y=data.Trip_Type
>>X_train, X_test, y_train, y_test = train_test_split(x, y, test_size = 0.3,
random_state = 0)
ValueError: Found input variables with inconsistent numbers of samples: [5, 647054]
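
A likely reading of the error, offered as a sketch rather than a confirmed diagnosis: writing several Series separated by commas builds a Python tuple of 5 items, so scikit-learn counts 5 "samples" in x against 647054 in y. The example below uses a small made-up frame in place of train.csv (all values and the name `feature_cols` are hypothetical) to show the difference between the tuple and a column selection that keeps one row per sample.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Tiny made-up frame standing in for train.csv (values are illustrative only).
data = pd.DataFrame({
    "Trip_Type": [999, 30, 30, 26, 26, 8, 8, 35, 35, 41],
    "Weekday": [4, 4, 4, 4, 4, 5, 5, 5, 6, 6],
    "UPC": [1001, 1002, 1003, 1004, 1005, 1006, 1007, 1008, 1009, 1010],
    "Scan_Count": [-1, 1, 1, 2, 2, 1, 1, 1, 3, 1],
    "Department_Description": ["A", "B", "B", "C", "C", "D", "D", "E", "E", "F"],
    "Fine_Line_number": [10, 20, 20, 30, 30, 40, 40, 50, 50, 60],
})

# Comma-separated Series form a tuple of 5 items, which is what
# train_test_split would then treat as 5 samples.
x_tuple = (data.Weekday, data.UPC, data.Scan_Count,
           data.Department_Description, data.Fine_Line_number)
print(type(x_tuple).__name__, len(x_tuple))

# Selecting the columns with a list returns a DataFrame: one row per sample.
feature_cols = ["Weekday", "UPC", "Scan_Count",
                "Department_Description", "Fine_Line_number"]
X = data[feature_cols]
y = data["Trip_Type"]
print(X.shape, y.shape)

# X and y now have the same number of rows, so the split goes through.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
print(X_train.shape, X_test.shape)
```

Note that Department_Description is still text here; most models would need it encoded as numbers before training, which this sketch does not attempt.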