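I am trying to apply logistic regression, or any other ML algorithm, to this simple dataset, but I keep failing and running into a lot of errors. I am tr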
dim(data)
[1] 11580 12
head(data)
    ReturnJan   ReturnFeb   ReturnMar   ReturnApr    ReturnMay  ReturnJune
1  0.08067797  0.06625000  0.03294118  0.18309859  0.130333952 -0.01764234
2 -0.01067989  0.10211539  0.14549595 -0.08442804 -0.327300392 -0.35926605
3  0.04774193  0.03598972  0.03970223 -0.16235294 -0.147426982  0.04858934
4 -0.07404022 -0.04816956  0.01821862 -0.02467917 -0.006036217 -0.02530364
5 -0.03104575 -0.21267723  0.09147609  0.18933823 -0.153846154 -0.10611511
6  0.57980016  0.33225225 -0.40546095 -0.06000000  0.060732113 -0.21536106
  PositiveDec
1           0
2           0
3           0
4           1
5           1
6           1
Here is my attempt:
new.data <- data[,-12] #Remove labels' column
index <- sample(1:nrow(new.data), size = 0.8*nrow(new.data))#Split data
train.data <- new.data[index,]
test.data <- new.data[-index,]
fit.glm <- glm(data[,12]~.,data = data, family = "binomial")
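Answer 0 (score: 0)
You are nearly there, but there are a few syntax errors and, as pointed out in the comments, you need to keep your outcome variable in the data. This should work: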
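What follows is a minimal sketch of one way to do this, not necessarily the exact code the answer originally contained. It keeps `PositiveDec` in the data frame, splits the rows into training and test sets, and fits the model on the training rows only. Note that in the question's call, `data[,12] ~ .` with `data = data` also puts `PositiveDec` on the right-hand side of the formula, so the label ends up predicting itself. The simulated `data` frame below (three return columns with made-up values) is an assumption that stands in for the real 11580 x 12 data set so the example runs on its own; with the real data, skip that part.

```r
# Hypothetical stand-in for the asker's data: simulated values, fewer columns.
set.seed(42)
n <- 1000
data <- data.frame(
  ReturnJan = rnorm(n, sd = 0.1),
  ReturnFeb = rnorm(n, sd = 0.1),
  ReturnMar = rnorm(n, sd = 0.1)
)
data$PositiveDec <- rbinom(n, 1, plogis(2 * data$ReturnJan))

# Split the rows 80/20, keeping the outcome column in both pieces.
index <- sample(seq_len(nrow(data)), size = floor(0.8 * nrow(data)))
train.data <- data[index, ]
test.data <- data[-index, ]

# Fit on the training rows; "." means every column except the outcome.
fit.glm <- glm(PositiveDec ~ ., data = train.data, family = binomial)

# Predicted probabilities on the held-out rows, then a 0.5 cut-off.
pred.prob <- predict(fit.glm, newdata = test.data, type = "response")
pred.class <- ifelse(pred.prob > 0.5, 1, 0)

# Share of held-out rows classified correctly.
accuracy <- mean(pred.class == test.data$PositiveDec)
```

Naming the outcome in the formula (`PositiveDec ~ .`) rather than indexing it as `data[,12]` lets `glm` exclude it from the predictors and lets `predict` work directly on `test.data`. The 0.5 cut-off is only a common default.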