This is the result I get when I run my confusion matrix:
      TRUE
0       47
1   231194
I want something like this:
     0    1
0 1000   47
1   50 3000
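I'm not sure what I'm doing wrong here or why I get this kind of output when I run the confusion matrix. Please help. I tried recoding the variables to see whether that would help. I'd also like to know whether the variables I chose are unsuitable for the problem and are causing this.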
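One way to check whether the variables or the class balance could be behind this is to look at how rare the positive class is and at the range of the predicted probabilities. The following is only a minimal sketch on made-up data: `toy`, `x`, `y`, and the 0.1% event rate are assumptions for illustration, not the traffic data.

```r
# Toy data (hypothetical, not the traffic data): a rare outcome, about 0.1%
# positives, and a predictor that carries no real signal.
set.seed(1)
n <- 20000
toy <- data.frame(x = rnorm(n), y = rbinom(n, 1, 0.001))

# Class proportions: how rare is the positive class?
class_share <- prop.table(table(toy$y))
print(class_share)

# Fit the same kind of model and inspect the predicted probabilities.
fit <- glm(y ~ x, family = binomial(link = "logit"), data = toy)
p <- predict(fit, type = "response")
print(range(p))

# If every probability falls on the same side of 0.5, a 0.5 cutoff
# assigns every row to one class.
all_one_class <- length(unique(p > 0.5)) == 1
print(all_one_class)
```

If the probabilities on the real data all fall on one side of the cutoff, the single-column table is a consequence of the predictions rather than of the `table()` call itself.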
# Keep only the columns of interest
Traffic_Reduced <- Traffic_Reduced[, c("Date.Of.Stop", "Fatal", "Time.Of.Stop", "SubAgency", "Belts",
                                       "Commercial.License", "HAZMAT", "Commercial.Vehicle", "Alcohol",
                                       "Work.Zone", "State", "VehicleType", "Make", "Color",
                                       "Violation.Type", "Race", "Gender", "Driver.City",
                                       "Driver.State", "DL.State")]
# Convert the stop time to numeric and replace missing values with the column mean
Traffic_Reduced$Time.Of.Stop <- as.numeric(Traffic_Reduced$Time.Of.Stop)
Traffic_Reduced$Time.Of.Stop = ifelse(is.na(Traffic_Reduced$Time.Of.Stop),
                                      ave(Traffic_Reduced$Time.Of.Stop, FUN = function(x) mean(x, na.rm = TRUE)),
                                      Traffic_Reduced$Time.Of.Stop)
# Classification - Logistic Regression
# Fatal (Y/N) and Time.Of.Stop
library(caTools)  # provides sample.split()
classification <- Traffic_Reduced[, c("Date.Of.Stop", "Time.Of.Stop", "Fatal")]
classification$Time.Of.Stop <- as.numeric(classification$Time.Of.Stop)
classification$Date.Of.Stop <- as.numeric(classification$Date.Of.Stop)
# Recode the outcome as a factor with labels 0 and 1
classification$Fatal = factor(classification$Fatal,
                              labels = c(0, 1))
# Split into training (70%) and test (30%) sets
set.seed(100)
split = sample.split(classification$Fatal, SplitRatio = 0.7)
training_set = subset(classification, split == TRUE)
test_set = subset(classification, split == FALSE)
# Standardize the two predictors (column 3 is Fatal)
training_set[-3] = scale(training_set[-3])
test_set[-3] = scale(test_set[-3])
# Fit the logistic regression on the training set
classifier = glm(formula = Fatal ~ .,
                 family = binomial(link = "logit"),
                 data = training_set)
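# Predicted probabilities on the test set; both calls give the same values
predicted <- plogis(predict(classifier, test_set))
predicted <- predict(classifier, test_set, type = "response")
y_pred = ifelse(predicted > 0.5, 1, 0)
prob_pred = predict(classifier, type = 'response', newdata = test_set[-3])
y_pred = ifelse(prob_pred > 0.5, 1, 0)
# Confusion matrix: actual Fatal (rows) against the thresholded prediction (columns)
cm = table(test_set[, 3], y_pred > 0.5)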
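A table with a single `TRUE` column is what `table()` returns when the comparison it is given never yields `FALSE`, because only values that actually occur get a column. The sketch below reproduces that on made-up vectors and shows one possible way to keep both columns by declaring the factor levels up front. `actual`, `prob`, and `pred` are hypothetical stand-ins for `test_set$Fatal`, `prob_pred`, and `y_pred`.

```r
# Hypothetical toy vectors standing in for test_set$Fatal and prob_pred.
actual <- factor(c(0, 0, 1, 0, 1, 0), levels = c(0, 1))
prob   <- c(0.97, 0.99, 0.98, 0.96, 0.99, 0.95)  # every probability is above 0.5

# The comparison yields only TRUE, so table() shows a single column.
cm_one_col <- table(actual, prob > 0.5)
print(cm_one_col)

# Declaring both levels up front keeps the empty "0" column in the table.
pred <- factor(ifelse(prob > 0.5, 1, 0), levels = c(0, 1))
cm_full <- table(Actual = actual, Predicted = pred)
print(cm_full)
```

This only changes how the result is displayed. If all predictions really do land in one class, the 2x2 table will still have a column of zeros.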
Sample of the Traffic_Reduced dataset:
  Date.Of.Stop Fatal Time.Of.Stop                                       SubAgency Belts Commercial.License
1   09/24/2013    No     17:11:00                     3rd district, Silver Spring    No                 No
2   08/29/2017    No     10:19:00                          2nd district, Bethesda    No                 No
3   12/01/2014    No     12:52:00 6th district, Gaithersburg / Montgomery Village    No                 No
4   08/29/2017    No     09:22:00                     3rd district, Silver Spring    No                 No
5   08/28/2017    No     23:41:00 6th district, Gaithersburg / Montgomery Village    No                 No
6   08/27/2013    No     00:55:00                          2nd district, Bethesda    No                 No
  HAZMAT Commercial.Vehicle Alcohol Work.Zone State     VehicleType        Make  Color Violation.Type
1     No                 No      No        No    MD 02 - Automobile        FORD  BLACK       Citation
2     No                 No      No        No    VA 02 - Automobile      TOYOTA  GREEN       Citation
3     No                 No      No        No    MD 02 - Automobile       HONDA SILVER       Citation
4     No                 No      No        No    MD 02 - Automobile        DODG  WHITE       Citation
5     No                 No      No        No    MD 02 - Automobile MINI COOPER  WHITE       Citation
6     No                 No      No        No    MD 02 - Automobile     HYUNDAI   GRAY       Citation
   Race Gender     Driver.City Driver.State DL.State
1 BLACK      M     TAKOMA PARK           MD       MD
2 WHITE      F FAIRFAX STATION           VA       VA
3 BLACK      F  UPPER MARLBORO           MD       MD
4 BLACK      M FORT WASHINGTON           MD       MD
5 WHITE      M    GAITHERSBURG           MD       MD
6 WHITE      F   SILVER SPRING           MD       MD
Sample of the classification dataset, after I converted the timestamp/date to numeric values:
  Date.Of.Stop Time.Of.Stop Fatal
1         1582         1032     0
2         1447          620     0
3         1923          773     0
4         1447          563     0
5         1441         1422     0
6         1431           56     0
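If `Date.Of.Stop` and `Time.Of.Stop` are stored as factors (an assumption, though the small integers above suggest it), `as.numeric()` returns the internal level codes rather than a real date or time. A hedged alternative is to parse the strings explicitly. The vectors `date_chr` and `time_chr` below are hypothetical samples in the same format as the data shown earlier.

```r
# Hypothetical sample strings in the same format as Date.Of.Stop and Time.Of.Stop.
date_chr <- c("09/24/2013", "08/29/2017", "12/01/2014")
time_chr <- c("17:11:00", "10:19:00", "12:52:00")

# Days since 1970-01-01, parsed with an explicit month/day/year format.
date_num <- as.numeric(as.Date(date_chr, format = "%m/%d/%Y"))

# Minutes since midnight, computed from the hour and minute fields.
parts <- strsplit(time_chr, ":", fixed = TRUE)
time_num <- sapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]))

print(data.frame(date_num, time_num))
```

On a factor column, wrapping the input in `as.character()` first would give `as.Date()` and `strsplit()` the original strings to work with.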