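How do I apply logistic regression in R?

Asked: 2019-02-01 10:10:27

Tags: r dataframe logistic-regression

I have a very small dataset and I want to apply logistic regression to it to predict myData$Meeting.

I am pasting the dput output of the data.frame object below.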

myData <- structure(list(Item.Name = structure(c(1L, 14L, 2L, 12L, 2L, 
11L), .Label = c("brinjal", "chocolate", "cold drink", "injections", 
"jeans", "onion", "potato", "shirts", "skirts", "suit", "syrup", 
"tablet", "tee", "wafer"), class = "factor"), Item.Group.Name = 
 structure(c(4L, 
 2L, 2L, 3L, 2L, 3L), .Label = c("apparel", "food", "medicine", 
"vegetable"), class = "factor"), Quantity = c(44L, 97L, 53L, 
11L, 5L, 71L), Sales.Employee.Name = structure(c(14L, 10L, 8L, 
10L, 5L, 10L), .Label = c("Alysa Fontell", "Breanne Kissock", 
"Clovis Mance", "Eadie Tidcomb", "Ella Tregidga", "Georg Hollyard", 
"Gib Hanalan", "Jade Postle", "Jewelle Woodall", "Kent Franciottoi", 
"Mychal Elix", "Ralina Wraight", "Shaughn Avrahamian", "Sibelle Santino", 
"Sigfrid Alejandro"), class = "factor"), Sales.Employee.Manager = 
structure(c(1L, 
1L, 1L, 1L, 1L, 1L), .Label = "Hanny Stokey", class = "factor"), 
Sales.Employee.Region = structure(c(2L, 5L, 4L, 5L, 4L, 5L
), .Label = c("America/Chicago", "America/Denver", "America/Kentucky/Louisville", 
"America/Los_Angeles", "America/New_York"), class = "factor"), 
Sales.Enquiry.Stage = structure(c(6L, 3L, 3L, 6L, 4L, 5L), .Label = c("Lead", 
"Lost", "Meeting", "Proposal", "Qualified", "Won"), class = "factor"), 
Final.Status = structure(c(1L, 1L, 1L, 1L, 2L, 2L), .Label = c("Closed", 
"Open"), class = "factor"), Enquiry.Source.Sub.Type = structure(c(2L, 
3L, 4L, 3L, 1L, 2L), .Label = c("Existing", "IB Call", "OB Call", 
"Reference", "Website"), class = "factor"), Enquiry.Source.Type = structure(c(1L, 
2L, 2L, 2L, 1L, 1L), .Label = c("Inbound", "Outbound"), class = "factor"), 
Rate.per.Quantity = c(90L, 130L, 400L, 120L, 400L, 150L), 
Estimate.Value = c(3960L, 12610L, 21200L, 1320L, 2000L, 10650L
), Employee.Gender = structure(c(2L, 1L, 2L, 2L, 1L, 2L), .Label = c("Female", 
"Male"), class = "factor"), Meeting = structure(c(2L, 2L, 
2L, 2L, 2L, NA), .Label = c("No", "Yes"), class = "factor")), row.names = c(NA, 
6L), class = "data.frame")      

When I run this code:

glm(data = meetingData, formula = meetingData$Meeting ~. , family = binomial(link = "logit"))

I get this error:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
contrasts can be applied only to factors with 2 or more levels

Any help would be greatly appreciated.

2 answers:

Answer 0: (score: 1)

> summary(myData$Meeting)
#>   No  Yes NA's 
#>    0    5    1 

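The column you want to predict contains values from only one of its two classes: every non-missing value is "Yes" and there are no "No" observations. That makes it impossible to train a logistic regression, because the model has nothing to contrast the "Yes" cases against.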
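
A quick way to check for this before calling glm is to count how many classes actually occur in the response. This is only a minimal sketch: the `meeting` vector is a hypothetical stand-in that mirrors the summary above (five "Yes", one NA), and `counts` and `observed_classes` are names made up for illustration.

```r
# Hypothetical stand-in for myData$Meeting: five "Yes", one NA, no "No"
meeting <- factor(c("Yes", "Yes", "Yes", "Yes", "Yes", NA),
                  levels = c("No", "Yes"))

# table() ignores NA by default and reports every declared level,
# including levels that never occur
counts <- table(meeting)

# Number of classes that actually appear in the data
observed_classes <- sum(counts > 0)

if (observed_classes < 2) {
  message("Only ", observed_classes,
          " class observed in the response; both classes are needed.")
}
```

If `observed_classes` is below 2, more data covering the missing class is needed before a binomial model makes sense.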

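Answer 1: (score: 0)

In addition, your Sales.Employee.Manager is a factor with only one level (Hanny Stokey). Because it is constant and has no variance, it contributes nothing to the regression, so if you remove it, this error will no longer appear: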

myData$Sales.Employee.Manager<-NULL
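
If there are many columns, the same idea can be applied to all of them at once instead of removing each one by hand. The following is a rough sketch on a hypothetical toy data frame (`df`, with made-up column names `manager`, `region`, `quantity`), not the original data; it drops every column that has fewer than two distinct non-missing values.

```r
# Hypothetical toy data frame standing in for myData
df <- data.frame(
  manager  = factor(c("Hanny Stokey", "Hanny Stokey", "Hanny Stokey")),
  region   = factor(c("America/Denver", "America/New_York", "America/New_York")),
  quantity = c(44L, 97L, 53L)
)

# Count distinct non-missing values in each column
n_distinct <- vapply(
  df,
  function(col) length(unique(col[!is.na(col)])),
  integer(1)
)

# Columns that are constant (or entirely missing) carry no information
constant_cols <- names(n_distinct)[n_distinct < 2]

# Keep only the columns that vary
df_clean <- df[, setdiff(names(df), constant_cols), drop = FALSE]
```

Note that glm drops rows with missing values by default, so a column can become constant only after those rows are removed; running the check on the complete rows is the safer option.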