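One attribute in my data frame (aggregatedInocme) has a continuous data type, and I want to create a new attribute with the categories (low, mid, high) based on its values. I split the values into three ranges, as shown in the code below.
I wrote a simple piece of code using a for loop: an if statement checks whether the value in each cell of that attribute falls within a particular range and, if so, assigns the corresponding string to it.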
y<-min(data_loanapp$aggregatedInocme)-0
x<-max(data_loanapp$aggregatedInocme)-min(data_loanapp$aggregatedInocme)
c1<-(y+(x/3))
c2<- (y+((2*x)/3))
rr <- c()
for (val in data_loanapp$aggregatedInocme){
  if(val<= c1) {
    rr[val]<- append(rr[val], 'Low')
  }else if (c1< val<= c2){
    rr[val]<-append(rr[val], "mid")
  }else
    rr[val]<-append(rr[val], "high")
}
rr
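I expected an attribute with the values (low, mid, high), but I keep getting an attribute full of NA values, together with the following warning and error:
Warning message: In rr[val] <- append(rr[val], "high") : number of items to replace is not a multiple of replacement length
} Error: unexpected '}' in "}"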
Answer 0 (score: 0)
I figured it out:
# This was used only to find the bin widths
library(classInt)
classIntervals(data_loanapp$aggregatedInocme, 3)
data_loanapp$Cat_AggInc <- classIntervals(data_loanapp$aggregatedInocme, 3,
                                          style = 'equal')
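# Here I defined and created the categories
# rightmost.closed = TRUE keeps a value equal to the last break (81000) in "high" instead of NA
data_loanapp$Income_Cat <- c("low", "medium", "high")[
  findInterval(data_loanapp$aggregatedInocme, c(1442, 4583, 6588, 81000),
               rightmost.closed = TRUE)]
data_loanapp$Income_Cat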
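For anyone hitting the same errors: `c1 < val <= c2` is not valid R (it has to be written `c1 < val && val <= c2`), and `rr[val]` indexes `rr` by the income value itself rather than by row position, which is where the NA entries come from. Below is a minimal sketch of the equal-width binning from the question done with base R's `cut()` instead of a loop. The small `data_loanapp` here is a made-up stand-in, since the real data is not shown in the post, and the label names are just the ones used in the question.

```r
# Hypothetical stand-in for data_loanapp; the real data is not shown in the post.
data_loanapp <- data.frame(
  aggregatedInocme = c(1442, 3000, 4583, 6588, 30000, 60000, 81000)
)

income <- data_loanapp$aggregatedInocme
lo     <- min(income)
width  <- max(income) - lo

# Same cut points as c1 and c2 in the question: thirds of the observed range.
breaks <- c(lo, lo + width / 3, lo + 2 * width / 3, max(income))

# cut() is vectorised, so no loop or append() is needed.
# Intervals are (a, b] by default; include.lowest = TRUE keeps the minimum
# value in the first bin instead of turning it into NA.
data_loanapp$Income_Cat <- cut(income,
                               breaks = breaks,
                               labels = c("low", "mid", "high"),
                               include.lowest = TRUE)

table(data_loanapp$Income_Cat)
```

The result is a factor column with one label per row. Note that `cut()` needs distinct break points, so this sketch assumes the column contains at least two different values.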