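Removing the column with the lowest count after expanding a categorical variable into dummy columns

Time: 2017-05-30 13:08:23

Tags: r rstudio

Hello, I am writing a function that builds an n-1 dummy matrix. Out of the n columns, I want to drop the one for the categorical level with the fewest observations. How can I do that?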

# dataf: the parent data frame; catog: the categorical variable to expand
BreakCad <- function(dataf, catog) {
  # Expand the categorical variable into 0/1 indicator columns, one per level
  result <- model.matrix(~ factor(catog) - 1)
  # This drops the first column; I want to drop the column of the level
  # with the lowest count instead
  result <- result[, -1]
  result <- as.data.frame(result)
  result <- cbind(dataf, result)
  return(result)
}

y <- data.frame(Decision = sample(c("yes", "no", "cant decide"), 40, replace = TRUE),
                point1 = sample(1:10, 40, replace = TRUE))

fd <- BreakCad(y, y$Decision)

# Gives the lowest count, but not the level it belongs to
min(table(y$Decision))

1 Answer:

Answer 0: (score: 0)

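You can use the relevel function to put a selected level of catog first. Toy example:

(catog<-factor(rep(letters[1:4], 4:1)))
[1] a a a a b b b c c d
Levels: a b c d

Find the level with the lowest count:

(wmin<-which.min(table(catog)))
d 
4 

Make it the first level:

(catog<-relevel(catog, wmin))
[1] a a a a b b b c c d
Levels: d a b c

Then your code should work, because the first column now belongs to the rarest level:

result<-model.matrix(~factor(catog)-1)
result<-result[,-1]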
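
The steps above can be folded into the question's function. The following is only a sketch under a few assumptions: the name break_cat and the small fixed data frame are made up for illustration, and the input is converted with factor() first because data.frame() in R 4.0 and later keeps strings as character vectors, which relevel cannot handle directly. On ties, which.min picks the first level with the minimum count.

```r
# Hypothetical helper (name not from the original post): expand a categorical
# variable into 0/1 columns and drop the column of the least frequent level.
break_cat <- function(dataf, catog) {
  catog <- factor(catog)                     # make sure we have a factor
  rarest <- names(which.min(table(catog)))   # level with the lowest count
  catog <- relevel(catog, ref = rarest)      # move that level to the front
  dummies <- model.matrix(~ catog - 1)       # one indicator column per level
  dummies <- dummies[, -1, drop = FALSE]     # drop the first column (rarest level)
  cbind(dataf, as.data.frame(dummies))
}

# Fixed illustrative data: "no" is the rarest level with 2 rows
y <- data.frame(
  Decision = c(rep("yes", 5), rep("cant decide", 3), rep("no", 2)),
  point1 = 1:10
)

fd <- break_cat(y, y$Decision)
names(fd)
```

Passing the level name to relevel rather than its index keeps the intent readable, and drop = FALSE keeps the result a matrix even when only two levels exist.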
