R: creating a variable from the levels of grouped data

Asked: 2016-09-29 08:37:35

Tags: r dplyr recode

I have a data frame, for example data:

data = data.frame(ID = as.factor(c("A", "A", "B","B","C","C")),
                  var.color= as.factor(c("red", "blue", "green", "red", "green", "yellow")))

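I would like to know whether it is possible to take the levels of var.color within each ID group (e.g. A, B, C) and create a variable that pastes them together. I tried to do this by running: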

library(dplyr)
data %>% group_by(ID) %>%
  mutate(ex = paste(droplevels(var.color), sep = "_"))

This yields:

Source: local data frame [6 x 3]
Groups: ID [3]

      ID var.color     ex
  <fctr>    <fctr>  <chr>
1      A       red    red
2      A      blue   blue
3      B     green  green
4      B       red    red
5      C     green  green
6      C    yellow yellow

However, the data.frame I want should look like this:

      ID var.color           ex
  <fctr>    <fctr>        <chr>
1      A       red     red_blue
2      A      blue     red_blue
3      B     green    green_red
4      B       red    green_red
5      C     green green_yellow
6      C    yellow green_yellow

2 answers:

Answer 0 (score: 1)

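Basically, you need collapse instead of sep. With a single vector argument, sep has nothing to separate, so paste returns each element unchanged; collapse joins all elements of the vector into one string.

Rather than dropping levels, you can simply paste together the text within each ID group:
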
library(dplyr)
data  %>% group_by(ID) %>%
         mutate(ex = paste(var.color, collapse = "_"))

#     ID     var.color     ex
#    <fctr>    <fctr>     <chr>
#1      A       red     red_blue
#2      A      blue     red_blue
#3      B     green     green_red
#4      B       red     green_red
#5      C     green     green_yellow
#6      C    yellow     green_yellow
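
The difference between sep and collapse can be checked outside dplyr. The following is a minimal base R sketch; the vector colors and the prefix "color" are illustrative names, not part of the original question.

```r
# Illustrative vector (an assumption, standing in for one ID group)
colors <- c("red", "blue")

# sep only separates *different arguments*; with a single vector
# there is nothing to separate, so the elements come back unchanged.
separate <- paste(colors, sep = "_")

# collapse joins the elements of the result into one string.
joined <- paste(colors, collapse = "_")

# sep does have an effect once there are two or more arguments.
prefixed <- paste("color", colors, sep = "_")

print(separate)  # "red"  "blue"
print(joined)    # "red_blue"
print(prefixed)  # "color_red"  "color_blue"
```

Inside mutate on a grouped data frame, the length-one result of collapse is recycled to every row of the group, which is why each row of a group receives the same string.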

Answer 1 (score: 1)

You can do the same thing with a loop:

for(i in unique(data$ID)){
  data$ex[data$ID==i] <- paste0(data$var.color[data$ID==i], collapse = "_")
}

> data
  ID var.color           ex
1  A       red     red_blue
2  A      blue     red_blue
3  B     green    green_red
4  B       red    green_red
5  C     green green_yellow
6  C    yellow green_yellow
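
As a possible alternative to the explicit loop, base R's ave() applies a function per group and recycles the result back to the original rows. This is only a sketch under assumptions: the helper name collapse_group is hypothetical, the columns are built as character rather than factor for simplicity, and unique() is added so that a color repeated within a group would appear only once (it changes nothing for the data in the question).

```r
# Same data as in the question, stored as character columns (assumption)
data <- data.frame(
  ID        = c("A", "A", "B", "B", "C", "C"),
  var.color = c("red", "blue", "green", "red", "green", "yellow"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the distinct values of one group with "_"
collapse_group <- function(x) paste(unique(x), collapse = "_")

# ave() splits var.color by ID, applies the helper to each group,
# and writes the single result back to every row of that group.
data$ex <- ave(data$var.color, data$ID, FUN = collapse_group)

print(data)
```

Dropping unique() from the helper gives the same behavior as the loop above, including repeated values.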