Make a string of the column names whose values are zero

Time: 2016-06-09 19:45:42

Tags: r data.table dplyr grepl

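The last column (NextStep) is the column I want. Video, Webinar, Meeting and Conference are 4 types of activities that the different customers (Name) can engage in. As you can see, in a given row the last column (NextStep) holds the names of all columns with a zero value, as a comma-separated string, and leaves out the names of columns with a non-zero value. The names in the last column usually appear in column order, with two exceptions: Webinar always comes first if its value is zero, and Video always comes last if its value is zero.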

    library(data.table)
    dt <- fread('
    Name     Video   Webinar Meeting Conference   NextStep
    John       1         0        0       0         Webinar,Meeting,Conference
    John       1         1        0       0         Meeting,Conference
    John       1         1        1       0         Conference
    Tom        0         0        1       0         Webinar,Conference,Video
    Tom        0         0        1       1         Webinar,Video
    Kyle       0         0        0       1         Webinar,Meeting,Video
    ')

My question is how to create the NextStep column. Many thanks for your help!

3 Answers:

Answer 0: (score: 4)

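If you are looking for a way to do this without simply reordering the columns into the order you want (I actually see no reason not to do that, but anyway..), you can try the following. It melts the data to long format and then updates by reference in a join: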

lvls <- c("Webinar", "Meeting", "Conference", "Video")  # make sure order is correct
dt[, row := .I]   # add a row-identifier
dtm <- melt(dt, id.vars = c("Name", "row"), measure.vars = lvls) # melt to long format
# summarise dtm by using a factor, sorting it and converting to string; then join to dt
dt[dtm[value == 0, list(NextStep2 = toString(sort(factor(variable, levels = lvls)))), 
    by = row], NextStep2 := NextStep2, on = "row"][, row := NULL]

#    Name Video Webinar Meeting Conference                   NextStep                    NextStep2
# 1: John     1       0       0          0 Webinar,Meeting,Conference Webinar, Meeting, Conference
# 2: John     1       1       0          0         Meeting,Conference          Meeting, Conference
# 3: John     1       1       1          0                 Conference                   Conference
# 4:  Tom     0       0       1          0   Webinar,Conference,Video   Webinar, Conference, Video
# 5:  Tom     0       0       1          1              Webinar,Video               Webinar, Video
# 6: Kyle     0       0       0          1      Webinar,Meeting,Video      Webinar, Meeting, Video
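
The same approach can be read step by step. The following is a minimal sketch, assuming the question's data rebuilt with `data.table()` instead of `fread()`; the intermediate names `long` and `zeros` are introduced here for illustration and are not part of the original answer:

```r
library(data.table)

# Same toy data as in the question, without the target column
dt <- data.table(
  Name       = c("John", "John", "John", "Tom", "Tom", "Kyle"),
  Video      = c(1L, 1L, 1L, 0L, 0L, 0L),
  Webinar    = c(0L, 1L, 1L, 0L, 0L, 0L),
  Meeting    = c(0L, 0L, 1L, 1L, 1L, 0L),
  Conference = c(0L, 0L, 0L, 0L, 1L, 1L)
)

lvls <- c("Webinar", "Meeting", "Conference", "Video")  # desired display order

# Step 1: add a row id and reshape to one (row, activity) pair per line
dt[, row := .I]
long <- melt(dt, id.vars = c("Name", "row"), measure.vars = lvls)

# Step 2: keep the zero-valued pairs and collapse them per row, in lvls order
zeros <- long[value == 0,
              .(NextStep2 = toString(sort(factor(variable, levels = lvls)))),
              by = row]

# Step 3: update join; i.NextStep2 is the column coming from `zeros`
dt[zeros, NextStep2 := i.NextStep2, on = "row"]
dt[, row := NULL]
```

Writing `i.NextStep2` instead of `NextStep2` only makes explicit which table the value comes from; the result should be the same as in the one-liner above.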

If you want to paste all the column names for rows where there was no activity at all, you can add the following line to the code:

dt[rowSums(dt[, mget(lvls)]) == 0, NextStep2 := toString(names(dt)[2:5])]
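
To see how the join treats the edge cases, here is a minimal sketch with two hypothetical customers (Ann and Bob are not in the original data). Under these assumptions, a row with all zeros should already receive all four names in `lvls` order from the join, while a row with no zeros is left as `NA`. Note also that `names(dt)[2:5]` follows the column order of `dt`, not the order in `lvls`.

```r
library(data.table)

lvls <- c("Webinar", "Meeting", "Conference", "Video")

# Hypothetical rows: Ann attended nothing, Bob attended everything
dt <- data.table(
  Name       = c("Ann", "Bob"),
  Video      = c(0L, 1L),
  Webinar    = c(0L, 1L),
  Meeting    = c(0L, 1L),
  Conference = c(0L, 1L)
)

# Same melt + update join as in the answer
dt[, row := .I]
dtm <- melt(dt, id.vars = c("Name", "row"), measure.vars = lvls)
dt[dtm[value == 0,
       .(NextStep2 = toString(sort(factor(variable, levels = lvls)))),
       by = row],
   NextStep2 := i.NextStep2, on = "row"][, row := NULL]

# Flag the rows with no activity (all four activity columns are zero)
no_activity <- rowSums(dt[, mget(lvls)]) == 0

print(dt)
```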

Answer 1: (score: 3)

A possible solution:

DT[, nextstep := paste0(names(.SD)[.SD==0], collapse = ','), 1:nrow(DT), .SDcols = 2:5][]

which gives:

   Name Video Webinar Meeting Conference                   nextstep
1: John     1       0       0          0 Webinar,Meeting,Conference
2: John     1       1       0          0         Meeting,Conference
3: John     1       1       1          0                 Conference
4:  Tom     0       0       1          0   Video,Webinar,Conference
5:  Tom     0       0       1          1              Video,Webinar
6: Kyle     0       0       0          1      Video,Webinar,Meeting

If you want to order the names as specified in the comments, you can do the following:

lvls <- c('Webinar', 'Meeting', 'Conference', 'Video')
DT[, nextstep := paste0(lvls[lvls %in% names(.SD)[.SD==0]], collapse = ','), 
   1:nrow(DT), .SDcols = 2:5][]

which gives:

   Name Video Webinar Meeting Conference                   nextstep
1: John     1       0       0          0 Webinar,Meeting,Conference
2: John     1       1       0          0         Meeting,Conference
3: John     1       1       1          0                 Conference
4:  Tom     0       0       1          0   Webinar,Conference,Video
5:  Tom     0       0       1          1              Webinar,Video
6: Kyle     0       0       0          1      Webinar,Meeting,Video
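
The reordering comes from indexing `lvls` by membership rather than indexing the zero-valued names directly. A small base R sketch of just that step, using the zero-valued names of row 4 in column order (the variable names `zeros` and `ordered` are illustrative):

```r
lvls  <- c("Webinar", "Meeting", "Conference", "Video")

# Zero-valued column names for row 4, in the table's column order
zeros <- c("Video", "Webinar", "Conference")

# Keep the elements of lvls that occur in zeros, so the order of lvls wins
ordered <- lvls[lvls %in% zeros]
result  <- paste0(ordered, collapse = ",")
print(result)
# "Webinar,Conference,Video"
```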

Instead of paste0 with collapse = ',', you can also use toString.
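
The two calls are not byte-for-byte equivalent: by default `toString` separates the elements with a comma followed by a space. A short sketch of the difference (the vector `x` is just an example):

```r
x <- c("Webinar", "Meeting", "Conference")

with_paste    <- paste0(x, collapse = ",")
with_toString <- toString(x)

print(with_paste)     # "Webinar,Meeting,Conference"
print(with_toString)  # "Webinar, Meeting, Conference"
```

If the result has to match an existing column exactly, as with NextStep in the question, the `paste0` form is the one that produces the same separator.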

Data used:

DT <- fread('Name     Video   Webinar  Meeting  Conference
             John       1         0        0        0
             John       1         1        0        0
             John       1         1        1        0
             Tom        0         0        1        0
             Tom        0         0        1        1
             Kyle       0         0        0        1')

Answer 2: (score: 1)

Here you go:

setcolorder(dt, c("Name", "Webinar", "Meeting", "Conference", "Video", "NextStep"))
dt[, NextStepNew:=apply(dt, 1, function(x) paste0(names(x)[x==0], collapse=","))][]
   Name Webinar Meeting Conference Video                   NextStep                NextStepNew
1: John       0       0          0     1 Webinar,Meeting,Conference Webinar,Meeting,Conference
2: John       1       0          0     1         Meeting,Conference         Meeting,Conference
3: John       1       1          0     1                 Conference                 Conference
4:  Tom       0       1          0     0   Webinar,Conference,Video   Webinar,Conference,Video
5:  Tom       0       1          1     0              Webinar,Video              Webinar,Video
6: Kyle       0       0          1     0      Webinar,Meeting,Video      Webinar,Meeting,Video
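
Because `apply()` coerces its input to a matrix, passing the whole table (which contains character columns) turns every value into a string before the comparison with 0. A possible variation, sketched here under the assumption that only the four activity columns matter, is to restrict `apply()` to those columns through `.SDcols`. Listing them in the desired order also makes `setcolorder()` unnecessary. The data is rebuilt with `data.table()` for self-containment:

```r
library(data.table)

dt <- data.table(
  Name       = c("John", "John", "John", "Tom", "Tom", "Kyle"),
  Video      = c(1L, 1L, 1L, 0L, 0L, 0L),
  Webinar    = c(0L, 1L, 1L, 0L, 0L, 0L),
  Meeting    = c(0L, 0L, 1L, 1L, 1L, 0L),
  Conference = c(0L, 0L, 0L, 0L, 1L, 1L)
)

# Desired display order; .SD takes its columns in this order
lvls <- c("Webinar", "Meeting", "Conference", "Video")

# apply() now sees an integer matrix with only the activity columns
dt[, NextStepNew := apply(.SD, 1, function(x) paste(names(x)[x == 0], collapse = ",")),
   .SDcols = lvls]

print(dt)
```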