Create/update a column based on row values in multiple columns

Asked: 2017-12-26 06:57:11

Tags: r data.table

I have created the following data.table object:

        V1         V2      V3       V4
     1: 693 -0.2842529  1.3710 21.64843
     2: 240 -2.6564554 -0.5647 93.37038
     3:  43 -2.4404669  0.3631 92.63883
     4: 140  1.3201133  0.6329 73.67534
     5: 216 -0.3066386  1.3710 33.97413
     6: 479 -1.7813084 -0.5647 51.99127
     7: 197 -0.1719174  0.3631 74.65349
     8: 720  1.2146747  0.6329 62.29676
     9:   7  1.8951935  1.3710 62.99829
    10: 375 -0.4304691 -0.5647 22.49861
    11: 514 -0.2572694  0.3631 22.44016
    12:   1 -1.7631631  0.6329 39.50556

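I want to generate a new categorical/group column based on the values in columns V1-V4. For example, I use the values in V1 to generate a categorical column V5 as follows: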

DT[V1>0.1, V5 :="A"]
DT[V1>10, V5 :="B"]

Then I get this table:

        V1         V2      V3       V4 V5
     1: 693 -0.2842529  1.3710 21.64843  B
     2: 240 -2.6564554 -0.5647 93.37038  B
     3:  43 -2.4404669  0.3631 92.63883  B
     4: 140  1.3201133  0.6329 73.67534  B
     5: 216 -0.3066386  1.3710 33.97413  B
     6: 479 -1.7813084 -0.5647 51.99127  B
     7: 197 -0.1719174  0.3631 74.65349  B
     8: 720  1.2146747  0.6329 62.29676  B
     9:   7  1.8951935  1.3710 62.99829  A
    10: 375 -0.4304691 -0.5647 22.49861  B
    11: 514 -0.2572694  0.3631 22.44016  B
    12:   1 -1.7631631  0.6329 39.50556  A

Is it possible to combine the two lines above into a single line? And can this be combined with values from several other columns (e.g. V2-V4)?

2 Answers:

Answer 0: (score: 1)

If there are multiple levels, it is better to use cut:

DT[, V5 := as.character(cut(V1, breaks = c(0.1, 10, Inf), labels = c("A", "B")))]
DT$V5
#[1] "B" "B" "B" "B" "B" "B" "B" "B" "A" "B" "B" "A"
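
To see why the cut call reproduces the two original assignments, it helps to look at values on and outside the breaks. The following is a minimal base R sketch; the vector x is made up for illustration and is not part of the question's data. cut uses right-closed intervals by default, so the groups are (0.1, 10] and (10, Inf], and anything at or below 0.1 becomes NA, just as those rows would stay NA after DT[V1>0.1, V5 :="A"].

```r
# Illustrative values only: below, on, and above each break.
x <- c(0.05, 0.1, 5, 10, 11)

# Same breaks and labels as in the answer above.
# Intervals are (0.1, 10] -> "A" and (10, Inf] -> "B".
res <- as.character(cut(x, breaks = c(0.1, 10, Inf), labels = c("A", "B")))

print(res)
```

Here 0.05 and 0.1 fall outside every interval and come back as NA, 5 and 10 land in "A", and 11 lands in "B".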

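Or use findInterval (its intervals are closed on the left by default, so a value exactly equal to a break is assigned to the upper group):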

DT[, V5 := LETTERS[findInterval(V1, c(0.1, 10))]]
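
cut and findInterval work on a single column, so they do not directly cover the second part of the question (conditions involving V2-V4). One possible approach is fcase, sketched below under the assumption that data.table 1.13.0 or later is installed. The V6 column, its labels, and the thresholds on V2 are hypothetical and chosen only for illustration; they are not from the question.

```r
library(data.table)

# Rebuild the first three columns of the example table from the question.
DT <- data.table(
  V1 = c(693, 240, 43, 140, 216, 479, 197, 720, 7, 375, 514, 1),
  V2 = c(-0.2842529, -2.6564554, -2.4404669, 1.3201133, -0.3066386, -1.7813084,
         -0.1719174, 1.2146747, 1.8951935, -0.4304691, -0.2572694, -1.7631631),
  V3 = c(1.3710, -0.5647, 0.3631, 0.6329, 1.3710, -0.5647,
         0.3631, 0.6329, 1.3710, -0.5647, 0.3631, 0.6329)
)

# One-line equivalent of the two assignments in the question.
# fcase() uses the first condition that is TRUE for each row,
# so the stricter test (V1 > 10) has to come first.
DT[, V5 := fcase(V1 > 10, "B", V1 > 0.1, "A")]

# Hypothetical multi-column rule (labels and V2 thresholds are assumptions).
DT[, V6 := fcase(
  V1 > 10 & V2 > 0,  "high_pos",
  V1 > 10 & V2 <= 0, "high_neg",
  V1 > 0.1,          "low",
  default = "other"
)]
```

Rows that match none of the conditions get NA in V5 (the default) and "other" in V6.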

Answer 1: (score: 0)

If you prefer a tidyverse approach, you can run the following code:

library(tidyverse)           
new_data <- your_data %>% 
    mutate(V5=case_when(
        V1>=0.1 & V1<10 ~ "A",
        V1>=10 ~ "B"
    ))
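
case_when conditions can refer to more than one column, which addresses the second part of the question. The following is a minimal sketch, assuming dplyr is installed. your_data is rebuilt here from four rows of the example table, and the labels and the V2 threshold are hypothetical.

```r
library(dplyr)

# Four rows taken from the example table (V1 and V2 only).
your_data <- data.frame(
  V1 = c(693, 140, 7, 1),
  V2 = c(-0.2842529, 1.3201133, 1.8951935, -1.7631631)
)

# Conditions are checked in order and the first match wins,
# so the most specific rule goes first.
new_data <- your_data %>%
  mutate(V5 = case_when(
    V1 > 10 & V2 > 0 ~ "B_pos",   # large V1 and positive V2
    V1 > 10          ~ "B_neg",   # large V1, V2 not positive
    V1 > 0.1         ~ "A",       # remaining rows above 0.1
    TRUE             ~ NA_character_
  ))
```

This sketch uses strict inequalities to match the question's V1>0.1 and V1>10; the answer's version with >= and < treats values exactly on a threshold differently.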