How to insert a new column in a dataset if a condition is met

Time: 2015-04-16 12:36:35

Tags: r

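I have a large dataset and want to insert a new column holding a binary value (0 or 1) depending on whether the following condition is met.

If df1$seg.mean >= 0.5 and df1$id == "gain", or df1$seg.mean <= -0.5 and df1$id == "loss", insert 1 in df1$Occurance. For rows that do not meet this condition, set df1$Occurance to 0.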

df1 <-
    Chr start       end     num.mark    seg.mean    id
    1   68580000    68640000    8430    0.7       gain
    1   115900000   116260000   8430    0.0039    loss
    1   173500000   173680000   5      -1.7738    loss
    1   173500000   173680000   12       0.011    loss
    1   173840000   174010000   6      -1.6121    loss
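
The table above is printed output rather than runnable code. As a minimal sketch, the same five rows could be rebuilt as follows so the answers can be tried directly. The column types, and the choice to keep id as character rather than factor, are assumptions, since the question does not show how df1 was created.

```r
# Rebuild the question's example rows as a data frame.
# Assumption: id is a character column; the original may use a factor.
df1 <- data.frame(
  Chr      = c(1, 1, 1, 1, 1),
  start    = c(68580000, 115900000, 173500000, 173500000, 173840000),
  end      = c(68640000, 116260000, 173680000, 173680000, 174010000),
  num.mark = c(8430, 8430, 5, 12, 6),
  seg.mean = c(0.7, 0.0039, -1.7738, 0.011, -1.6121),
  id       = c("gain", "loss", "loss", "loss", "loss"),
  stringsAsFactors = FALSE
)

# Inspect the structure to confirm column names and types.
str(df1)
```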

Desired output

    Chr start       end     num.mark    seg.mean  id        Occurance
    1   68580000    68640000    8430    0.7       gain      1
    1   115900000   116260000   8430    0.0039    loss      0
    1   173500000   173680000   5      -1.7738    loss      1
    1   173500000   173680000   12       0.011    loss      0
    1   173840000   174010000   6      -1.6121    loss      1

3 answers:

Answer 0: (score: 4)

Try using ifelse:

df1$Occurance <- ifelse((df1$seg.mean >= 0.5 & df1$id == "gain") | 
                      (df1$seg.mean <= -0.5 & df1$id == "loss"), 1, 0)

Edit: to avoid ifelse, and to avoid typing df1 every time, use transform (within works similarly):

# transform() returns a modified copy, so assign the result back to df1
df1 <- transform(df1, Occurance = as.numeric((seg.mean >= 0.5 & id == "gain") |
                                               (seg.mean <= -0.5 & id == "loss")))

Comment: if you are happy with TRUE/FALSE instead of 1/0, you can skip as.numeric.
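
To illustrate the comment above, here is a small sketch that keeps the column as a logical vector. It uses a cut-down copy of the example data (only seg.mean and id), which is an assumption made to keep the snippet self-contained.

```r
# Cut-down copy of the example data (assumption: only the two columns needed).
df1 <- data.frame(
  seg.mean = c(0.7, 0.0039, -1.7738, 0.011, -1.6121),
  id       = c("gain", "loss", "loss", "loss", "loss"),
  stringsAsFactors = FALSE
)

# Without as.numeric() the new column stays logical (TRUE/FALSE).
df1 <- transform(df1, Occurance = (seg.mean >= 0.5 & id == "gain") |
                                  (seg.mean <= -0.5 & id == "loss"))

class(df1$Occurance)   # type of the new column
sum(df1$Occurance)     # TRUE counts as 1, so this counts matching rows
df1[df1$Occurance, ]   # a logical column can be used directly to subset rows
```

A logical column is often enough for counting and filtering; converting to 0/1 mainly matters when a downstream tool expects numbers.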

Edit #2: if you want more than two outcomes, such as -1, 0 and 1, you can do the following:

df1$Occurance <- 0
# within() also returns a modified copy, so assign the result back
df1 <- within(df1, {Occurance[seg.mean >= 0.5 & id == "gain"] <- 1
                    Occurance[seg.mean <= -0.5 & id == "loss"] <- -1})

which gives

  Chr     start       end num.mark seg.mean   id Occurance
1   1  68580000  68640000     8430   0.7000 gain         1
2   1 115900000 116260000     8430   0.0039 loss         0
3   1 173500000 173680000        5  -1.7738 loss        -1
4   1 173500000 173680000       12   0.0110 loss         0
5   1 173840000 174010000        6  -1.6121 loss        -1
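
As an alternative sketch for the three-valued case, the two conditions can be subtracted from each other, relying on R coercing TRUE/FALSE to 1/0 in arithmetic. The helper names hit_gain and hit_loss are hypothetical and not part of the original answer, and the data is a cut-down copy of the example.

```r
# Cut-down copy of the example data (assumption: only the two columns needed).
df1 <- data.frame(
  seg.mean = c(0.7, 0.0039, -1.7738, 0.011, -1.6121),
  id       = c("gain", "loss", "loss", "loss", "loss"),
  stringsAsFactors = FALSE
)

# Hypothetical helper vectors, one per condition.
hit_gain <- df1$seg.mean >= 0.5  & df1$id == "gain"
hit_loss <- df1$seg.mean <= -0.5 & df1$id == "loss"

# Logical minus logical: 1 for a gain hit, -1 for a loss hit, 0 otherwise.
# This works because the two conditions can never both be TRUE for one row.
df1$Occurance <- hit_gain - hit_loss

df1
```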

Answer 1: (score: 2)

Try this:

df1$Occurance <- ((df1$seg.mean >= 0.5 & df1$id == "gain") | 
                  (df1$seg.mean <= -0.5 & df1$id == "loss")) * 1

# TRUE*1 = 1
# FALSE*1 = 0

Answer 2: (score: -1)

You can also do it like this:

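# Set matching rows to 1 (column spelled Occurance, as in the question)
df1$Occurance[with(df1, (seg.mean >= .5 & id == "gain") | (seg.mean <= -.5 & id == "loss"))] <- 1
# Rows that did not match are NA at this point; fill them with 0
df1$Occurance[is.na(df1$Occurance)] <- 0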
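
To show why the second line is needed, the sketch below runs the same two steps on a cut-down copy of the example data and saves the intermediate state. The names cond and before_fill are hypothetical, introduced only for illustration.

```r
# Cut-down copy of the example data (assumption: only the two columns needed).
df1 <- data.frame(
  seg.mean = c(0.7, 0.0039, -1.7738, 0.011, -1.6121),
  id       = c("gain", "loss", "loss", "loss", "loss"),
  stringsAsFactors = FALSE
)

# Hypothetical name for the row condition.
cond <- with(df1, (seg.mean >= 0.5 & id == "gain") |
                  (seg.mean <= -0.5 & id == "loss"))

# Step 1: assigning into a column that does not exist yet creates it,
# leaving NA in every row that was not selected.
df1$Occurance[cond] <- 1
before_fill <- df1$Occurance   # saved only to inspect the intermediate state
print(before_fill)

# Step 2: replace the remaining NA values with 0.
df1$Occurance[is.na(df1$Occurance)] <- 0
print(df1$Occurance)
```

One thing to keep in mind with this approach: if the column already holds real NA values for other reasons, the second step would overwrite those with 0 as well.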