Add a row at a specific position in another data.frame

Time: 2016-11-17 23:09:33

Tags: r

This is one row of df2:

       PERSON_ID EVENT LATERALITY BEHAV EventAge     DATE
198 10000002174  C569          9     3       64  19890413

I want to insert this row into df1 at the appropriate position:

          PERSON_ID   DOB_rev exact_dob       DATE       exact EVENT LATERALITY               BEHAV
     56 10000002174 4/13/1925       Yes  4/13/1975         Yes   BC1          9 Malignant(Invasive)
     57 10000002174 4/13/1925       Yes 10/13/1975 No_from_age   L_B          .                   .
     58 10000002174 4/13/1925       Yes 10/13/1989 No_from_age   OV1          .                   .
     59 10000002174 4/13/1925       Yes 10/13/1989 No_from_age   OV2          .                   .
     60 10000002174 4/13/1925       Yes 10/13/1993 No_from_age DEATH          .                   .
     61 10000002174 4/13/1925       Yes   6/8/1998         Yes   EPI          .                   .        

The rows must stay in order of the "DATE" variable, so I want my output to be:

          PERSON_ID   DOB_rev exact_dob       DATE       exact EVENT LATERALITY               BEHAV
     56 10000002174 4/13/1925       Yes  4/13/1975         Yes   BC1          9 Malignant(Invasive)
     57 10000002174 4/13/1925       Yes 10/13/1975 No_from_age   L_B          .                   .
     58 10000002174 4/13/1925       Yes   19890413 No_from_age  C569          .                   .         
     58 10000002174 4/13/1925       Yes 10/13/1989 No_from_age   OV1          .                   .
     59 10000002174 4/13/1925       Yes 10/13/1989 No_from_age   OV2          .                   .
     60 10000002174 4/13/1925       Yes 10/13/1993 No_from_age DEATH          .                   .
     61 10000002174 4/13/1925       Yes   6/8/1998         Yes   EPI          .                   .        

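I have tried all sorts of approaches, but I ended up going down overly complicated routes and could not finish, for example taking the one row from df2, trying to append it to df1, and re-sorting by PERSON_ID and DATE. Can anyone give me any advice on how to solve this?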

1 answer:

Answer 0: (score: 0)

This uses lubridate (although it is not strictly necessary).

library(lubridate)

First, read in the data:

lines = '      PERSON_ID EVENT LATERALITY BEHAV EventAge     DATE
 10000002174  C569          9     3       64  19890413'

df2 <- read.table(text = lines, header = TRUE)

lines = 'PERSON_ID   DOB_rev exact_dob       DATE       exact EVENT LATERALITY               BEHAV
      10000002174 4/13/1925       Yes  4/13/1975         Yes   BC1          9 Malignant(Invasive)
      10000002174 4/13/1925       Yes 10/13/1975 No_from_age   L_B          .                   .
      10000002174 4/13/1925       Yes 10/13/1989 No_from_age   OV1          .                   .
      10000002174 4/13/1925       Yes 10/13/1989 No_from_age   OV2          .                   .
      10000002174 4/13/1925       Yes 10/13/1993 No_from_age DEATH          .                   .
      10000002174 4/13/1925       Yes   6/8/1998         Yes   EPI          .                   .'

df1 <- read.table(text = lines, header = TRUE)

Convert the dates to the same format (ymd and mdy come from lubridate):

df1$DATE <- mdy(df1$DATE)
df2$DATE <- ymd(df2$DATE)
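
Since lubridate is described above as not strictly necessary, here is a minimal base R sketch of the same conversion using `as.Date` with explicit format strings. The vectors `d1` and `d2` are illustrative stand-ins for `df1$DATE` and `df2$DATE`, not part of the original answer. The numeric value is converted to character first so that `as.Date` can parse it as a yyyymmdd string.

```r
# Stand-ins for df1$DATE (month/day/year strings) and df2$DATE (yyyymmdd number)
d1_raw <- c("4/13/1975", "10/13/1989", "6/8/1998")
d2_raw <- 19890413L

# Base R equivalents of mdy() and ymd() for these two layouts
d1 <- as.Date(d1_raw, format = "%m/%d/%Y")
d2 <- as.Date(as.character(d2_raw), format = "%Y%m%d")

# Both are now Date objects, so they compare and sort chronologically
d2 < d1[2]
```

Once both columns are of class Date, the ordering step works on real dates rather than on strings in two different layouts.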

Merge, keeping all rows, then sort by PERSON_ID and DATE:

df <- merge(df1, df2, all = TRUE)
df <- df[with(df, order(PERSON_ID, DATE)), ]
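
To illustrate why the merge behaves like an insert here: when no `by` argument is given, `merge` joins on the columns the two data frames have in common, and `all = TRUE` keeps rows that find no match on either side. The small data frames `a` and `b` below are hypothetical toy data, not the data from the question.

```r
# Toy data: two existing events and one new event for the same id
a <- data.frame(id = c(1, 1),
                DATE = as.Date(c("1975-04-13", "1989-10-13")),
                x = c("p", "q"),
                stringsAsFactors = FALSE)
b <- data.frame(id = 1,
                DATE = as.Date("1989-04-13"),
                y = 64)

# Columns shared by both data frames; merge() uses these as the default keys
keys <- intersect(names(a), names(b))

# The row from b matches no row in a, so all = TRUE keeps it as an extra row
m <- merge(a, b, all = TRUE)
m <- m[order(m$id, m$DATE), ]
m
```

Columns that exist in only one of the inputs (`x` and `y` here) are filled with NA for rows that came from the other input, which is the same pattern as the NA entries in the final output of the answer.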

Final output (with PERSON_ID converted to character to match your output):

df$PERSON_ID <- as.character(df$PERSON_ID)
df
    PERSON_ID       DATE EVENT LATERALITY               BEHAV   DOB_rev exact_dob       exact EventAge
1 10000002174 1975-04-13   BC1          9 Malignant(Invasive) 4/13/1925       Yes         Yes       NA
2 10000002174 1975-10-13   L_B          .                   . 4/13/1925       Yes No_from_age       NA
3 10000002174 1989-04-13  C569          9                <NA>      <NA>      <NA>        <NA>       64
4 10000002174 1989-10-13   OV1          .                   . 4/13/1925       Yes No_from_age       NA
5 10000002174 1989-10-13   OV2          .                   . 4/13/1925       Yes No_from_age       NA
6 10000002174 1993-10-13 DEATH          .                   . 4/13/1925       Yes No_from_age       NA
7 10000002174 1998-06-08   EPI          .                   . 4/13/1925       Yes         Yes       NA
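
The question also describes appending the row to df1 and re-sorting. A minimal sketch of that route with `rbind` is shown below, keeping only df1's columns as in the desired output. The data frames are trimmed, hypothetical stand-ins for df1 and df2 with dates already converted, and `new_row` and `combined` are names chosen for illustration only.

```r
# Trimmed stand-ins for df1 and df2 (DATE already converted to Date)
df1 <- data.frame(PERSON_ID = rep(10000002174, 3),
                  DATE = as.Date(c("1975-04-13", "1975-10-13", "1989-10-13")),
                  EVENT = c("BC1", "L_B", "OV1"),
                  exact = c("Yes", "No_from_age", "No_from_age"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(PERSON_ID = 10000002174,
                  DATE = as.Date("1989-04-13"),
                  EVENT = "C569",
                  EventAge = 64,
                  stringsAsFactors = FALSE)

# Keep only the columns df1 also has, then add df1's remaining columns as NA
new_row <- df2[1, intersect(names(df1), names(df2)), drop = FALSE]
new_row[setdiff(names(df1), names(new_row))] <- NA
new_row <- new_row[names(df1)]

# Append the row and restore chronological order within each person
combined <- rbind(df1, new_row)
combined <- combined[order(combined$PERSON_ID, combined$DATE), ]
rownames(combined) <- NULL
combined
```

Unlike the merge above, this variant drops columns that exist only in df2 (EventAge here), so which one fits depends on whether those extra columns are wanted in the result.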