How can I convert this vertical (long) dataset into a horizontal (wide) one?

Time: 2019-09-10 05:38:23

Tags: r transpose data-handling

I want to change this dataset:

        id PTMIINDT PTMIINTM DGOTDIAG DGOTDGGB
1: ys00000001 20160101      614     R060        1
2: ys00000002 20160101      640    S0090        1
3: ys00000002 20160101      640     A090        2
4: ys00000003 20160101      959      R42        1
5: ys00000007 20160101     1111    S0600        1
6: ys00000008 20160101     1253     R558        1

into this dataset:

         id     PTMIINDT PTMIINTM DGOTDIAG01 DGOTDGGB01 DGOTDIAG02  DGOTDGGB02
1 ys00000001      20160101     614      R060          1         NA          NA
2 ys00000002      20160101     640     S0090          1       A090           2
.     .               .         .
.     .               .         .
.     .               .         .

Like this.

I tried to build this dataset with the mutate function, but it did not work well. How can I reshape the dataset?

ba<-n6 %>% group_by(id,PTMIINDT,PTMIINTM) %>% 
  mutate(DGOTDIAG01=DGOTDIAG, DGOTDIAG02=DGOTDIAG, DGOTDGGB01=DGOTDGGB,DGOTDGGB02=DGOTDGGB)


3 answers:

Answer 0: (score: 1)

The development version of tidyr has a new verb, pivot_wider, which is better suited to this task.

https://tidyr.tidyverse.org/dev/articles/pivot.html

In the meantime, you can gather, transform, and spread.
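
A minimal sketch of that gather, transform, spread route, not necessarily what the answerer wrote. It assumes n6 holds the six rows shown in the question (rebuilt by hand here), and idx, variable, value and name are hypothetical helper column names. unite() is used as the "transform" step to glue the variable name to a zero-padded sequence number.

```r
library(dplyr)
library(tidyr)

# Assumption: n6 rebuilt from the six rows shown in the question.
n6 <- data.frame(
  id = c("ys00000001", "ys00000002", "ys00000002",
         "ys00000003", "ys00000007", "ys00000008"),
  PTMIINDT = 20160101,
  PTMIINTM = c(614, 640, 640, 959, 1111, 1253),
  DGOTDIAG = c("R060", "S0090", "A090", "R42", "S0600", "R558"),
  DGOTDGGB = c(1, 1, 2, 1, 1, 1),
  stringsAsFactors = FALSE
)

wide <- n6 %>%
  # Number the diagnoses within each visit: "01", "02", ...
  group_by(id, PTMIINDT, PTMIINTM) %>%
  mutate(idx = sprintf("%02d", row_number())) %>%
  ungroup() %>%
  # Both value columns must share a type before they are stacked.
  mutate(DGOTDGGB = as.character(DGOTDGGB)) %>%
  # Long format: one row per (visit, idx, variable).
  gather(key = "variable", value = "value", DGOTDIAG, DGOTDGGB) %>%
  # Build the target column names, e.g. DGOTDIAG01.
  unite("name", variable, idx, sep = "") %>%
  # Back to wide; convert = TRUE turns the DGOTDGGB columns numeric again.
  spread(name, value, convert = TRUE)

print(wide)
```

Visits with only one diagnosis get NA in the 02 columns, because spread() fills missing combinations with NA.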

Answer 1: (score: 1)

With data.table::dcast(), this can be done with the following one-liner.

library(data.table)

Sample data

dt <- data.table::fread("id PTMIINDT PTMIINTM DGOTDIAG DGOTDGGB
ys00000001 20160101      614     R060        1
ys00000002 20160101      640    S0090        1
ys00000002 20160101      640     A090        2
ys00000003 20160101      959      R42        1
ys00000007 20160101     1111    S0600        1
ys00000008 20160101     1253     R558        1")

Code

data.table::dcast( dt, id + PTMIINDT + PTMIINTM ~ DGOTDGGB, value.var = c("DGOTDIAG", "DGOTDGGB") )

Output

#            id PTMIINDT PTMIINTM DGOTDIAG_1 DGOTDIAG_2 DGOTDGGB.1_1 DGOTDGGB.1_2
# 1: ys00000001 20160101      614       R060       <NA>            1           NA
# 2: ys00000002 20160101      640      S0090       A090            1            2
# 3: ys00000003 20160101      959        R42       <NA>            1           NA
# 4: ys00000007 20160101     1111      S0600       <NA>            1           NA
# 5: ys00000008 20160101     1253       R558       <NA>            1           NA
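
The call above spreads on the values of DGOTDGGB, so the new column names depend on what that column happens to contain. If the zero-padded names from the question (DGOTDIAG01, DGOTDGGB01, ...) are wanted, one possible variant is to number the rows within each visit first and cast on that number. This is only a sketch: idx is a hypothetical helper column, and the sample data is retyped from the question.

```r
library(data.table)

# Assumption: same six rows as in the question.
dt <- data.table(
  id = c("ys00000001", "ys00000002", "ys00000002",
         "ys00000003", "ys00000007", "ys00000008"),
  PTMIINDT = 20160101L,
  PTMIINTM = c(614L, 640L, 640L, 959L, 1111L, 1253L),
  DGOTDIAG = c("R060", "S0090", "A090", "R42", "S0600", "R558"),
  DGOTDGGB = c(1L, 1L, 2L, 1L, 1L, 1L)
)

# rowid() counts occurrences within each visit; sprintf() zero-pads them.
dt[, idx := sprintf("%02d", rowid(id, PTMIINDT, PTMIINTM))]

# sep = "" joins the value column name and idx without an underscore.
wide_dt <- dcast(dt, id + PTMIINDT + PTMIINTM ~ idx,
                 value.var = c("DGOTDIAG", "DGOTDGGB"), sep = "")

print(wide_dt)
```

Casting on a per-visit counter rather than on DGOTDGGB also keeps working if two rows of the same visit share a DGOTDGGB value.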

Answer 2: (score: 0)

Using the latest version of tidyr (1.0.0, already on CRAN):

library(tidyr)
library(dplyr)
n6 %>% 
    group_by(id) %>% 
    dplyr::mutate(sbs = row_number()) %>% 
    pivot_wider(names_from = sbs, values_from = c(DGOTDIAG,DGOTDGGB))

# A tibble: 5 x 7
# Groups:   id [5]
  id         PTMIINDT PTMIINTM DGOTDIAG_1 DGOTDIAG_2 DGOTDGGB_1 DGOTDGGB_2
  <fct>         <dbl>    <dbl> <fct>      <fct>           <dbl>      <dbl>
1 ys00000001 20160101      614 R060       NA                  1         NA
2 ys00000002 20160101      640 S0090      A090                1          2
3 ys00000003 20160101      959 R42        NA                  1         NA
4 ys00000007 20160101     1111 S0600      NA                  1         NA
5 ys00000008 20160101     1253 R558       NA                  1         NA
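
The output above uses names such as DGOTDIAG_1, while the question asked for DGOTDIAG01. A possible adjustment, sketched under the assumption that n6 contains the six rows from the question (rebuilt here by hand): zero-pad the counter and pass names_sep = "" to pivot_wider. The helper column idx is hypothetical, and the grouping uses all three key columns so numbering restarts for each visit.

```r
library(dplyr)
library(tidyr)

# Assumption: n6 rebuilt from the six rows shown in the question.
n6 <- data.frame(
  id = c("ys00000001", "ys00000002", "ys00000002",
         "ys00000003", "ys00000007", "ys00000008"),
  PTMIINDT = 20160101,
  PTMIINTM = c(614, 640, 640, 959, 1111, 1253),
  DGOTDIAG = c("R060", "S0090", "A090", "R42", "S0600", "R558"),
  DGOTDGGB = c(1, 1, 2, 1, 1, 1),
  stringsAsFactors = FALSE
)

wide <- n6 %>%
  group_by(id, PTMIINDT, PTMIINTM) %>%
  # Zero-padded position of each diagnosis within its visit.
  mutate(idx = sprintf("%02d", row_number())) %>%
  # Drop the grouping so the result is a plain tibble.
  ungroup() %>%
  pivot_wider(names_from = idx,
              values_from = c(DGOTDIAG, DGOTDGGB),
              names_sep = "")

print(wide)
```

Calling ungroup() before pivoting avoids carrying the "Groups: id" attribute seen in the printed result above into later steps.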