Reshaping a df with dplyr

Time: 2018-10-29 09:46:32

Tags: r dplyr

I am trying to reshape a data frame with dplyr but can't work out how to do it. Maybe one of you can help :)

df <- data.frame(
  gene = c("ABC", "ABC", "AA", "AB", "AC", "DD", "DE", "AA", "AR", "ABC"),
  genotype = c("ht", "cpht", "ht", "cpht", "hm", "hm", "cpht", "ht", "hm", "cpht"),
  consequence = c("utr3", "miss", "miss", "stop", "utr5", "miss", "stop", "miss", "utr3", "utr5")
)

I want to create a new df that looks like this: (image: df I want)

This should be easy to do with dplyr, but I can't get it to work. Maybe one of you can?

Thanks a lot! Sebastian

1 Answer:

Answer 0: (score: 2)

You can try the following:

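library(dplyr)
library(tidyr)  # provides spread()

df %>%
  group_by(gene, genotype) %>%
  summarise(consequence = paste(consequence, collapse = ",")) %>%  # one comma-separated string per gene/genotype pair
  spread(genotype, consequence)  # one column per genotype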

## A tibble: 7 x 4
## Groups:   gene [7]  
#  gene  cpht      hm    ht       
#  <fct> <chr>     <chr> <chr>    
#1 AA    <NA>      <NA>  miss,miss
#2 AB    stop      <NA>  <NA>     
#3 ABC   miss,utr5 <NA>  utr3     
#4 AC    <NA>      utr5  <NA>     
#5 AR    <NA>      utr3  <NA>     
#6 DD    <NA>      miss  <NA>     
#7 DE    stop      <NA>  <NA>
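
In current tidyr releases `spread()` is superseded by `pivot_wider()`, so the same reshaping can be sketched with the newer function. This is only an illustration, not part of the original answer; it assumes tidyr >= 1.0.0 (for `pivot_wider()`) and dplyr >= 1.0.0 (for the `.groups` argument), and the name `wide` is introduced here purely for the example.

```r
library(dplyr)
library(tidyr)  # assumption: tidyr >= 1.0.0, which provides pivot_wider()

# Same data as in the question; strings kept as character rather than factor
df <- data.frame(
  gene = c("ABC", "ABC", "AA", "AB", "AC", "DD", "DE", "AA", "AR", "ABC"),
  genotype = c("ht", "cpht", "ht", "cpht", "hm", "hm", "cpht", "ht", "hm", "cpht"),
  consequence = c("utr3", "miss", "miss", "stop", "utr5", "miss", "stop", "miss", "utr3", "utr5"),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(gene, genotype) %>%
  # collapse repeated gene/genotype pairs into one comma-separated string
  summarise(consequence = paste(consequence, collapse = ","), .groups = "drop") %>%
  # one column per genotype; missing combinations become NA
  pivot_wider(names_from = genotype, values_from = consequence)

wide
```

The cell contents should match the `spread()` result above, although the column order may differ because `pivot_wider()` orders new columns by first appearance rather than alphabetically.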

Your data, as given in your post:

df <- data.frame(
  gene = c("ABC", "ABC", "AA", "AB", "AC", "DD", "DE", "AA", "AR", "ABC"),
  genotype = c("ht", "cpht", "ht", "cpht", "hm", "hm", "cpht", "ht", "hm", "cpht"),
  consequence = c("utr3", "miss", "miss", "stop", "utr5", "miss", "stop", "miss", "utr3", "utr5")
)
df
#   gene genotype consequence
#1   ABC       ht        utr3
#2   ABC     cpht        miss
#3    AA       ht        miss
#4    AB     cpht        stop
#5    AC       hm        utr5
#6    DD       hm        miss
#7    DE     cpht        stop
#8    AA       ht        miss
#9    AR       hm        utr3
#10  ABC     cpht        utr5
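
The data contains repeated gene/genotype pairs, which is presumably why the answer collapses `consequence` with `paste()` before spreading: `spread()` stops with an error when a key combination is not unique. A quick way to see which pairs repeat is sketched below; the name `dupes` is made up for this example.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  gene = c("ABC", "ABC", "AA", "AB", "AC", "DD", "DE", "AA", "AR", "ABC"),
  genotype = c("ht", "cpht", "ht", "cpht", "hm", "hm", "cpht", "ht", "hm", "cpht"),
  consequence = c("utr3", "miss", "miss", "stop", "utr5", "miss", "stop", "miss", "utr3", "utr5"),
  stringsAsFactors = FALSE
)

# Count rows per gene/genotype pair and keep only pairs that occur more than once
dupes <- df %>%
  count(gene, genotype) %>%
  filter(n > 1)

dupes
```

For this data the repeated pairs are AA/ht and ABC/cpht, each occurring twice, which corresponds to the two comma-separated cells in the answer's output.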