I'm trying to reshape a data frame with dplyr but can't work it out. Maybe one of you can help :)
df <- data.frame(
  gene = c("ABC", "ABC", "AA", "AB", "AC", "DD", "DE", "AA", "AR", "ABC"),
  genotype = c("ht", "cpht", "ht", "cpht", "hm", "hm", "cpht", "ht", "hm", "cpht"),
  consequence = c("utr3", "miss", "miss", "stop", "utr5", "miss", "stop", "miss", "utr3", "utr5")
)
This should be easy to do with dplyr, but I can't get it to work. Maybe one of you can?
Many thanks! Sebastian
Answer 0 (score: 2)
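You could try the following, which collapses the consequences for each gene/genotype pair into one comma-separated string and then spreads the genotypes into columns: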
library(dplyr)
library(tidyr)  # spread() comes from tidyr, not dplyr

df %>%
  group_by(gene, genotype) %>%
  summarise(consequence = paste(consequence, collapse = ",")) %>%
  spread(genotype, consequence)
# A tibble: 7 x 4
# Groups:   gene [7]
#  gene  cpht      hm    ht
#  <fct> <chr>     <chr> <chr>
#1 AA    <NA>      <NA>  miss,miss
#2 AB    stop      <NA>  <NA>
#3 ABC   miss,utr5 <NA>  utr3
#4 AC    <NA>      utr5  <NA>
#5 AR    <NA>      utr3  <NA>
#6 DD    <NA>      miss  <NA>
#7 DE    stop      <NA>  <NA>
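As a possible alternative, the same reshaping can be sketched with `pivot_wider()`, which supersedes `spread()` in newer tidyr releases. This is only a sketch under assumptions: it assumes tidyr 1.1.0 or later (where `values_fn` accepts a single function), and the name `wide` is introduced here purely for illustration. Passing a collapsing function as `values_fn` makes the separate `group_by()`/`summarise()` step unnecessary.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
df <- data.frame(
  gene = c("ABC", "ABC", "AA", "AB", "AC", "DD", "DE", "AA", "AR", "ABC"),
  genotype = c("ht", "cpht", "ht", "cpht", "hm", "hm", "cpht", "ht", "hm", "cpht"),
  consequence = c("utr3", "miss", "miss", "stop", "utr5", "miss", "stop", "miss", "utr3", "utr5")
)

# `wide` is a hypothetical name used only in this sketch.
wide <- df %>%
  pivot_wider(
    names_from = genotype,      # one column per genotype
    values_from = consequence,  # cell contents come from consequence
    # collapse duplicates within a gene/genotype cell into one string
    values_fn = function(x) paste(x, collapse = ",")
  ) %>%
  arrange(gene)                 # sort rows by gene

wide
```

Gene/genotype combinations that do not occur in the data are left as `NA`, as in the `spread()` version. The genotype columns appear in order of first appearance in the data rather than alphabetically.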
Your data, as given in your post:
df <- data.frame(
  gene = c("ABC", "ABC", "AA", "AB", "AC", "DD", "DE", "AA", "AR", "ABC"),
  genotype = c("ht", "cpht", "ht", "cpht", "hm", "hm", "cpht", "ht", "hm", "cpht"),
  consequence = c("utr3", "miss", "miss", "stop", "utr5", "miss", "stop", "miss", "utr3", "utr5")
)
df
#   gene genotype consequence
#1   ABC       ht        utr3
#2   ABC     cpht        miss
#3    AA       ht        miss
#4    AB     cpht        stop
#5    AC       hm        utr5
#6    DD       hm        miss
#7    DE     cpht        stop
#8    AA       ht        miss
#9    AR       hm        utr3
#10  ABC     cpht        utr5