R programming: extracting the V1 elements that match each value in the other columns

Time: 2018-07-09 08:44:18

Tags: r

I have the following data frame:

  

head(input)
   V1 V2 V3 V4 V5 V6
1  A  1  2  3  4  5
2  B  1  2 NA NA NA
3  C  3  5 NA NA NA
4  D  3 NA NA NA NA
5  E  4  5  6  1  8

I would like to get the result below (for each unique value in V2 to V6, list all the V1 elements whose rows contain it):

1   A   B   E
2   A   B
3   A   C   D
4   A   E
5   A   C   E
6   E
8   E

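I tried writing code for this in R, but I keep getting error messages. Could you help with this?

Thank you.

1 answer:

Answer 0: (score: 0)

One option is to gather the data into 'long' format, arrange the rows by 'val' (and then by 'V1'), create a sequence column within each 'val' group, and spread the result back into 'wide' format.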

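library(tidyverse)

# reshape V2:V6 into long format (one row per V1/value pair), dropping NAs
gather(df1, key, val, -V1, na.rm = TRUE) %>% 
      arrange(val, V1) %>%                          # sort by value, then by V1
      group_by(val)  %>% 
      mutate(rn = paste0("Col", row_number())) %>%  # position of each V1 within its value
      select(-key) %>%                              # the source column name is no longer needed
      spread(rn, V1)                                # one row per value, V1 entries across Col1, Col2, ...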
# A tibble: 7 x 4
# Groups:   val [7]
#    val Col1  Col2  Col3 
#  <int> <chr> <chr> <chr>
#1     1 A     B     E    
#2     2 A     B     NA   
#3     3 A     C     D    
#4     4 A     E     NA   
#5     5 A     C     E    
#6     6 E     NA    NA   
#7     8 E     NA    NA   
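
Since tidyr 1.0.0, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The following is a minimal sketch of the same idea with those functions, assuming tidyr >= 1.0.0 and dplyr are installed. The name `res` and the `distinct()` guard (which only matters if a row repeats a value) are additions for illustration and not part of the answer above; `df1` is rebuilt here so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

# Same data as df1 in the Data section, rebuilt so this snippet is self-contained
df1 <- data.frame(
  V1 = c("A", "B", "C", "D", "E"),
  V2 = c(1L, 1L, 3L, 3L, 4L),
  V3 = c(2L, 2L, 5L, NA, 5L),
  V4 = c(3L, NA, NA, NA, 6L),
  V5 = c(4L, NA, NA, NA, 1L),
  V6 = c(5L, NA, NA, NA, 8L),
  stringsAsFactors = FALSE
)

res <- df1 %>%
  # long format: one row per (V1, value) pair, NA values dropped
  pivot_longer(-V1, names_to = "key", values_to = "val",
               values_drop_na = TRUE) %>%
  # keep each (value, V1) pair once; this also drops the 'key' column
  distinct(val, V1) %>%
  arrange(val, V1) %>%
  group_by(val) %>%
  # running position of each V1 within its value
  mutate(rn = paste0("Col", row_number())) %>%
  ungroup() %>%
  # wide format: one row per value, V1 entries across Col1, Col2, ...
  pivot_wider(names_from = rn, values_from = V1)

print(res)
```

Unlike spread(), pivot_wider() orders the new columns by first appearance rather than alphabetically, so a hypothetical Col10 would land after Col9 instead of after Col1.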

Data

df1 <- structure(list(V1 = c("A", "B", "C", "D", "E"), V2 = c(1L, 1L, 
3L, 3L, 4L), V3 = c(2L, 2L, 5L, NA, 5L), V4 = c(3L, NA, NA, NA, 
6L), V5 = c(4L, NA, NA, NA, 1L), V6 = c(5L, NA, NA, NA, 8L)), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6"), class = "data.frame",
 row.names = c("1", "2", "3", "4", "5"))
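
For readers who prefer to avoid the tidyverse, the same grouping can be sketched in base R with split(). This is an alternative technique, not the method used in the answer; the names `vals`, `keys`, `groups`, and `wide` are hypothetical, and `df1` is rebuilt so the snippet runs on its own.

```r
# Same data as df1 above, rebuilt so this snippet is self-contained
df1 <- data.frame(
  V1 = c("A", "B", "C", "D", "E"),
  V2 = c(1L, 1L, 3L, 3L, 4L),
  V3 = c(2L, 2L, 5L, NA, 5L),
  V4 = c(3L, NA, NA, NA, 6L),
  V5 = c(4L, NA, NA, NA, 1L),
  V6 = c(5L, NA, NA, NA, 8L),
  stringsAsFactors = FALSE
)

# Flatten V2:V6 column by column, and repeat V1 once per value column
# so that keys[i] is the V1 label of the row that vals[i] came from
vals <- unlist(df1[-1], use.names = FALSE)
keys <- rep(df1$V1, times = ncol(df1) - 1)

# Group the V1 labels by value; split() drops entries whose value is NA
groups <- lapply(split(keys, vals), function(x) sort(unique(x)))

# Pad every group with NA to a common length and bind into a matrix
n <- max(lengths(groups))
wide <- do.call(rbind, lapply(groups, function(x) {
  c(x, rep(NA_character_, n - length(x)))
}))
colnames(wide) <- paste0("Col", seq_len(n))

print(wide)
```

The result is a character matrix whose row names are the distinct values, so `wide["3", ]` returns the V1 labels associated with the value 3.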