Reshaping a file: fewer rows, more columns

Asked: 2018-11-27 16:03:19

Tags: r reshape

I have a file that basically looks like this:

1  A 
2  A 
2  B 
3  A 
3  B 
3  C 
4  A 
4  C 
...

I would like a file that looks like this:

1  A 
2  A  B 
3  A  B  C 
4  A  C 
...

I tried using reshape in R, but it did not work:

reshape(df, idvar = "V1", timevar = "V2", direction = "wide")

It produces the following warnings:

In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :  multiple rows match for V2=A: first taken 
In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :  multiple rows match for V2=B: first taken   
In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :  multiple rows match for V2=C: first taken 

A solution in R or on the Linux command line would be highly appreciated. Thanks!

1 Answer:

Answer 0: (score: 0)

df <- read.table(header=FALSE, stringsAsFactors=FALSE, text="
1  A 
2  A 
2  B 
3  A 
3  B 
3  C 
4  A 
4  C ")

Method 1: dplyr

library(dplyr)
library(tidyr)
df %>%
  group_by(V1) %>%
  mutate(rn = row_number()) %>%
  spread(rn, V2)
# # A tibble: 4 x 4
# # Groups:   V1 [4]
#      V1 `1`   `2`   `3`  
#   <int> <chr> <chr> <chr>
# 1     1 A     <NA>  <NA> 
# 2     2 A     B     <NA> 
# 3     3 A     B     C    
# 4     4 A     C     <NA> 
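In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). The following is a minimal sketch of the same approach with the newer function, assuming the sample df from above. The ungroup() call is an addition so that the result is a plain tibble rather than a grouped one.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the answer.
df <- read.table(header = FALSE, stringsAsFactors = FALSE, text = "
1  A
2  A
2  B
3  A
3  B
3  C
4  A
4  C")

wide <- df %>%
  group_by(V1) %>%
  mutate(rn = row_number()) %>%   # position of each row within its V1 group
  ungroup() %>%
  pivot_wider(names_from = rn, values_from = V2)

print(wide)
```

The columns come out named `1`, `2`, `3`, as with spread(), and groups with fewer entries are padded with NA.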

Method 2: data.table

library(data.table)
DT <- as.data.table(df)[, rn := seq_len(.N), by = "V1"]
dcast(DT, V1 ~ rn, value.var = "V2")
#    V1 1    2    3
# 1:  1 A <NA> <NA>
# 2:  2 A    B <NA>
# 3:  3 A    B    C
# 4:  4 A    C <NA>
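Both methods rely on the same trick: number the rows within each V1 group and spread on that counter instead of on V2. As a rough sketch, the same idea can be applied to the reshape() call from the question in base R. It assumes the sample data above; the column name rn is only borrowed from the two methods and is not required by reshape().

```r
# Same sample data as in the answer.
df <- read.table(header = FALSE, stringsAsFactors = FALSE, text = "
1  A
2  A
2  B
3  A
3  B
3  C
4  A
4  C")

# Position of each row within its V1 group (1, 2, 3, ...).
df$rn <- ave(seq_along(df$V1), df$V1, FUN = seq_along)

# rn is unique within each V1, so every (V1, rn) pair matches exactly one row.
wide <- reshape(df, idvar = "V1", timevar = "rn", direction = "wide")

print(wide)
```

Here reshape() names the new columns V2.1, V2.2 and V2.3, and groups with fewer entries are padded with NA. To get a file back, something like write.table(wide, quote = FALSE, row.names = FALSE, col.names = FALSE, na = "") should work.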