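I am trying to take key-value combinations and put all of the values on the same row as their key. I'm pretty sure I knew how to do this at one point (I think with data.table), and I've been looking through the usual suspects (reshape2, tidyr, data.table, etc.), but I can't seem to figure out a simple solution.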
key1 = c(1,1,1,1,2,2,2,2)
key2 = c("A","A","B","B","C","C","D","D")
value = c("a","b","c","d","e","f","g","h")
kvframe = data.frame(key1,key2,value)
#  key1 key2 value
#1    1    A     a
#2    1    A     b
#3    1    B     c
#4    1    B     d
#5    2    C     e
#6    2    C     f
#7    2    D     g
#8    2    D     h
This is what I would like the table to look like:
#  key1 key2 value1 value2
#     1    A      a      b
#     1    B      c      d
#     2    C      e      f
#     2    D      g      h
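Most key1, key2 pairs have the same number of corresponding values, but not all of them do. I am hoping for a solution where the number of value columns equals the maximum number of values for any given key pair, with any pairs that have fewer values padded with NA.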
Answer 0 (score: 4)
You need a sequence column that numbers the values within each 'key1'/'key2' group.
library(data.table) # v1.9.5+
setDT(kvframe)[, Seq := paste0('value', 1:.N), by = .(key1, key2)] # generate Seq
dcast(kvframe, key1 + key2 ~Seq, value.var = 'value') # cast from long to wide
#   key1 key2 value1 value2
#1:    1    A      a      b
#2:    1    B      c      d
#3:    2    C      e      f
#4:    2    D      g      h
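The question also asks about key pairs with differing numbers of values. Below is a minimal sketch of the same data.table approach on uneven groups. The `uneven` table is hypothetical data made up for illustration, not from the question. As far as I know, `dcast` leaves the missing combinations as NA by default.

```r
library(data.table)

# Hypothetical data: the (1, "A") pair has three values, (1, "B") only one
uneven <- data.table(
  key1  = c(1, 1, 1, 1, 2, 2),
  key2  = c("A", "A", "A", "B", "C", "C"),
  value = c("a", "b", "c", "d", "e", "f")
)

# Number the values within each key pair: value1, value2, ...
uneven[, Seq := paste0("value", seq_len(.N)), by = .(key1, key2)]

# Cast from long to wide; pairs with fewer values get NA in the extra columns
wide <- dcast(uneven, key1 + key2 ~ Seq, value.var = "value")
print(wide)
```

Note that the cast columns are ordered by name, so with ten or more values per pair `value10` would sort before `value2`; zero-padding the sequence number is one way around that.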
Or using reshape from base R:
d1 <- transform(kvframe, Seq=ave(seq_along(value),
key1, key2, FUN=seq_along))
reshape(d1, idvar=c('key1', 'key2'), timevar='Seq', direction='wide')
#  key1 key2 value.1 value.2
#1    1    A       a       b
#3    1    B       c       d
#5    2    C       e       f
#7    2    D       g       h
Or with tidyr:
library(tidyr)
spread(d1, Seq, value)
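In newer tidyr releases (1.0.0 and later, as far as I know), `spread` is superseded by `pivot_wider`. A rough equivalent, assuming such a tidyr version is installed, might look like the sketch below; it rebuilds the example data so it runs on its own, and `names_prefix` is used here only to get `value1`, `value2` column names instead of `1`, `2`.

```r
library(tidyr)

# Same example data as in the question
kvframe <- data.frame(
  key1  = c(1, 1, 1, 1, 2, 2, 2, 2),
  key2  = c("A", "A", "B", "B", "C", "C", "D", "D"),
  value = c("a", "b", "c", "d", "e", "f", "g", "h")
)

# Sequence number of each value within its key1/key2 group
kvframe$Seq <- ave(seq_along(kvframe$value),
                   kvframe$key1, kvframe$key2, FUN = seq_along)

# One column per sequence number; missing combinations would become NA
wide <- pivot_wider(kvframe, names_from = Seq, values_from = value,
                    names_prefix = "value")
print(wide)
```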