I have a data frame:
types <- c("ENR","ENR","ENR","ENR","ENR","ENR")
records <- c(1,1,1,1,2,2)
occur <- c(1,2,3,4,1,2)
myval <- c("ABC|123","DEF|456","GHI|789","JKL|123","MNO|456","PQR|789")
mydf <- data.frame(types, records, occur, myval)
types records occur myval
ENR   1       1     ABC|123
ENR   1       2     DEF|456
ENR   1       3     GHI|789
ENR   1       4     JKL|123
ENR   2       1     MNO|456
ENR   2       2     PQR|789
I am parsing the myval column so that each delimited field gets its own column. This is what I have so far:
library(tidyr)
mydf <- mydf %>% separate(myval, c("letters","numbers"),"\\|")
This mostly works; it produces:
types records occur letters numbers
1 ENR 1 1 ABC 123
2 ENR 1 2 DEF 456
3 ENR 1 3 GHI 789
4 ENR 1 4 JKL 123
5 ENR 2 1 MNO 456
6 ENR 2 2 PQR 789
However, I would like the column names to be dynamic, based on the value of occur, so ideally I would get something like this:
types records occur letters1 numbers1 letters2 numbers2 letters3 numbers3 letters4 numbers4
ENR   1       1     ABC      123
ENR   1       2                       DEF      456
ENR   1       3                                         GHI      789
ENR   1       4                                                           JKL      123
ENR   2       1     MNO      456
ENR   2       2                       PQR      789
Any ideas on how to achieve this? I was wondering whether naming the columns dynamically might work.
Answer 0 (score: 1)
You can use tidyr::spread():
mydf %>% dplyr::mutate(letters_ = occur, numbers_ = occur) %>%
spread(letters_, letters, fill = "", sep = "") %>%
spread(numbers_, numbers, fill = "", sep = "")
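To keep the original occur variable, I copy it twice (as letters_ and numbers_) and then use spread() to pivot the letters and numbers values by those copies.
Note that when the sep argument is set, each new column name is built by pasting the key column's name and the key value together, so letters_ and 1 give letters_1. The fill argument is only there to get the desired output: it replaces the NA of missing combinations with an empty string.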
  types records occur letters_1 letters_2 letters_3 letters_4 numbers_1 numbers_2 numbers_3 numbers_4
1 ENR   1       1     ABC                                     123
2 ENR   1       2               DEF                                     456
3 ENR   1       3                         GHI                                     789
4 ENR   1       4                                   JKL                                     123
5 ENR   2       1     MNO                                     456
6 ENR   2       2               PQR                                     456
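As a possible alternative to the two spread() calls, here is a minimal sketch using tidyr::pivot_wider(), which is not part of the answer above and assumes tidyr 1.1.0 or later. The helper column occ is a hypothetical name introduced only so that the original occur column survives the pivot.

```r
library(dplyr)
library(tidyr)

# Same data as in the question
mydf <- data.frame(
  types   = rep("ENR", 6),
  records = c(1, 1, 1, 1, 2, 2),
  occur   = c(1, 2, 3, 4, 1, 2),
  myval   = c("ABC|123", "DEF|456", "GHI|789", "JKL|123", "MNO|456", "PQR|789"),
  stringsAsFactors = FALSE
)

wide <- mydf %>%
  # Split "ABC|123" into two columns on the literal pipe
  separate(myval, c("letters", "numbers"), sep = "\\|") %>%
  # occ is a hypothetical copy of occur, used as the pivot key
  mutate(occ = occur) %>%
  # Pivot both value columns at once; names become letters_1, numbers_1, ...
  pivot_wider(
    names_from  = occ,
    values_from = c(letters, numbers),
    values_fill = list(letters = "", numbers = "")
  )

print(wide)
```

On tidyr 1.2.0 or later, adding names_vary = "slowest" and names_sep = "" to the pivot_wider() call should give the interleaved letters1, numbers1, letters2, numbers2 layout asked for in the question.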
Answer 1 (score: 1)
We can use dcast from data.table, whose value.var argument can take multiple columns:
library(data.table)
dcast(setDT(mydf), types + records + occur ~ occur, value.var = c("letters", "numbers"), fill="")
#   types records occur letters_1 letters_2 letters_3 letters_4 numbers_1 numbers_2 numbers_3 numbers_4
#1: ENR   1       1     ABC                                     123
#2: ENR   1       2               DEF                                     456
#3: ENR   1       3                         GHI                                     789
#4: ENR   1       4                                   JKL                                     123
#5: ENR   2       1     MNO                                     456
#6: ENR   2       2               PQR                                     789
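The dcast result groups all letters columns before all numbers columns and puts an underscore in the names, whereas the question asked for letters1, numbers1, letters2, numbers2 and so on. The sketch below is one way to post-process it with setcolorder() and setnames(). The variable names wide, ids and paired are hypothetical, and the input is built directly with the already separated letters and numbers columns.

```r
library(data.table)

# The data after separate(), built directly as a data.table
mydf <- data.table(
  types   = rep("ENR", 6),
  records = c(1, 1, 1, 1, 2, 2),
  occur   = c(1, 2, 3, 4, 1, 2),
  letters = c("ABC", "DEF", "GHI", "JKL", "MNO", "PQR"),
  numbers = c("123", "456", "789", "123", "456", "789")
)

# Same dcast call as in the answer
wide <- dcast(mydf, types + records + occur ~ occur,
              value.var = c("letters", "numbers"), fill = "")

# Interleave the generated names: letters_1, numbers_1, letters_2, ...
ids    <- sort(unique(mydf$occur))
paired <- as.vector(rbind(paste0("letters_", ids), paste0("numbers_", ids)))

# Reorder by reference, then drop the underscore from the pivoted names
setcolorder(wide, c("types", "records", "occur", paired))
setnames(wide, paired, sub("_", "", paired, fixed = TRUE))

print(names(wide))
```

Both setcolorder() and setnames() modify wide in place, so no reassignment is needed.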