I have a dataset with 300 columns and 1000 rows and a corresponding codebook, both in data.table format. For simplicity, I will show only 3 columns of each.
library(data.table)
dt <- data.table(id = 1:10,
                 a = sample(c(1, 2, 3), 10, replace = TRUE),
                 b = sample(c(1, 2), 10, replace = TRUE),
                 c = sample(1:5, 10, replace = TRUE))
id a b c
1: 1 2 1 2
2: 2 2 1 1
3: 3 3 1 1
4: 4 3 1 1
5: 5 1 2 5
6: 6 2 1 3
7: 7 1 2 3
8: 8 1 1 2
9: 9 2 1 5
10: 10 3 2 4
cb <- data.table(var = c(rep("a", 3), rep("b", 2), rep("c", 5)),
val = c(1,2,3,1,2,1,2,3,4,5),
des = c("red", "blue", "yellow", "yes","no","K", "Na","Ag","Au","Si"))
var val des
1: a 1 red
2: a 2 blue
3: a 3 yellow
4: b 1 yes
5: b 2 no
6: c 1 K
7: c 2 Na
8: c 3 Ag
9: c 4 Au
10: c 5 Si
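In cb, var is the corresponding variable in dt, and val is a value in dt whose description is given in des. I want to modify dt by replacing its values with the descriptions from cb. It should look like: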
id a b c
1: 1 red yes Na
2: 2 yellow no Ag
3: 3 blue yes Ag
4: 4 red yes Au
5: 5 blue yes Ag
6: 6 blue no Au
7: 7 yellow yes Si
8: 8 blue no Ag
9: 9 red no K
10: 10 yellow no Ag
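How can I perform this operation efficiently, without making my computer sound like it has pistons built in?
The reason is that I have pre-written code for analysing the data, and it needs the actual values to run. A solution may also be useful in general, since I often receive data together with a codebook, although they usually do not have this many variables.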
Answer 0 (score: 3)
You can try:
dcast(melt(dt, id.vars = "id", variable.name = "var", value.name = "val")[cb, on = c("var", "val")], id ~ var, value.var = "des")
# id a b c
# 1: 1 red yes K
# 2: 2 yellow no Si
# 3: 3 red yes Si
# 4: 4 red no Au
# 5: 5 red no Ag
# 6: 6 blue yes K
# 7: 7 blue no Si
# 8: 8 yellow yes Na
# 9: 9 blue yes Ag
# 10: 10 yellow yes Si
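The one-liner above packs three steps into a single expression. A minimal sketch of the same idea, split into stages, may make it easier to follow. The intermediate names `long` and `wide` are introduced here for illustration only. It also differs from the one-liner in one respect, which is my assumption about what is wanted: it uses an update join from the data side, so codebook entries that never occur in the data cannot produce a row with `id = NA`.

```r
library(data.table)

set.seed(1)
dt <- data.table(id = 1:10,
                 a = sample(c(1, 2, 3), 10, replace = TRUE),
                 b = sample(c(1, 2), 10, replace = TRUE),
                 c = sample(1:5, 10, replace = TRUE))
cb <- data.table(var = c(rep("a", 3), rep("b", 2), rep("c", 5)),
                 val = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 5),
                 des = c("red", "blue", "yellow", "yes", "no",
                         "K", "Na", "Ag", "Au", "Si"))

# Step 1: reshape to long format, one row per (id, variable, code).
# melt() may warn that integer and double columns are mixed; the
# coercion to double is harmless for these small integer codes.
long <- melt(dt, id.vars = "id", variable.name = "var", value.name = "val",
             variable.factor = FALSE)

# Step 2: look up each (var, val) pair in the codebook and attach
# its description; codes missing from the codebook stay NA.
long[cb, on = c("var", "val"), des := i.des]

# Step 3: reshape back to wide format, now holding the descriptions.
wide <- dcast(long, id ~ var, value.var = "des")
print(wide)
```

With 300 columns and 1000 rows the long table has about 300,000 rows, which should be a comfortable size for data.table.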
Answer 1 (score: 3)
Another option is to do several joins, updating one column at a time:
cb_dc <- data.table::dcast(cb, des~var, value.var = "val")
cols = c("a","b","c")
dt[, (cols) := lapply(cols, function(x) cb_dc[dt, des, on = x]) ]
# id a b c
#1: 1 red yes Si
#2: 2 blue yes Na
#3: 3 blue no Au
#4: 4 yellow yes K
#5: 5 red no Na
#6: 6 yellow yes Na
#7: 7 yellow no K
#8: 8 blue no Na
#9: 9 blue yes Si
#10: 10 red no Na
Data:
set.seed(1)
dt <- data.table(id = 1:10,
a = sample(c(1,2,3),10, replace = T),
b = sample(c(1,2) ,10, replace = T),
c = sample(c(1:5) ,10, replace = T))
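With 300 columns, hard-coding `cols` is impractical. The following is a hedged sketch of the same join-and-update idea that takes the column names from the codebook itself and skips the `dcast` of `cb`. The names `out`, `lookup`, and `new_col` are hypothetical, and the sketch assumes each (var, val) pair appears only once in the codebook.

```r
library(data.table)

set.seed(1)
dt <- data.table(id = 1:10,
                 a = sample(c(1, 2, 3), 10, replace = TRUE),
                 b = sample(c(1, 2), 10, replace = TRUE),
                 c = sample(1:5, 10, replace = TRUE))
cb <- data.table(var = c(rep("a", 3), rep("b", 2), rep("c", 5)),
                 val = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 5),
                 des = c("red", "blue", "yellow", "yes", "no",
                         "K", "Na", "Ag", "Au", "Si"))

# Only recode columns that are described in the codebook.
cols <- intersect(unique(cb$var), names(dt))

# Work on a copy so the original numeric codes in dt are kept.
out <- copy(dt)

for (v in cols) {
  # Sub-table of the codebook for this variable: code -> description.
  lookup <- cb[var == v, .(val, des)]
  # For every row of out, fetch the description whose val equals column v.
  new_col <- lookup[out, on = c(val = v), des]
  # Replace the whole column; its type changes from numeric to character.
  set(out, j = v, value = new_col)
}

print(out)
```

Drop the `copy()` and update `dt` directly if keeping the original codes is not needed.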
Answer 2 (score: 1)
This dplyr answer essentially joins a sub-table of the codebook once for each of the three columns.
library(dplyr)
dt %>%
left_join(cb %>% filter(var == "a"), by=c("a" = "val")) %>%
left_join(cb %>% filter(var == "b"), by=c("b" = "val")) %>%
left_join(cb %>% filter(var == "c"), by=c("c" = "val")) %>%
select(id, des.x, des.y, des) %>%
rename(a = des.x, b = des.y, c = des)
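Writing one `left_join` per column by hand does not scale to 300 columns. A possible generalisation, sketched under the assumption that plain data frames are acceptable and that each (var, val) pair is unique in the codebook, loops over the variables named in `cb`. The names `out` and `lookup` are introduced here for illustration only.

```r
library(dplyr)

set.seed(1)
dt <- data.frame(id = 1:10,
                 a = sample(c(1, 2, 3), 10, replace = TRUE),
                 b = sample(c(1, 2), 10, replace = TRUE),
                 c = sample(1:5, 10, replace = TRUE))
cb <- data.frame(var = c(rep("a", 3), rep("b", 2), rep("c", 5)),
                 val = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 5),
                 des = c("red", "blue", "yellow", "yes", "no",
                         "K", "Na", "Ag", "Au", "Si"),
                 stringsAsFactors = FALSE)

out <- dt
for (v in intersect(unique(cb$var), names(dt))) {
  # Codebook rows for this variable only: code -> description.
  lookup <- cb %>%
    filter(var == v) %>%
    select(val, des)

  out <- out %>%
    left_join(lookup, by = setNames("val", v)) %>%  # match column v against val
    mutate(!!v := des) %>%                          # overwrite codes with descriptions
    select(-des)                                    # drop the helper column
}

print(out)
```

Overwriting each column inside the loop avoids the `des.x` / `des.y` suffixes that the hand-written version has to rename afterwards.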