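I found a few discussions of this, but could not find a suitable solution in dplyr.
My main table has more than 50 columns, and there are 15 lookup tables, each with about 8-15 columns. I have many lookups to perform, and handling them with select statements (selecting columns, or dropping them with a minus sign) gets very messy, so I would like to be able to replace the column values dynamically.
Is this possible with dplyr? I have provided sample data below for better understanding.
I want to do a VLOOKUP (as in Excel), matching city in table against lcity in lookup, and replace the value of city with newcity.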
> table <- data.frame(name = c("a","b","c","d","e","f"), city = c("hyd","sbad","hyd","sbad","others","unknown"), rno = c(101,102,103,104,105,106),stringsAsFactors=FALSE)
> lookup <- data.frame(lcity = c("hyd","sbad","others","test"), newcity = c("nhyd","nsbad","nothers","ntest"), rating = c(10,20,40,55), newrating = c(100,200,400,550), stringsAsFactors = FALSE)
> table
name city rno
1 a hyd 101
2 b sbad 102
3 c hyd 103
4 d sbad 104
5 e others 105
6 f unknown 106
> lookup
lcity newcity rating newrating
1 hyd nhyd 10 100
2 sbad nsbad 20 200
3 others nothers 40 400
4 test ntest 55 550
My output table should be:
name city rno
1 a nhyd 101
2 b nsbad 102
3 c nhyd 103
4 d nsbad 104
5 e nothers 105
6 f <NA> 106
I tried the code below to update the values dynamically, but it produces another data frame/table rather than a character vector:
table$city <- select(left_join(table,lookup,by=c("city"="lcity")),"newcity")
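Answer 0 (score: 1)
One solution could be:
Note: The data the OP displays for lookup and the data created by the OP's command are different. I have used the lookup data that the OP displayed in tabular form.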
library(dplyr)
# Data from OP
table <- data.frame(name = c("a","b","c","d","e","f"),
                    city = c("hyd","sbad","hyd","sbad","others","unknown"),
                    rno = c(101,102,103,104,105,106),
                    stringsAsFactors = FALSE)
lookup <- data.frame(lcity = c("hyd","sbad","others","test"),
                     newcity = c("nhyd","nsbad","nothers","ntest"),
                     rating = c(10,20,40,55),
                     newrating = c(100,200,400,550),
                     stringsAsFactors = FALSE)
table %>%
  inner_join(lookup, by = c("city" = "lcity")) %>%
  mutate(city = newcity) %>%
  select(name, city, rno)
name city rno
1 a nhyd 101
2 b nsbad 102
3 c nhyd 103
4 d nsbad 104
5 e nothers 105
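Note that inner_join keeps only rows with a match in lookup, so row f (city "unknown") is missing from the output above, whereas the desired output in the question keeps it with <NA>. A minimal sketch of a variant that keeps unmatched rows follows; it swaps in left_join and assumes each lcity value appears only once in lookup (duplicate keys would multiply rows). The names result and table2 are introduced here for illustration only. The second part shows one way to get a character vector out of the attempt in the question, using pull() instead of select(), since select() always returns a data frame.

```r
library(dplyr)

# Same sample data as in the question
table <- data.frame(name = c("a","b","c","d","e","f"),
                    city = c("hyd","sbad","hyd","sbad","others","unknown"),
                    rno = c(101,102,103,104,105,106),
                    stringsAsFactors = FALSE)
lookup <- data.frame(lcity = c("hyd","sbad","others","test"),
                     newcity = c("nhyd","nsbad","nothers","ntest"),
                     rating = c(10,20,40,55),
                     newrating = c(100,200,400,550),
                     stringsAsFactors = FALSE)

# left_join keeps every row of `table`; cities without a match
# in lookup get NA in newcity, and therefore NA in city
result <- table %>%
  left_join(lookup, by = c("city" = "lcity")) %>%
  mutate(city = newcity) %>%
  select(name, city, rno)

# Variant closer to the attempt in the question:
# pull() extracts the column as a vector, so the assignment works
table2 <- table
table2$city <- table %>%
  left_join(lookup, by = c("city" = "lcity")) %>%
  pull(newcity)
```

With many lookup tables, the same three-step pattern (join, overwrite, select the original columns) can be repeated per lookup, which avoids listing the lookup's extra columns just to drop them.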