Using R

Time: 2018-02-10 19:21:25

Tags: r replace dplyr vlookup

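Although I found a few discussions on this, I could not find a suitable solution in dplyr.

My main table has more than 50 columns, and there are 15 lookup tables. Each lookup table has roughly 8-15 columns. I have multiple lookups to perform, and because it gets very messy with select statements (either selecting columns or dropping them with a minus sign), I would like to be able to replace column values dynamically.

Is this possible with dplyr? I have provided sample data below for better understanding.

I want to do a VLOOKUP (as in Excel), matching city in table against lcity in lookup, and replace the value of city with newcity.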

> table <- data.frame(name = c("a","b","c","d","e","f"), city = c("hyd","sbad","hyd","sbad","others","unknown"), rno = c(101,102,103,104,105,106), stringsAsFactors = FALSE)
> lookup <- data.frame(lcity = c("hyd","sbad","others","test"), newcity = c("nhyd","nsbad","nothers","ntest"), rating = c(10,20,40,55), newrating = c(100,200,400,550), stringsAsFactors = FALSE)
> table
  name    city rno
1    a     hyd 101
2    b    sbad 102
3    c     hyd 103
4    d    sbad 104
5    e  others 105
6    f unknown 106
> lookup
   lcity newcity rating newrating
1    hyd    nhyd     10       100
2   sbad   nsbad     20       200
3 others nothers     40       400
4   test   ntest     55       550

My output table should be:

  name    city rno
1    a    nhyd 101
2    b   nsbad 102
3    c    nhyd 103
4    d   nsbad 104
5    e nothers 105
6    f    <NA> 106

I have tried the code below to update the values dynamically, but it produces another data frame/table rather than a character vector:

table$city <- select(left_join(table,lookup,by=c("city"="lcity")),"newcity")

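1 answer:

Answer 0: (score: 1)

One solution could be:

Note: For lookup, the data shown by the OP and the data created by the commands differ. I have used the data for lookup as shown in the OP's tabular format.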

library(dplyr)

# Data from OP
table <- data.frame(
  name = c("a", "b", "c", "d", "e", "f"),
  city = c("hyd", "sbad", "hyd", "sbad", "others", "unknown"),
  rno = c(101, 102, 103, 104, 105, 106),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  lcity = c("hyd", "sbad", "others", "test"),
  newcity = c("nhyd", "nsbad", "nothers", "ntest"),
  rating = c(10, 20, 40, 55),
  newrating = c(100, 200, 400, 550),
  stringsAsFactors = FALSE
)

# Join on city == lcity, overwrite city with newcity, keep the original columns.
# inner_join() keeps only rows of table whose city has a match in lookup.
table %>%
  inner_join(lookup, by = c("city" = "lcity")) %>%
  mutate(city = newcity) %>%
  select(name, city, rno)

  name    city rno
1    a    nhyd 101
2    b   nsbad 102
3    c    nhyd 103
4    d   nsbad 104
5    e nothers 105
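
The result above has five rows because inner_join() drops the "unknown" city, while the output requested in the question keeps row f with <NA>. A minimal sketch of a variant that keeps unmatched rows, assuming lcity is unique in lookup (duplicate keys would multiply rows). The names result and via_match are introduced here for illustration only.

```r
library(dplyr)

table <- data.frame(
  name = c("a", "b", "c", "d", "e", "f"),
  city = c("hyd", "sbad", "hyd", "sbad", "others", "unknown"),
  rno = c(101, 102, 103, 104, 105, 106),
  stringsAsFactors = FALSE
)
lookup <- data.frame(
  lcity = c("hyd", "sbad", "others", "test"),
  newcity = c("nhyd", "nsbad", "nothers", "ntest"),
  rating = c(10, 20, 40, 55),
  newrating = c(100, 200, 400, 550),
  stringsAsFactors = FALSE
)

# left_join() keeps every row of table; cities without a match get NA.
# Only the key and the replacement column are taken from lookup, so the
# other lookup columns never have to be dropped afterwards.
result <- table %>%
  left_join(select(lookup, lcity, newcity), by = c("city" = "lcity")) %>%
  mutate(city = newcity) %>%
  select(-newcity)

# Base R alternative: match() returns the position of each city in
# lookup$lcity (NA when absent), which then indexes lookup$newcity.
via_match <- lookup$newcity[match(table$city, lookup$lcity)]
```

Dropping only newcity with select(-newcity) avoids listing the 50+ columns of a wide main table by name. The match() form yields a plain character vector, so it can be assigned directly to table$city, which is what the one-liner in the question was attempting.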