STRING_ELT() can only be applied to a 'character vector', not a 'integer'

Time: 2018-12-20 12:28:59

Tags: r dplyr

I have two data frames, a and b, and want to compare certain columns with each other. Everything works fine until I get this error:

Error in mutate_impl(.data, dots) : 
Evaluation error: STRING_ELT() can only be applied to a 'character vector', not a 'integer'.

1 answer:

Answer 0 (score: 1)

It is hard to be sure without example data, but I can reproduce the error when the names column in b and/or a is a factor.
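To illustrate the cause, here is a minimal base R sketch. The data are made up (they reuse the column name `names` from the example in this answer), and `stringsAsFactors = TRUE` is set explicitly only to force the factor case. A factor is stored as integer codes, which is presumably what the string comparison trips over; converting the factor columns to character avoids that.

```r
# Hypothetical data; stringsAsFactors = TRUE forces the problematic factor column.
a <- data.frame(names = c("foo", "bar", "aargh"), stringsAsFactors = TRUE)

class(a$names)   # "factor"
typeof(a$names)  # "integer": a factor is stored as integer codes plus levels

# Convert every factor column to character before comparing strings.
is_fac <- vapply(a, is.factor, logical(1))
a[is_fac] <- lapply(a[is_fac], as.character)

typeof(a$names)  # "character"
```

Alternatively, creating or reading the data with `stringsAsFactors = FALSE` keeps the columns as character from the start.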

One solution is to use the stringdist function from the stringdist package:

library(dplyr)

a <- data.frame(names = c("foo", "bar", "aargh"), stringsAsFactors = FALSE)
b <- data.frame(wholename = c("foob", "baar", "flierp"), stringsAsFactors = FALSE)

# All combinations of a name in a with a name in b
lookup <- expand.grid(target = a$names, source = b$wholename, stringsAsFactors = FALSE)

# stringdist() returns a distance (0 = identical), so 1 - distance is used
# as the similarity score that which.max() selects on
y <- lookup %>%
  group_by(target) %>%
  mutate(match_score = 1 - stringdist::stringdist(target, source, method = "jw")) %>%
  summarise(match = match_score[which.max(match_score)],
            matched_to = source[which.max(match_score)]) %>%
  inner_join(b, by = c("matched_to" = "wholename"))
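For comparison, the same "pick the closest candidate" idea can be sketched in base R alone. This is not the method from the answer: it swaps Jaro-Winkler for the plain edit (Levenshtein) distance from `utils::adist`, so the matches it picks can differ. The names `d`, `best` and `result` are only illustrative.

```r
a <- data.frame(names = c("foo", "bar", "aargh"), stringsAsFactors = FALSE)
b <- data.frame(wholename = c("foob", "baar", "flierp"), stringsAsFactors = FALSE)

# Edit-distance matrix: rows correspond to a$names, columns to b$wholename.
d <- utils::adist(a$names, b$wholename)

# For each row, the column index of the smallest distance (the closest candidate).
best <- apply(d, 1, which.min)

result <- data.frame(
  target     = a$names,
  matched_to = b$wholename[best],
  distance   = d[cbind(seq_along(best), best)],
  stringsAsFactors = FALSE
)
```

Because this works on a distance, the best match is the minimum, not the maximum. It also needs character input, so the factor issue applies here in the same way.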

Another solution is to use the reclin package (I am the author):

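library(reclin)
library(dplyr)  # for %>%

# The compared column must have the same name in both data frames
names(b) <- "names"

pair_blocking(a, b) %>%                                                    # generate all candidate pairs
  compare_pairs(by = c("names"), default_comparator = jaro_winkler()) %>%  # score each pair
  select_n_to_m(weight = "names") %>%                                      # keep the best one-to-one matches
  link()                                                                   # build the linked data set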