I have two data frames, a and b, and want to compare certain columns against each other. Everything works fine until my code fails with this error:
Error in mutate_impl(.data, dots) :
Evaluation error: STRING_ELT() can only be applied to a 'character vector', not a 'integer'.
Answer 0 (score: 1)
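It is hard to be sure without example data, but I can reproduce the error when the names column in b and/or a is a factor rather than a character vector.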
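To illustrate that diagnosis, here is a minimal sketch of how a factor column can arise and how to convert it back to character before comparing. The data frame is made up for illustration, since the asker's real data was not shown; stringsAsFactors = TRUE is set explicitly to mimic the default of data.frame() in R versions before 4.0.

```r
# Hypothetical data; stringsAsFactors = TRUE turns the strings into a factor
a <- data.frame(names = c("foo", "bar", "aargh"), stringsAsFactors = TRUE)
is.factor(a$names)     # TRUE

# Convert every factor column to character, leaving other columns untouched
factor_cols <- vapply(a, is.factor, logical(1))
a[factor_cols] <- lapply(a[factor_cols], as.character)
is.character(a$names)  # TRUE
```

If the columns turn out to be character already, the error probably has a different cause.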
One solution is to use the stringdist function from the stringdist package:
library(dplyr)

# stringsAsFactors = FALSE keeps the name columns as character vectors
a <- data.frame(names = c("foo", "bar", "aargh"), stringsAsFactors = FALSE)
b <- data.frame(wholename = c("foob", "baar", "flierp"), stringsAsFactors = FALSE)
# All candidate pairs of names from a and b
lookup <- expand.grid(target = a$names, source = b$wholename, stringsAsFactors = FALSE)
y <- lookup %>%
  group_by(target) %>%
  # stringdist() returns a distance (0 = identical), so 1 - distance gives a similarity score
  mutate(match_score = 1 - stringdist::stringdist(target, source, method = "jw")) %>%
  summarise(match = match_score[which.max(match_score)],
            matched_to = source[which.max(match_score)]) %>%
  inner_join(b, by = c("matched_to" = "wholename"))
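One detail worth checking when adapting this: with method = "jw", stringdist() returns a Jaro-Winkler distance, so smaller values mean closer strings. The sketch below, which assumes the stringdist package is installed and uses made-up strings, compares the distance with the similarity returned by stringsim() from the same package.

```r
library(stringdist)

candidates <- c("foo", "foob", "flierp")

# Distance: 0 for identical strings, larger for less similar ones
d <- stringdist("foo", candidates, method = "jw")
d[1] == 0      # TRUE

# Similarity: 1 for identical strings, so which.max() finds the closest candidate
s <- stringsim("foo", candidates, method = "jw")
which.max(s)   # 1
which.min(d)   # 1
```

When ranking by a distance, pick the best match with which.min(); when ranking by a similarity, use which.max().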
Another solution is to use the reclin package (I am the author):
library(reclin)
names(b) <- "names"
pair_blocking(a, b) %>%
compare_pairs(by = c("names"), default_comparator = jaro_winkler()) %>%
select_n_to_m(weight = "names") %>%
link()