Question

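I understand that "Error: $ operator is invalid for atomic vectors" means $ was applied to the wrong kind of object: an atomic vector rather than a recursive object.

However, I get this error when running a code block that works for other people. I'm trying to find out what I'm doing wrong, or an alternative solution.

I have two data frames: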

> str(DF1)
'data.frame':   5977 obs. of  1 variable:
 $ coolThings: chr  "Surfing" "wearing sunglasses" "being an honest person" ...

> str(DF2)
'data.frame':   2999 obs. of  1 variable:
 $ coolThings: chr  "Surf" "Sun glasses" "being honest" ...

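My goal is to create a third data frame that pairs each string in DF1 with its most similar string in DF2. To do this, I'm following this article on fuzzy matching.

The first part of the code works fine for me: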

# It creates a matrix with the Standard Levenshtein distance between the name fields of both sources
dist.coolthings <- adist(DF1$coolThings, DF2$coolThings, partial = TRUE, ignore.case = TRUE)

# We now take the pairs with the minimum distance
min.coolthings <- apply(dist.coolthings , 1, min)

However, when I get to the for loop...

match.c1.c2 <- NULL
for(i in 1:nrow(dist.coolthings))
{
    c2.i <- match(min.coolthings[i],dist.coolthings[i,])
    c1.i <- i
    match.c1.c2<-rbind(data.frame(c2.i=c2.i,c1.i=c1.i,c2coolthings =DF2[c2.i,]$coolThings, c1coolthings=DF1[c1.i,]$coolThings, adist=min.coolthings[i]), match.c1.c2)
}

...I get the error mentioned above:

Error: $ operator is invalid for atomic vectors

DF1[c1.i, ] and DF2[c2.i, ] are indeed atomic:

> is.atomic(DF1[c1.i,])
[1] TRUE

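So it makes sense that I get this error, but... how do I avoid it?

I'm working with someone else's code and I'm not familiar with some of the expressions it uses, so perhaps someone more experienced can help me.

Thanks in advance,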

纪莲

Answer 1

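You're running into this problem because DF1 and DF2 are both single-column data.frames. When you subset rows of a single-column data.frame, the result loses the data.frame class and is returned as a plain vector, so $ no longer works on it. To fix this, you can do either of the following:

Select the specific column you want in the subset. For the DF2 part of the code, that would be DF2[c2.i, "coolThings"].

or

Use drop = FALSE to make sure the subset is not reduced to a vector. For DF2, this would look like DF2[c2.i, , drop = FALSE]$coolThings.

I personally prefer method 1, but both should work.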
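
As a minimal sketch of the two options, the snippet below rebuilds small stand-ins for DF1 and DF2 (the three values in each are taken from the str() output in the question and are for illustration only, not the real data) and then reruns the loop from the question with method 1 applied. The only other change is seq_len(nrow(...)) in place of 1:nrow(...), a defensive habit rather than part of the fix.

```r
# Toy stand-ins for the asker's data frames (illustrative values only)
DF1 <- data.frame(coolThings = c("Surfing", "wearing sunglasses", "being an honest person"),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(coolThings = c("Surf", "Sun glasses", "being honest"),
                  stringsAsFactors = FALSE)

# Row-subsetting a one-column data.frame drops to a plain character vector
is.atomic(DF1[1, ])                    # TRUE
is.data.frame(DF1[1, , drop = FALSE])  # TRUE

# Method 1: name the column inside the subset
opt1 <- DF2[2, "coolThings"]
# Method 2: keep the data.frame with drop = FALSE, then use $
opt2 <- DF2[2, , drop = FALSE]$coolThings

# Distance matrix and per-row minimum, as in the question
dist.coolthings <- adist(DF1$coolThings, DF2$coolThings, partial = TRUE, ignore.case = TRUE)
min.coolthings <- apply(dist.coolthings, 1, min)

# The loop from the question, using method 1 instead of [i, ]$coolThings
match.c1.c2 <- NULL
for (i in seq_len(nrow(dist.coolthings))) {
    c2.i <- match(min.coolthings[i], dist.coolthings[i, ])
    c1.i <- i
    match.c1.c2 <- rbind(
        data.frame(c2.i = c2.i, c1.i = c1.i,
                   c2coolthings = DF2[c2.i, "coolThings"],
                   c1coolthings = DF1[c1.i, "coolThings"],
                   adist = min.coolthings[i]),
        match.c1.c2)
}
```

Both methods return the same character value, so the choice is mostly about readability. Because each new row is bound on top of the previous result, match.c1.c2 ends up in reverse row order, as in the original code.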

Fuzzy matching: avoiding "Error: $ operator is invalid for atomic vectors"
