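I am new to R. In my dataset I have a variable named Reason. I want to create a new column named Price if any of the following conditions are met:
I found the following user-defined function to get the distance between two words: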
distance <- function(string, term1, term2) {
  words <- strsplit(string, "\\s")[[1]]
  indices <- 1:length(words)
  names(indices) <- words
  abs(indices[term1] - indices[term2])
}
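But I don't know how to apply it to the whole column to get the expected result. I tried the following code, but it only gives me "logical(0)" as the result.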
for (j in seq(Survey$Reason)) {
  Survey$Price[[j]] <- distance(Survey$Reason[[j]], " price ", " high ") <= 6
}
Any help is greatly appreciated. Thanks.
Answer 0 (score: 2)
Starting from your sample data:
survey <- structure(list(Reason = c(
  "Their price are extremely high.",
  "Because my price was increased so much, I wouldn't want anyone else to have to deal with that.",
  "Just because the intial workings were fine, but after we realised it would affect our contract, it left a sour taste in our mouth.",
  "Problems with the repair",
  "They did not handle my complaint as well I would have liked.",
  "Bad service overall."
)), .Names = "Reason", row.names = c(NA, 6L), class = "data.frame")
First, I updated your function to strip punctuation and to return the result of your position test directly:
distanceOK <- function(string, term1, term2, n = 6) {
  # Drop punctuation so that "high." matches "high", then split on whitespace
  words <- strsplit(gsub("[[:punct:]]", "", string), "\\s")[[1]]
  indices <- seq_along(words)
  names(indices) <- words
  dist <- abs(indices[term1] - indices[term2])
  # 1 if both terms occur within n words of each other, otherwise 0
  ifelse(is.na(dist) | dist > n, 0, 1)
}
Then we apply it to every row of the column:
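A minimal sketch of the application step, assuming the survey data frame and the distanceOK function from this answer (both restated here so the block runs on its own). Using vapply with unname is my own choice for the row-wise call, not something the post confirms; sapply would work along the same lines.

```r
# Sample data from the answer
survey <- data.frame(
  Reason = c(
    "Their price are extremely high.",
    "Because my price was increased so much, I wouldn't want anyone else to have to deal with that.",
    "Just because the intial workings were fine, but after we realised it would affect our contract, it left a sour taste in our mouth.",
    "Problems with the repair",
    "They did not handle my complaint as well I would have liked.",
    "Bad service overall."
  ),
  stringsAsFactors = FALSE
)

# Proximity test from the answer: 1 if term1 and term2 are at most n words apart
distanceOK <- function(string, term1, term2, n = 6) {
  words <- strsplit(gsub("[[:punct:]]", "", string), "\\s")[[1]]
  indices <- seq_along(words)
  names(indices) <- words
  dist <- abs(indices[term1] - indices[term2])
  ifelse(is.na(dist) | dist > n, 0, 1)
}

# Call the function once per row; the terms are passed without surrounding
# spaces because the words are matched after splitting on whitespace.
# (assumption: vapply + unname is one possible way to do this)
survey$Price <- unname(vapply(
  survey$Reason, distanceOK, numeric(1),
  term1 = "price", term2 = "high",
  USE.NAMES = FALSE
))

print(survey$Price)
# 1 0 0 0 0 0
```

Only the first row contains both "price" and "high" within six words of each other, so it is the only one flagged. Note that the match is case-sensitive and uses the first occurrence of each word.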