Question

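I am trying to repeat two loops until a condition is met.

The first loop removes the first instance of a keyword (pattern) from a string. The second loop counts how many instances of the keyword remain in the same string.

The data is in a data frame with 3 columns: the string, the keyword, and the number of times the keyword occurs in the string (NoKW).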

string <- c (" temple san temple lush ", " mohito sudoku war ", " martyr  martyr metal martyr", " jump statement statement ", " window capsule turn ")
keyword <- c (" temple ", " sudoku ", " martyr " , " statement ", " capsule ")
NoKW <- c(2,1,3,2,1)
data <- data.frame (string, keyword, NoKW)
data$string <- as.character(data$string)
data$keyword <- as.character(data$keyword)

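The idea is to remove instances of the keyword one at a time until only one instance of the keyword is left in the corresponding string.

I tried using repeat, as follows.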

repeat
{
M <- nrow(data);
for (j in 1:M){
if(1 < data[j,3]) data[j,1] <- str_replace(data[j,1], data[j,2], " ")
};
for (i in 1:M){
data[i,3] <- sum(str_count(data[i,1], data[i,2]))
};
max <- as.numeric(max(data$NoKW));
if (max = 1)
break;
}

But it produces the following errors:

Error: unexpected '=' in:
"    };
if (max ="
>         break;
Error: no loop for break/next, jumping to top level
> }
Error: unexpected '}' in "}"
>

I am new to loops in R, so could you tell me where I went wrong?

Answer 1

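"The idea is to remove instances of the keyword one at a time until only one instance of the keyword is left in the corresponding string."

You don't need a for loop: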

#split your strings by space
substrings <- strsplit(string, " ", fixed=TRUE)

#remove spaces from keywords
keyword_clean <- gsub("\\s", "", keyword)

#loop over the strings
sapply(substrings, function(s) {
  #which words are keywords that occur again later in the string
  dup <- which(duplicated(s, fromLast=TRUE) & s %in% keyword_clean)
  #remove the duplicated keywords, keeping only the last instance
  if (length(dup) > 0) s <- s[-dup]
  #paste words together
  paste(s, collapse=" ")
  })

#[1] " san temple lush"     " mohito sudoku war"   "  metal martyr"       " jump statement"      " window capsule turn"
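
As for the error itself: inside if(), a single = is assignment syntax, which R rejects there, hence "unexpected '='". The comparison operator is ==. Once the parser fails on that line, break and the closing brace are read outside any loop, which explains the two follow-up errors. Below is a minimal sketch of the loop with that fixed. It makes a few assumptions that are not in the question: base R's sub() and gregexpr() stand in for stringr's str_replace() and str_count() so that it runs without extra packages, count_kw is a hypothetical helper name, the two for loops are merged into one, and the stop condition is written as <= 1 so that a count of 0 cannot keep the loop running.

```r
# Rebuild the example data from the question.
string <- c(" temple san temple lush ", " mohito sudoku war ",
            " martyr  martyr metal martyr", " jump statement statement ",
            " window capsule turn ")
keyword <- c(" temple ", " sudoku ", " martyr ", " statement ", " capsule ")
data <- data.frame(string, keyword, NoKW = c(2, 1, 3, 2, 1),
                   stringsAsFactors = FALSE)

# Hypothetical helper: count non-overlapping literal matches of kw in s.
count_kw <- function(s, kw) {
  lengths(regmatches(s, gregexpr(kw, s, fixed = TRUE)))
}

repeat {
  for (j in seq_len(nrow(data))) {
    # Remove the first remaining instance while more than one is left.
    if (data$NoKW[j] > 1) {
      data$string[j] <- sub(data$keyword[j], " ", data$string[j], fixed = TRUE)
    }
    # Recount after the removal.
    data$NoKW[j] <- count_kw(data$string[j], data$keyword[j])
  }
  # Use a comparison (== or <=) here; a single = is a syntax error in if().
  if (max(data$NoKW) <= 1) break
}

print(data$string)
```

Note that the keywords carry leading and trailing spaces, so an instance at the very end of a string without a trailing space (such as the last "martyr" in the third string) is not counted or removed by this approach.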
