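I want to create new columns based on string matching. I can create them, but my code also creates columns for terms that match nothing. For example: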
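x <- data.frame(name = c("Java Hackathon", "Intro to Graphs", "Hands on Cypher"))
toMatch <- c("Hackathon", "Hands on", "Test", "java")
## Sentence with phrases
phrases11 <- as.vector(toMatch)
# One logical column per phrase: TRUE where the phrase occurs in the name
res <- sapply(phrases11, grepl, x = as.character(x$name), ignore.case = TRUE)
rownames(res) <- x$name
# Replacement: overwrite each TRUE cell with the phrase that matched
# (this coerces the matrix to character, so FALSE becomes "FALSE")
ones <- which(res == 1, arr.ind = TRUE)
res[ones] <- colnames(res)[ones[, 2]]
res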
Output:
                Hackathon   Hands on   Test    java
Java Hackathon  "Hackathon" "FALSE"    "FALSE" "java"
Intro to Graphs "FALSE"     "FALSE"    "FALSE" "FALSE"
Hands on Cypher "FALSE"     "Hands on" "FALSE" "FALSE"
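I don't want the "Test" column to be created, because I have a large amount of data to match. So basically, can the line res <- sapply(phrases11, grepl, x = as.character(x$name), ignore.case = TRUE) be changed so that it creates columns only for the terms in the 'toMatch' vector that actually matched? Or is there another way to do this?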
Answer 0 (score: 0)
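Since the grepl() function you are using returns TRUE or FALSE, you can sum each column and drop the columns whose sum is 0, i.e. the terms that matched no row:
A <- sapply(toMatch, grepl, as.character(x$name), ignore.case = TRUE)
# Keep every column with at least one match; drop = FALSE keeps the result a matrix
A[, colSums(A) > 0, drop = FALSE]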
     Hackathon Hands on  java
[1,]      TRUE    FALSE  TRUE
[2,]     FALSE    FALSE FALSE
[3,]     FALSE     TRUE FALSE
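
To get the labelled matrix the question asks for, the column filter can be applied before the replacement step. The following is only a sketch under a few assumptions: the names hits, labels and idx are made up for illustration, non-matching cells are filled with NA rather than the string "FALSE" (a design choice, not something the question requires), and x has more than one row so that sapply() returns a matrix.

```r
# Sample data and search terms from the question.
x <- data.frame(name = c("Java Hackathon", "Intro to Graphs", "Hands on Cypher"),
                stringsAsFactors = FALSE)
toMatch <- c("Hackathon", "Hands on", "Test", "java")

# One logical column per term: TRUE where the term occurs in the name.
hits <- sapply(toMatch, grepl, x = x$name, ignore.case = TRUE)
rownames(hits) <- x$name

# Keep only the terms that matched at least one row.
hits <- hits[, colSums(hits) > 0, drop = FALSE]

# Build a character matrix of the same shape, NA everywhere to start
# (assumption: NA marks "no match" instead of the string "FALSE").
labels <- matrix(NA_character_, nrow(hits), ncol(hits), dimnames = dimnames(hits))

# Fill each matching cell with the name of the term that matched.
idx <- which(hits, arr.ind = TRUE)
labels[idx] <- colnames(hits)[idx[, "col"]]
labels
```

With the sample data, labels has the columns Hackathon, Hands on and java; no Test column is created because that term matches no row.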