Question

I have a character vector like this:

sampleData <- c("This is what I see i.r.o what not i.r.o",
                "Similar here a.s. also this a.s.",
                "One more i.r.o. now another i.r.o.")

I want to delete everything after the first occurrence of i.r.o or i.r.o., and the same should apply to a.s or a.s.

So the final result should look like this:

1 This is what I see i.r.o 
2 Similar here a.s. 
3 One more i.r.o.

Edit: I used gsub() to fix the spacing inside i.r.o and a.s, so the expressions are now written the same way in every string. See the example above.

Answer 1

I'm a little confused, because the comments above suggest you already got an answer, but I don't see one.

This seems to work:

sampleData <- c("This is what I see i.r.o what not i.r.o",
                "Similar here a.s. also this a.s.",
                "One more i.r.o. now another i.r.o.")
gsub("(([[:alpha:]]\\.)+[[:alpha:]][.]?) .*$","\\1",sampleData)
## [1] "This is what I see i.r.o" "Similar here a.s."       
## [3] "One more i.r.o."

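The regular expression reads: "(one or more occurrences of (a letter followed by a dot), then one more letter optionally followed by a dot), followed by a space, then zero or more of any character up to the end of the line". The replacement keeps only what the outer set of parentheses matched and drops everything else.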
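To make the pieces of that pattern easier to see, here is a minimal sketch that builds the same regular expression in two steps. The variable names abbrev, pattern and trimmed are only illustrative, and using sub() in place of gsub() is an assumption on my part: because the pattern runs to the end of the string, there can be at most one match per element, so the two should behave the same here.

```r
sampleData <- c("This is what I see i.r.o what not i.r.o",
                "Similar here a.s. also this a.s.",
                "One more i.r.o. now another i.r.o.")

# Dotted abbreviation: one or more "letter + dot" pairs, then a final
# letter with an optional trailing dot (covers i.r.o, i.r.o., a.s, a.s.)
abbrev <- "([[:alpha:]]\\.)+[[:alpha:]][.]?"

# Capture the abbreviation, then consume a space and the rest of the string
pattern <- paste0("(", abbrev, ") .*$")

# Keep only the captured abbreviation and whatever came before it
trimmed <- sub(pattern, "\\1", sampleData)
print(trimmed)
```

Note that the pattern requires a space after the abbreviation, so a string that already ends with the abbreviation (for example "Ends with a.s.") is returned unchanged.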

Shortening character strings in R
