我有以下文本文件
[01/29/14 16:42:55, 10.100.120.120, unknown]: spatial_monitor: Alan entered Conference Room (Zone Role contains Person role)
[01/29/14 16:42:57, 10.100.120.120, unknown]: spatial_monitor: Alan left Conference Room (Zone Role contains Person role)
[01/29/14 16:43:00, 10.100.120.120, unknown]: spatial_monitor: Kurt entered Conference Room (Computer desk contains Person role)
[01/29/14 16:43:02, 10.100.120.120, unknown]: spatial_monitor: Kurt left Conference Room (Computer desk contains Person role)
[01/29/14 16:43:03, 10.100.120.120, unknown]: spatial_monitor: Alan entered Conference Room (Zone Role contains Person role)
[01/29/14 16:43:08, 10.100.120.120, unknown]: spatial_monitor: Alan left Conference Room (Zone Role contains Person role)
[01/29/14 16:46:07, 10.100.120.120, unknown]: spatial_monitor: Fred entered Conference Room (Zone Role contains Person role)
[01/29/14 16:46:08, 10.100.120.120, unknown]: spatial_monitor: Fred left Conference Room (Zone Role contains Person role)
我正在尝试在R中使用str_extract(在库stringr中)来提取位置的名称(上例中的“会议室”)。逻辑是拉出“输入”或“左”之后的字符串部分。为此,我有以下正则表达式
(?<=entered\s)[A-Z][a-z]+\s[A-Z][a-z]+
这在Notepad ++中工作正常,但是当我将其嵌入R中时,我收到以下错误
> tt <- "[01/29/14 16:42:55, 10.100.120.120, unknown]: spatial_monitor: Alan entered Conference Room (Zone Role contains Person role)"
> str_extract(tt, '(?<=entered\\s)[A-Z][a-z]+\\s[A-Z][a-z]+')
Error in regexpr("(?<=entered\\s)[A-Z][a-z]+\\s[A-Z][a-z]+", "[01/29/14 16:42:55, 10.100.120.120, unknown]: spatial_monitor: Alan entered Conference Room (Zone Role contains Person role)", :
invalid regular expression '(?<=entered\s)[A-Z][a-z]+\s[A-Z][a-z]+', reason 'Invalid regexp'
其他答案告诉我lookahead and lookbehind only work with Perl。那么问题是如何使用str_extract启用Perl?或者有更好的方法吗?提前谢谢。
答案 0 :(得分:2)
您的正则表达式 有效。如果您指定sub
,它适用于perl = TRUE
。您还可以使用sub
功能完成任务:
sub('.*(?<=entered\\s)([A-Z][a-z]+\\s[A-Z][a-z]+).*', '\\1', tt, perl = TRUE)
# [1] "Conference Room"
或者,没有perl
:
sub('.*entered\\s([A-Z][a-z]+\\s[A-Z][a-z]+).*', '\\1', tt)
# [1] "Conference Room"
答案 1 :(得分:1)
library(stringr)
tt <- "[01/29/14 16:42:55, 10.100.120.120, unknown]: spatial_monitor: Alan entered Conference Room (Zone Role contains Person role)"
str_extract(tt, perl('(?<=entered\\s)[A-Z][a-z]+\\s[A-Z][a-z]+'))
# [1] "Conference Room"