I want to extract blood pressure readings from strings. The data may look like this:
text <- c("at 10.00 seated 132/69", "99/49", "176/109",
"10.12 I 128/51, II 149/51 h.9.16", "153/82 p.90 ja 154/81 p.86",
"h:17.45", "not measured", "time 7.30 RR 202/97 p. 69")
I want to extract every occurrence of the pattern "digits/digits" (e.g. "132/69"). For the example above, the expected output is a list:
[[1]]
[1] "132/69"
[[2]]
[1] "99/49"
[[3]]
[1] "176/109"
[[4]]
[1] "128/51" "149/51"
[[5]]
[1] "153/82" "154/81"
[[6]]
[1] ""
[[7]]
[1] ""
[[8]]
[1] "202/97"
The closest I have come to a solution:
gsub( "^.*([0-9]{3}/[0-9]+).*","\\1", text)
Unfortunately, my solution does not return all matches of the pattern, and it also returns strings that do not contain the pattern at all.
Answer 0: (score: 3)
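A small sketch of why the attempt behaves this way, using three of the strings above (the names `sample_text` and `attempt` are only for illustration). `gsub` returns any string the pattern does not match unchanged, and because the leading `.*` is greedy, only the last reading in a string survives the replacement.

```r
# Three of the example strings: one reading, no reading, two readings.
sample_text <- c("at 10.00 seated 132/69", "not measured",
                 "153/82 p.90 ja 154/81 p.86")

# The pattern from the question: the whole string is replaced by group 1.
attempt <- gsub("^.*([0-9]{3}/[0-9]+).*", "\\1", sample_text)

# "not measured" has no match, so gsub leaves it untouched.
# In the third string the greedy ".*" swallows "153/82",
# so only the last reading is kept.
print(attempt)
```

Note also that `[0-9]{3}` requires three digits before the slash, so a reading such as "99/49" never matches; it only appears in the result because unmatched strings are passed through as-is.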
regmatches(text, gregexpr("\\d+/\\d+", text))
#[[1]]
#[1] "132/69"
#
#[[2]]
#[1] "99/49"
#
#[[3]]
#[1] "176/109"
#
#[[4]]
#[1] "128/51" "149/51"
#
#[[5]]
#[1] "153/82" "154/81"
#
#[[6]]
#character(0)
#
#[[7]]
#character(0)
#
#[[8]]
#[1] "202/97"
Answer 1: (score: 1)
If you want exactly the output described, you can use
library(stringr)
library(magrittr)
text <- c("at 10.00 seated 132/69", "99/49", "176/109",
"10.12 I 128/51, II 149/51 h.9.16", "153/82 p.90 ja 154/81 p.86",
"h:17.45", "not measured", "time 7.30 RR 202/97 p. 69")
str_extract_all(text, "\\d{2,3}/\\d{1,3}") %>%
lapply(FUN = function(x) if (length(x) == 0) "" else x)
[[1]]
[1] "132/69"
[[2]]
[1] "99/49"
[[3]]
[1] "176/109"
[[4]]
[1] "128/51" "149/51"
[[5]]
[1] "153/82" "154/81"
[[6]]
[1] ""
[[7]]
[1] ""
[[8]]
[1] "202/97"
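If you want to stay in base R, you can also use Roland's regmatches approach.
Answer 2: (score: 1)
A slight variation on @Benjamin's solution that is more compact, returns a nice, simple character vector, and avoids having to deal with the zero-length elements in @Roland's list: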
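For reference, a base R sketch that combines the regmatches call with the same empty-string replacement might look like this. The variable names `matches` and `result` are illustrative, and the `\\d{2,3}/\\d{1,3}` pattern is carried over from the stringr version above as an assumption about what counts as a valid reading.

```r
text <- c("at 10.00 seated 132/69", "99/49", "176/109",
          "10.12 I 128/51, II 149/51 h.9.16", "153/82 p.90 ja 154/81 p.86",
          "h:17.45", "not measured", "time 7.30 RR 202/97 p. 69")

# gregexpr finds every match per string; regmatches extracts them as a list.
matches <- regmatches(text, gregexpr("\\d{2,3}/\\d{1,3}", text))

# Strings without a reading yield character(0); replace those with "".
result <- lapply(matches, function(x) if (length(x) == 0) "" else x)

print(result)
```

This keeps one list element per input string, so each reading can still be traced back to the line it came from.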
library(stringi)
library(purrr)
txt <- c("at 10.00 seated 132/69", "99/49", "176/109",
"10.12 I 128/51, II 149/51 h.9.16", "153/82 p.90 ja 154/81 p.86",
"h:17.45", "not measured", "time 7.30 RR 202/97 p. 69")
stri_match_all_regex(txt, "\\d{2,3}/\\d{1,3}") %>%
flatten_chr() %>%
discard(is.na)
## [1] "132/69" "99/49" "176/109" "128/51" "149/51" "153/82" "154/81" "202/97"
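If a flat vector is all that is needed, a similar result can probably be had in base R as well. This is a sketch rather than a drop-in replacement for the stringi/purrr pipeline; `readings` is an illustrative name.

```r
txt <- c("at 10.00 seated 132/69", "99/49", "176/109",
         "10.12 I 128/51, II 149/51 h.9.16", "153/82 p.90 ja 154/81 p.86",
         "h:17.45", "not measured", "time 7.30 RR 202/97 p. 69")

# unlist() drops the zero-length elements, so no NA filtering is needed.
readings <- unlist(regmatches(txt, gregexpr("\\d{2,3}/\\d{1,3}", txt)))

print(readings)
```

Keep in mind that any flattened result loses the link between a reading and the input string it came from, which matters if the readings have to be joined back to the original rows.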