Extracting different strings after a match in R

Date: 2019-02-24 09:26:09

Tags: r string

library(stringr)

data <- c("Demand =  001   979", "Demand =  -08   976 (154)", "Demand =  -01   975 (359)")
data <- str_match(data, pattern = "Demand = (.*) (.*)")

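I need to use str_match to extract the first two groups of numbers (including the minus sign) into columns.
The third group of numbers, the one in parentheses (), should be excluded.
Any help is welcome.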

Desired output

## [1] "001" "-08" "-01"
## [2] "979" "976" "975"

2 answers:

Answer 0 (score: 1)

How about removing everything else?

data <- c("Demand = 001 979", "Demand = -08 976 (154)", "Demand = -01 975 (359)")
data <- gsub("Demand = ", "", x = data)
data <- trimws(gsub("\\(.*\\)", "", x = data))
# Split each cleaned string into its space-separated pieces
data <- strsplit(data, " ")

out <- list()

out[[1]] <- sapply(data, "[", 1)
out[[2]] <- sapply(data, "[", 2)
out

[[1]]
[1] "001" "-08" "-01"

[[2]]
[1] "979" "976" "975"
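
Since the question asks for the numbers in columns, the same clean-then-split idea can be taken one step further. The following is only a base R sketch, assuming the irregular spacing shown in the question and that every string has exactly two numbers before the optional bracketed group. The column names `first` and `second` are made up for illustration.

```r
# Input as given in the question, with irregular spacing
data <- c("Demand =  001   979", "Demand =  -08   976 (154)", "Demand =  -01   975 (359)")

# Drop the "Demand =" prefix and any trailing parenthesised group
cleaned <- sub("^Demand =\\s*", "", data)
cleaned <- trimws(sub("\\(.*\\)", "", cleaned))

# Split on runs of whitespace and bind the pieces into one row per input string
parts <- do.call(rbind, strsplit(cleaned, "\\s+"))
colnames(parts) <- c("first", "second")  # hypothetical column names
parts
```

Splitting on `\\s+` rather than a single space is what makes this tolerate the repeated spaces in the question's data.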

Answer 1 (score: 0)

One possibility is str_extract_all() from stringr (here x is the character vector from the question):

sapply(str_extract_all(x, "-?[0-9]+"), function(x) x[1])

[1] "001" "-08" "-01"

sapply(str_extract_all(x, "-?[0-9]+"), function(x) x[2])

[1] "979" "976" "975"
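
Both columns can also be obtained in a single call. This is a sketch rather than part of the original answer: it relies on the `simplify = TRUE` argument of str_extract_all() and assumes the two wanted numbers are always the first two matches in each string.

```r
library(stringr)

x <- c("Demand =  001   979", "Demand =  -08   976 (154)", "Demand =  -01   975 (359)")

# simplify = TRUE returns a character matrix with one row per input string,
# padded with "" where a string has fewer matches than the longest one
m <- str_extract_all(x, "-?[0-9]+", simplify = TRUE)

# Keep the first two matches; the third column holds the bracketed numbers
first_two <- m[, 1:2]
first_two
```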

Or, using @RomanLuštrik's idea with strsplit():

sapply(strsplit(gsub("Demand = ", "", x), " "), function(x) x[1])

[1] "001" "-08" "-01"
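
Returning to the original question, which asked specifically for str_match: a minimal sketch of that route is to make the capture groups match only digits with an optional minus sign, so the bracketed number is never captured. It assumes that only the first number can be negative and that the fields are separated by one or more whitespace characters.

```r
library(stringr)

data <- c("Demand =  001   979", "Demand =  -08   976 (154)", "Demand =  -01   975 (359)")

# \\s+ tolerates the irregular spacing; -? allows an optional minus sign.
# Nothing after the second number is captured, so "(154)" is ignored.
m <- str_match(data, "Demand =\\s+(-?\\d+)\\s+(\\d+)")

# Column 1 of the result is the full match; columns 2 and 3 are the capture groups
result <- m[, 2:3]
result
```

The asker's pattern `(.*) (.*)` is greedy and matches any character, which is why it picks up the spaces and the bracketed group. Restricting each group to digits avoids that.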