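Suppose I have a string:
"Region/Country/Industry/Product"
I want to extract only the characters between the nth and mth forward slash. Is there a one-liner built from existing functions that does this?
For example, to get the string between the second and third slash of each entry in the following character vector:
c("EMEA/Germany/Automotive/Mercedes", "APAC/SouthKorea/Technology/Samsung",
"AMER/US/Wireless/Verizon")
the output of such a function would be:
c("Automotive", "Technology", "Wireless")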
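Answer 0 (score: 4)
We can use sub to capture the word just before the last / and, in the replacement, refer to that capture group with a backreference (\\1):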
sub(".*[/](\\w+)[/]\\w+$", "\\1", str1)
#[1] "Automotive" "Technology" "Wireless"
Another variation, which counts segments from the start of the string, is
sub("^([^/]+[/]){2}([^/]+).*", "\\2", str1)
#[1] "Automotive" "Technology" "Wireless"
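The second pattern generalises to any position if the regex is built with sprintf. The following is only a sketch: the helper name nth_field is hypothetical (it is not part of the answer), and it assumes every string contains more than n slashes, because sub returns its input unchanged when the pattern does not match.

```r
# Hypothetical helper: return the segment that follows the n-th slash,
# i.e. the text between slash n and slash n + 1 (or the end of the string).
nth_field <- function(x, n) {
  # Skip n "segment + slash" groups, capture the next segment, drop the rest.
  pattern <- sprintf("^(?:[^/]*/){%d}([^/]*).*$", n)
  sub(pattern, "\\1", x, perl = TRUE)
}

str1 <- c("EMEA/Germany/Automotive/Mercedes",
          "APAC/SouthKorea/Technology/Samsung",
          "AMER/US/Wireless/Verizon")

nth_field(str1, 2)
# "Automotive" "Technology" "Wireless"
```

perl = TRUE is used here so that the non-capturing group (?:...) is handled by PCRE.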
Or split the string at the delimiter / and extract the third element
sapply(strsplit(str1, "/"), `[`, 3)
#[1] "Automotive" "Technology" "Wireless"
Data
str1 <- c("EMEA/Germany/Automotive/Mercedes",
"APAC/SouthKorea/Technology/Samsung", "AMER/US/Wireless/Verizon")
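The question asks for the text between the nth and mth slash in general, and the strsplit approach extends to that case quite naturally. Below is a minimal sketch, assuming a hypothetical helper called between_slashes (the name and the NA behaviour are illustrative choices, not taken from the answer). It treats the end of the string as a closing delimiter and returns NA when a string has fewer than m segments.

```r
# Hypothetical helper: text between the n-th and m-th separator (n < m).
between_slashes <- function(x, n, m, sep = "/") {
  stopifnot(n >= 0, m > n)
  parts <- strsplit(x, sep, fixed = TRUE)
  vapply(parts, function(p) {
    # Segments n + 1 through m lie between the n-th and m-th separator.
    if (length(p) < m) return(NA_character_)
    paste(p[(n + 1):m], collapse = sep)
  }, character(1))
}

str1 <- c("EMEA/Germany/Automotive/Mercedes",
          "APAC/SouthKorea/Technology/Samsung",
          "AMER/US/Wireless/Verizon")

between_slashes(str1, 2, 3)
# "Automotive" "Technology" "Wireless"

between_slashes(str1, 1, 3)
# "Germany/Automotive" "SouthKorea/Technology" "US/Wireless"
```

vapply is used instead of sapply so the result is always a character vector, even for empty input.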
Answer 1 (score: 2)
And of course a stringr solution:
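library(stringr)
# x is the character vector from the question
x <- c("EMEA/Germany/Automotive/Mercedes", "APAC/SouthKorea/Technology/Samsung", "AMER/US/Wireless/Verizon")
word(x, 3, sep = '/')
#[1] "Automotive" "Technology" "Wireless"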
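word also takes an end argument, so the same call can cover a range of segments, which matches the "between the nth and mth slash" wording of the question. A short sketch, assuming stringr is installed; the vector x is repeated here so the snippet runs on its own.

```r
library(stringr)

x <- c("EMEA/Germany/Automotive/Mercedes",
       "APAC/SouthKorea/Technology/Samsung",
       "AMER/US/Wireless/Verizon")

# Segments 2 through 3, i.e. the text between the 1st and 3rd slash.
word(x, start = 2, end = 3, sep = fixed("/"))
# "Germany/Automotive" "SouthKorea/Technology" "US/Wireless"

# Negative positions count from the end: -1 is the last segment.
word(x, start = -1, sep = fixed("/"))
# "Mercedes" "Samsung" "Verizon"
```

Wrapping the separator in fixed() makes it a literal match rather than a regular expression, which matters for separators such as "." or "|".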
Answer 2 (score: 1)
You can also use strsplit, as below, and customise the position:
x <- c("EMEA/Germany/Automotive/Mercedes", "APAC/SouthKorea/Technology/Samsung", "AMER/US/Wireless/Verizon")
sapply(x, FUN = function(x) {
  y <- unlist(strsplit(x, split = "/"))
  y[3] # Change this index to pick the word at a different position
})
# "Automotive" "Technology" "Wireless" (returned as a named vector; the names are the input strings)
Answer 3 (score: 0)
You can also remove the parts you don't need:
strings <- c("EMEA/Germany/Automotive/Mercedes", "APAC/SouthKorea/Technology/Samsung","AMER/US/Wireless/Verizon")
gsub("^([^/]*/){2}|/[^/]*$","",strings)
#[1] "Automotive" "Technology" "Wireless"
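One thing worth checking with this approach is that it removes the first two segments and the last one, so it only isolates the third segment when there are exactly four. The sketch below uses a made-up five-segment string (the trailing "SClass" is hypothetical data, not from the question) to show the difference from splitting and indexing.

```r
# Hypothetical five-segment input, for illustration only.
five <- "EMEA/Germany/Automotive/Mercedes/SClass"

# Strips the first two segments and the last one, so two segments remain.
removed <- gsub("^([^/]*/){2}|/[^/]*$", "", five)
removed
# "Automotive/Mercedes"

# Splitting and indexing still returns only the third segment.
third <- sapply(strsplit(five, "/"), `[`, 3)
third
# "Automotive"
```

If the number of segments can vary, the split-and-index form is the safer choice for "between the 2nd and 3rd slash".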