R: extract the string between the nth and ith instances of a delimiter

Time: 2017-07-05 19:41:45

Tags: r

I have a vector of strings, similar to this one but with many more elements:

s <- c("CGA-DV-558_T_90.67.0_DV_1541_07", "TC-V-576_T_90.0_DV_151_0", "TCA-DV-X_T_6.0_D_A2_07", "T-V-Z_T_2_D_A_0", "CGA-DV-AW0_T.1_24.4.0_V_A6_7", "ACGA-DV-A4W0_T_274.46.0_DV_A266_07")

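I would like to use a function to extract the string between the nth and ith instances of the delimiter "_". For example, the string between the second (n = 2) and third (i = 3) instances, to get this: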

[1] "90.67.0"  "90.0"     "6.0"      "2"        "24.4.0"   "274.46.0"

Or, if n = 4 and i = 5:

[1] "1541" "151"  "A2"   "A"    "A6"   "A266"

Any suggestions? Thanks for your help!

3 answers:

Answer 0: (score: 3)

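You can do this with sub and gsub:

n = 2
i = 3

# Remove everything up to and including the n-th "_".
# sub() rather than gsub(): a global replace would also strip later groups of n fields.
pattern1 = paste0("(.*?_){", n,  "}")
temp = sub(pattern1, "", s)
# Keep the next (i - n) fields and drop the rest
pattern2 = paste0("((.*?_){", i-n,  "}).*")
temp = sub(pattern2, "\\1", temp)
# Strip the trailing "_"
temp = gsub("_$", "", temp)
temp
#[1] "90.67.0"  "90.0"     "6.0"      "2"        "24.4.0"   "274.46.0"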
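
The same regex idea can be wrapped in a reusable function. This is only a sketch, not the answerer's code: the name `between_regex` is made up for illustration, and it assumes the delimiter is a literal "_" and that each string contains at least i delimiters (otherwise the second substitution leaves the remainder unchanged).

```r
s <- c("CGA-DV-558_T_90.67.0_DV_1541_07", "TC-V-576_T_90.0_DV_151_0",
       "TCA-DV-X_T_6.0_D_A2_07", "T-V-Z_T_2_D_A_0",
       "CGA-DV-AW0_T.1_24.4.0_V_A6_7", "ACGA-DV-A4W0_T_274.46.0_DV_A266_07")

# Hypothetical helper: text between the n-th and i-th "_" (requires n < i)
between_regex <- function(x, n, i) {
    # Drop the first n fields together with their trailing "_"
    rest <- sub(paste0("^([^_]*_){", n, "}"), "", x, perl = TRUE)
    # Keep the next (i - n) fields; discard the i-th "_" and everything after it
    keep <- paste0("^((?:[^_]*_){", i - n - 1, "}[^_]*)_.*$")
    sub(keep, "\\1", rest, perl = TRUE)
}

between_regex(s, 2, 3)
#[1] "90.67.0"  "90.0"     "6.0"      "2"        "24.4.0"   "274.46.0"
between_regex(s, 4, 5)
#[1] "1541" "151"  "A2"   "A"    "A6"   "A266"
```

Using `[^_]*` instead of `.*?` makes each repetition consume exactly one field, so the result does not depend on how the regex engine handles lazy quantifiers.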

Answer 1: (score: 3)

#FUNCTION
# Split each string on "_" and paste fields (n+1) through i back together
foo = function(x, n, i){
    do.call(c, lapply(x, function(X)
        paste(unlist(strsplit(X, "_"))[(n+1):(i)], collapse = "_")))
}

#USAGE
foo(x = s, n = 3, i = 5)
#[1] "DV_1541" "DV_151"  "D_A2"    "D_A"     "V_A6"    "DV_A266"
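
For comparison, the split-and-paste approach could also be written with `vapply`, which guarantees a character vector as the result. This is a sketch under assumptions: `split_between` and its `delim` argument are hypothetical names, and every string is assumed to have at least i fields (missing fields would be pasted as "NA").

```r
s <- c("CGA-DV-558_T_90.67.0_DV_1541_07", "TC-V-576_T_90.0_DV_151_0",
       "TCA-DV-X_T_6.0_D_A2_07", "T-V-Z_T_2_D_A_0",
       "CGA-DV-AW0_T.1_24.4.0_V_A6_7", "ACGA-DV-A4W0_T_274.46.0_DV_A266_07")

# Hypothetical helper: fields (n + 1) through i of each string, re-joined with delim
split_between <- function(x, n, i, delim = "_") {
    # fixed = TRUE treats delim as a literal string rather than a regex
    parts <- strsplit(x, delim, fixed = TRUE)
    vapply(parts, function(p) paste(p[(n + 1):i], collapse = delim), character(1))
}

split_between(s, 2, 3)
#[1] "90.67.0"  "90.0"     "6.0"      "2"        "24.4.0"   "274.46.0"
split_between(s, 3, 5)
#[1] "DV_1541" "DV_151"  "D_A2"    "D_A"     "V_A6"    "DV_A266"
```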

Answer 2: (score: 2)

A third method: use gregexpr to find the delimiter positions and substring to do the extraction.
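
One way this approach might look, as a sketch rather than the answerer's actual code. The helper name `extract_between` and its `delim` argument are assumptions made for illustration; strings with fewer than i delimiters yield NA.

```r
s <- c("CGA-DV-558_T_90.67.0_DV_1541_07", "TC-V-576_T_90.0_DV_151_0",
       "TCA-DV-X_T_6.0_D_A2_07", "T-V-Z_T_2_D_A_0",
       "CGA-DV-AW0_T.1_24.4.0_V_A6_7", "ACGA-DV-A4W0_T_274.46.0_DV_A266_07")

# Hypothetical helper: text between the n-th and i-th occurrence of delim
extract_between <- function(x, n, i, delim = "_") {
    # Start positions of every delimiter in each string (literal match)
    pos <- gregexpr(delim, x, fixed = TRUE)
    # First character after the n-th delimiter
    from <- vapply(pos, function(p) p[n] + nchar(delim), integer(1))
    # Last character before the i-th delimiter
    to <- vapply(pos, function(p) p[i] - 1L, integer(1))
    substring(x, from, to)
}

extract_between(s, 2, 3)
#[1] "90.67.0"  "90.0"     "6.0"      "2"        "24.4.0"   "274.46.0"
extract_between(s, 4, 5)
#[1] "1541" "151"  "A2"   "A"    "A6"   "A266"
```

Because `substring` is vectorized over its start and end arguments, no explicit loop over the strings is needed once the positions are known.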