How do I use the substring function in R when the positions are vectors?

Asked: 2017-01-07 14:30:12

Tags: r string vector integer substring

I need to use the substring function to extract characters by position from each row of a data frame.

Here is the code I am using:

substring(df$Text1,
          df$'Location of Different Letters',
          df$'Location of Different Letters')

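The substring function introduces NAs in every row where the numbers are in string format. Any suggestions on how to make this work? Using as.integer on the column "Location of Different Letters" does not work because of the c() and : in the values.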
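The behaviour described above can be reproduced with a small sketch. The vectors txt and loc are hypothetical stand-ins for the two columns, using the same kinds of entries as the sample data in the answers. as.integer converts only strings that are plain numbers and yields NA for the rest, and those NA positions carry through to the result of substring.

```r
# Hypothetical stand-ins for the Text1 and "Location of Different Letters" columns
txt <- c("abcd", "FxyznADX", "Don123")
loc <- c("4", "c(1,6,7,8)", "3:6")

# Only "4" is a plain number; the other entries are coerced to NA (with a warning)
pos <- suppressWarnings(as.integer(loc))

# NA positions propagate into the output of substring()
res <- substring(txt, pos, pos)
```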

2 Answers:

Answer 0: (score: 2)

You have Location of different letters as a character column, which makes things a bit ugly because we have to use eval(parse(..)):

## create an index list
cmd <- paste0("list(", toString(df$"Location of different letters"), ")")
# [1] "list(4, c(1,6,7,8), 3:6)"
ind <- eval(parse(text = cmd))
## split your strings
s <- strsplit(df$Text1, "")
## use `mapply`
set1 <- mapply("[", s, ind)

## now compare with `Text2` to get different letters
set2 <- strsplit(df$Text2, "")
mapply(function (a, b) paste0(setdiff(a, b), collapse = ""), set1, set2)
# [1] "d"    "FADX" "123" 

Data:

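df <- data.frame(Text1 = c("abcd", "FxyznADX", "Don123"),
                 Text2 = c("abc", "xyzn", "Don"),
                 "Location of different letters" = c("4", "c(1,6,7,8)", "3:6"),
                 check.names = FALSE,
                 ## keep text columns as character; strsplit() fails on factors (the default before R 4.0)
                 stringsAsFactors = FALSE)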
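If evaluating strings as code is a concern, the positions can also be extracted with a regular expression. The following is only a sketch under a stated assumption: every entry has one of the three forms seen in the sample data (a single number, c(...) with plain numbers, or a:b). The helper parse_positions and the vectors text1 and loc are hypothetical names introduced for illustration.

```r
# Hypothetical helper: turn "4", "c(1,6,7,8)" or "3:6" into an integer vector
parse_positions <- function(x) {
  # pull out every run of digits, e.g. "c(1,6,7,8)" -> 1 6 7 8
  nums <- as.integer(regmatches(x, gregexpr("[0-9]+", x))[[1]])
  # treat "a:b" as a range; otherwise use the numbers as given
  if (grepl(":", x, fixed = TRUE)) seq(nums[1], nums[2]) else nums
}

text1 <- c("abcd", "FxyznADX", "Don123")
loc   <- c("4", "c(1,6,7,8)", "3:6")

ind <- lapply(loc, parse_positions)

# substring() is vectorised over the positions, so one call per row is enough
picked <- mapply(function(s, i) paste(substring(s, i, i), collapse = ""),
                 text1, ind, USE.NAMES = FALSE)
```

Here picked holds the characters found at the listed positions ("n123" for the third row), without the setdiff comparison against Text2 used above. Mixed entries such as c(1, 3:5) are not handled by this sketch.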

Answer 1: (score: 1)

This works if your Location of different letters column contains vectors of values.

out <- sapply(c(1, 6, 7, 8), FUN = function(x) substring("FxyznADX", first = x, last = x))

do.call(paste, args = list(as.list(out), collapse = ""))
[1] "FADX"

If the values are stored as character or factor, you may need to resort to eval(parse(...)):

sapply(eval(parse(text = "c(1, 6, 7, 8)")), FUN = function(x) substring("FxyznADX", first = x, last = x))

[1] "F" "A" "D" "X"
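As a side note, substring is vectorised over its first and last arguments, so the sapply loop may not be needed once the positions are available as a numeric vector. A minimal sketch, where idx is an illustrative name:

```r
idx <- c(1, 6, 7, 8)

# one call returns one character per position
chars <- substring("FxyznADX", first = idx, last = idx)

# collapse the pieces into a single string
joined <- paste(chars, collapse = "")
```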