Truncating a string at a certain character in R

Time: 2011-07-28 15:40:40

Tags: string r truncate

I have a list of strings in R that looks like this:

WDN.TO
WDR.N
WDS.AX
WEC.AX
WEC.N
WED.TO

I want to get the suffix of each string, starting at the "." character, so the result should look like this:

.TO
.N
.AX
.AX
.N
.TO

Does anyone have any ideas?

3 answers:

Answer 0: (score: 19)

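Joshua's solution is fine. I would use sub instead of gsub, though: gsub replaces every occurrence of a pattern in a string, while sub replaces only the first one. The pattern can also be simplified a bit: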

> x <- c("WDN.TO","WDR.N","WDS.AX","WEC.AX","WEC.N","WED.TO")
> sub("^[^.]*", "", x)
[1] ".TO" ".N"  ".AX" ".AX" ".N"  ".TO"
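A minimal sketch of the sub/gsub difference described above. The two-dot ticker "BRK.B.N" is a hypothetical input that does not appear in the question, and the unanchored pattern "[^.]+" is used here only to make the difference visible; with the anchored pattern the two functions should agree.

```r
# Hypothetical input: the second string contains two dots
z <- c("WDN.TO", "BRK.B.N")

# Anchored pattern: "^" can only match at the start, so one replacement per string
anchored_sub  <- sub("^[^.]*", "", z)
anchored_gsub <- gsub("^[^.]*", "", z)

# Unanchored pattern: sub removes only the first run of non-dot characters,
# gsub removes every such run and leaves just the dots
first_only <- sub("[^.]+", "", z)
all_runs   <- gsub("[^.]+", "", z)

print(first_only)
print(all_runs)
```

For data shaped like the question's, sub is therefore the safer choice: it keeps everything after the first dot even if later dots are present.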

... but if the strings are as regular as the ones in the question, simply stripping the first 3 characters is enough:

> x <- c("WDN.TO","WDR.N","WDS.AX","WEC.AX","WEC.N","WED.TO")
> substring(x, 4)
[1] ".TO" ".N"  ".AX" ".AX" ".N"  ".TO"
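The fixed offset only works when every prefix is exactly three characters long. As a sketch, assuming prefixes of varying length (the entry "ABCD.AX" is a hypothetical example), the position of the first literal dot can be looked up with regexpr and passed to substring:

```r
# "ABCD.AX" is a hypothetical ticker with a 4-character prefix
w <- c("WDN.TO", "WDR.N", "ABCD.AX")

# Fixed offset: cuts in the wrong place for the 4-character prefix
fixed_offset <- substring(w, 4)

# Find the first literal "." in each string (fixed = TRUE disables regex),
# then keep everything from that position onward
pos <- as.integer(regexpr(".", w, fixed = TRUE))
from_dot <- substring(w, pos)

print(fixed_offset)
print(from_dot)
```

Note that regexpr returns -1 for a string without a dot, in which case substring returns the whole string unchanged.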

Answer 1: (score: 13)

Using gsub:

x <- c("WDN.TO","WDS.N")
# replace everything from the start of the string to the "." with "."
gsub("^.*\\.",".",x)
# [1] ".TO" ".N" 

Using strsplit:

# strsplit returns a list; use sapply to get the 2nd obs of each list element
y <- sapply(strsplit(x,"\\."), `[`, 2)
# since we split on ".", we need to put it back
paste(".",y,sep="")
# [1] ".TO" ".N"
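One caveat with the strsplit approach: taking the second piece yields NA for a string that has no dot, and paste then turns that into the text ".NA". A minimal sketch of one way to guard against this, where "NODOT" is a hypothetical entry that is not in the question's data:

```r
# "NODOT" is a hypothetical entry without a "."
v <- c("WDN.TO", "NODOT")

# Split on a literal dot; each list element holds the pieces of one string
parts <- strsplit(v, ".", fixed = TRUE)

# Second piece of each element; NA where the string had no dot
second <- sapply(parts, `[`, 2)

# Put the dot back only where a suffix exists, otherwise return ""
suffix <- ifelse(is.na(second), "", paste0(".", second))

print(suffix)
```

Returning an empty string for dot-less input is a design choice made for this sketch; returning NA instead may suit other data better.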

Answer 2: (score: 0)

strsplit might also do it, but if the dataset is too large it reports the error "subscript out of bounds":

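x <- c("WDN.TO","WDR.N","WDS.AX","WEC.AX","WEC.N","WED.TO")
# split on a literal "." (fixed = TRUE); strsplit returns a list, so take the 2nd piece of each element
y <- sapply(strsplit(x, ".", fixed = TRUE), `[`, 2)
# output: y = "TO" "N" "AX" "AX" "N" "TO"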