Trimming a varying number of trailing special characters in R

Time: 2017-02-09 18:38:06

Tags: r regex gsub stringr

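Below is a gsub approach for trimming trailing forward slashes from strings built out of a data frame. I would like a more general solution that works for a data.frame with any number of columns.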

helloToday <- data.frame(a = c("hello", "hello", "hello"), 
                 b = c("world","","world"),
                 c = c("","","today"))

helloToday
#      a     b     c
# 1 hello world      
# 2 hello            
# 3 hello world today  


# Returns the vector 
helloToday <- apply(helloToday, 1, function(x){ paste0("/", paste(x, collapse = "/")) })
# [1] "/hello/world/"      "/hello//"           "/hello/world/today"

# But I would like the trailing symbols to be trimmed off
# [1] "/hello/world"      "/hello"           "/hello/world/today"


gsub("\\/$", "", gsub("\\/$", "", helloToday))
# [1] "/hello/world"       "/hello"             "/hello/world/today"

helloToday <- gsub("\\//$", "", helloToday)
helloToday <- gsub("\\/$", "", helloToday)
# [1] "/hello/world"       "/hello"             "/hello/world/today"

Is there a solution that copes with a varying number of columns, whether the string ends in "/", "//", or even "///////////"?

2 Answers:

Answer 0: (score: 3)

+ is the regular-expression quantifier meaning "one or more", so "/+$" matches any run of one or more / at the end of the string:

gsub("/+$", "", helloToday)
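
As a minimal sketch of how this pattern behaves, the snippet below applies it to strings with different numbers of trailing slashes. The vector `paths` is made up for illustration and is not from the question. Because the pattern is anchored with `$`, it can match at most once per string, so `sub()` should give the same result as `gsub()` here.

```r
# Hypothetical input: zero, one, two, and many trailing slashes
paths <- c("/hello/world/today", "/hello/world/", "/hello//", "/hello///////////")

# "/+$" matches one or more "/" immediately before the end of the string
trimmed <- gsub("/+$", "", paths)
print(trimmed)
# [1] "/hello/world/today" "/hello/world"       "/hello"             "/hello"
```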

Answer 1: (score: 1)

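Instead of trimming after the fact, an alternative is to build the strings differently in the first place, dropping the empty cells before joining. Here helloToday is the original data frame, not the character vector it was overwritten with in the question: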

apply(helloToday, 1, function(x) do.call(file.path, as.list(x[!x %in% ''])))


## [1] "hello/world"       "hello"             "hello/world/today"

If a leading slash is needed:

apply(helloToday, 1, function(x) do.call(file.path, as.list(c('', x[!x %in% '']))))
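
The same idea can be wrapped in a small helper so it applies to a data frame with any number of columns. This is only a sketch: the function name `row_paths`, its `leading` argument, and the extra empty column `d` are assumptions for illustration and are not part of the answer. It also calls `unname()` on the row values, a small addition so that column names are never passed to `file.path()` as argument names.

```r
# Hypothetical helper (name and arguments are illustrative, not from the answer)
row_paths <- function(df, leading = TRUE) {
  apply(df, 1, function(x) {
    parts <- unname(x[!x %in% ""])       # drop empty cells and column names
    if (leading) parts <- c("", parts)   # an empty first component yields a leading "/"
    do.call(file.path, as.list(parts))
  })
}

# Same data as the question plus one extra, entirely empty column
df <- data.frame(a = c("hello", "hello", "hello"),
                 b = c("world", "", "world"),
                 c = c("", "", "today"),
                 d = c("", "", ""),
                 stringsAsFactors = FALSE)

result <- row_paths(df)
print(result)
# "/hello/world"  "/hello"  "/hello/world/today"
```

Since empty cells are removed before joining, no trailing slashes are produced and no regex clean-up is needed afterwards.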