Remove a trailing "/"

Time: 2019-05-14 13:37:42

Tags: r

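One of the columns contains data in the form of web URLs. Some of these URLs end with a "/", which I need to remove, without touching any other "/" that may appear elsewhere in the URL.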

I tried gsub, but that is a slippery slope, because it just strips every "/" character. I only need to remove the one at the end of the URL.

#read in IIS logfile
logfile = "u_ex190510.log"
logcols = read.table(logfile, header = FALSE, 
                      sep = " ", skip = 3, nrows = 1, comment.char = "")
iislog = read.table(logfile, header = FALSE, sep = " ",comment.char = "#")
logcols[,1] <- NULL
names(iislog) <- unlist(logcols[1,])
View(iislog)
#rename the columns
colnames(iislog)= c('date','time','sourceIP','csMethod','csUriStem',
                    'csUriQuery','sourcePort','csUsername','clientIP','userAgent','csReferer',
                    'scStatus','scSubstatus','scWin32Status','timeTaken')
#load libraries used for date changes and sorting
library(dplyr)
library(lubridate)
#change data type for date and time columns
iislog$date <- ymd(iislog$date)
iislog$time <- hms(iislog$time)
#create subset of the original data
iislog1 <- iislog %>% select(date,time,csUriStem,timeTaken)
#ensure the csUriStem column is in all lowercase. This is because the URLs
#seem to have mixed case and therefore can show up more than once.
iislog1$csUriStem <- tolower(iislog1$csUriStem)
iislog1
#Find unique URLs by grouping.
iislog2 <- iislog1 %>% group_by(csUriStem) %>% summarise(count=n())
#arrange the results by csUriStem. It would be nice to do this in ascending order.
iislog3 <- arrange(iislog2,desc(csUriStem), .by_group=TRUE)
iislog3

1 answer:

Answer 0 (score: 1)

How about this?

my.url <- "http://test.test/test.test.html/"
gsub("/$", "", my.url)

It returns

[1] "http://test.test/test.test.html"

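In a regular expression, $ anchors the match to the end of the string, so the pattern "/$" matches only a "/" that is the last character and leaves every other "/" alone.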
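
Because sub and gsub are vectorized, the same pattern should work on a whole column at once. The following is a minimal sketch, assuming a character vector of made-up paths (uri is a hypothetical stand-in for iislog1$csUriStem, not data from the question). It contrasts the unanchored pattern with the anchored one.

```r
# Hypothetical sample values standing in for the csUriStem column.
uri <- c("/home/", "/products/item.html", "/a/b/c/", "/")

# An unanchored pattern removes every slash, which is the problem
# described in the question.
all_removed <- gsub("/", "", uri)

# Anchoring with $ removes only a slash that is the last character.
# sub is enough here, since there can be at most one match per string.
trimmed <- sub("/$", "", uri)

# Optional variant: keep a bare "/" (the site root) intact by requiring
# at least one character before the trailing slash (needs perl = TRUE).
trimmed_keep_root <- sub("(?<=.)/$", "", uri, perl = TRUE)

print(trimmed)
print(trimmed_keep_root)
```

Applied to the question's data, this would presumably read iislog1$csUriStem <- sub("/$", "", iislog1$csUriStem), placed before the group_by step so that "/page" and "/page/" are counted together. Note that with the plain "/$" pattern a value consisting only of "/" becomes an empty string; the lookbehind variant avoids that if the root path should stay distinct.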