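One of the columns contains data in the form of web URLs. Some of these URLs end with "/", and I need to remove it without touching any other "/" that may appear elsewhere in the URL.
I tried gsub, but that is a slippery slope because it removes every "/" character. I only need to remove the one at the end of the URL.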
#read in IIS logfile
logfile = "u_ex190510.log"
logcols = read.table(logfile, header = FALSE,
sep = " ", skip = 3, nrows = 1, comment.char = "")
iislog = read.table(logfile, header = FALSE, sep = " ",comment.char = "#")
logcols[,1] <- NULL
names(iislog) <- unlist(logcols[1,])
View(iislog)
#rename the columns
colnames(iislog)= c('date','time','sourceIP','csMethod','csUriStem',
'csUriQuery','sourcePort','csUsername','clientIP','userAgent','csReferer',
'scStatus','scSubstatus','scWin32Status','timeTaken')
#load libraries used for date changes and sorting
library(dplyr)
library(lubridate)
#change data type for date and time columns
iislog$date <- ymd(iislog$date)
iislog$time <- hms(iislog$time)
#create subset of the original data
iislog1 <- iislog %>% select(date,time,csUriStem,timeTaken)
#ensure the csUriStem column is in all lowercase. This is because the URLs
#seem to have mixed case and therefore can show up more than once.
iislog1$csUriStem <- tolower(iislog1$csUriStem)
iislog1
#Find unique URLs by grouping.
iislog2 <- iislog1 %>% group_by(csUriStem) %>% summarise(count=n())
#arrange the results by csUriStem. It would be nice to do this in ascending order.
iislog3 <- arrange(iislog2,desc(csUriStem), .by_group=TRUE)
iislog3
Answer 0 (score: 1)
How about this?
my.url <- "http://test.test/test.test.html/"
gsub("/$", "", my.url)
It returns
[1] "http://test.test/test.test.html"
You can use $ to anchor the pattern at the end of the string, so only a trailing "/" is matched.
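Applied to a whole column, the same pattern might look like the sketch below. The vector uri_stems is made-up sample data standing in for iislog1$csUriStem, not values from the actual log. Because the pattern is anchored, it can match at most once per string, so sub() is enough. Note that a bare root path "/" would become an empty string; the second variant is one possible way to keep it, assuming that is the behavior you want.

```r
# Hypothetical sample standing in for iislog1$csUriStem (not real log data).
uri_stems <- c("/default.aspx", "/products/", "/products/list/", "/")

# "/$" matches a slash only at the very end of each string.
# sub() replaces the first match; with the anchor there is at most one.
trimmed <- sub("/$", "", uri_stems)

# Variant: the lookbehind "(?<=.)" requires some character before the
# final slash, so the bare root path "/" is left unchanged.
trimmed_keep_root <- sub("(?<=.)/$", "", uri_stems, perl = TRUE)

print(trimmed)
print(trimmed_keep_root)
```

With the real data, this would be something like iislog1$csUriStem <- sub("/$", "", iislog1$csUriStem), run before the group_by step so that URLs with and without the trailing slash are counted together.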