Sorting a character vector in R

Time: 2015-01-26 15:04:54

Tags: r

I have this vector, and I need to sort it in descending order so that the most recent txt file comes first:

d<-c("/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_04_2015.txt","/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_11_2015.txt","/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_18_2015.txt","/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_08-01_25_2015.txt","/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-11_25-01_20_2015.txt")

When I do this:

d <- sort(d)

d[1]
# "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_04_2015.txt"

I need the first element to be this instead:

"/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_08-01_25_2015.txt"

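I should be able to sort by the timestamp in the file name, e.g. "11_25-01_20_2015", where 11 is the hour, 25 the minutes, 01 the month, 20 the day and 2015 the year, i.e. hour_minute-month_day_year.

How can I do this?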

4 Answers:

Answer 0: (score: 5)

If the end of the strings is consistent (bla-bla-bla-time-date.txt), you can use substring to extract the time. Then convert the times with as.POSIXct and use them in order:

time <- substring(d, first = nchar(d)-19)
d[order(as.POSIXct(time, format = "%H_%M-%m_%d_%Y.txt"), decreasing = TRUE)]
# [1] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_08-01_25_2015.txt"
# [2] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-11_25-01_20_2015.txt"
# [3] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_18_2015.txt"
# [4] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_11_2015.txt"
# [5] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_04_2015.txt"
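
The substring approach depends on every path ending in exactly 20 characters of timestamp plus ".txt". As a minimal sketch of how that assumption could be checked, the snippet below uses hypothetical sample paths (not the ones from the question), flags entries that fail to parse, and pushes them to the end of the result. The tz = "UTC" argument is also an assumption, added only so the parse does not depend on the session's time zone.

```r
# Hypothetical sample paths; the first one does not follow the pattern
paths <- c("/a/notes.txt",
           "/a/Report-11_25-01_20_2015.txt",
           "/a/Report-02_08-01_25_2015.txt")

# Parse the last 20 characters; strings that do not match become NA
stamp <- as.POSIXct(substring(paths, first = nchar(paths) - 19),
                    format = "%H_%M-%m_%d_%Y.txt", tz = "UTC")

# Entries whose file name could not be parsed
bad <- paths[is.na(stamp)]

# Most recent first; na.last = TRUE keeps unparsable entries at the end
sorted <- paths[order(stamp, decreasing = TRUE, na.last = TRUE)]
```

Inspecting bad before sorting makes it obvious when a file name breaks the fixed-width assumption, instead of silently ending up with NA timestamps.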

Answer 1: (score: 3)

First, you should extract the times and put them into a sensible format:

times <- as.POSIXct(sub("^.+Report-([0-9]+)_([0-9]+)-([0-9]+)_([0-9]+)_([0-9]+)\\.txt$","\\5-\\3-\\4 \\1:\\2",d))
times
[1] "2015-01-04 02:06:00 GMT" "2015-01-11 02:06:00 GMT"
[3] "2015-01-18 02:06:00 GMT" "2015-01-25 02:08:00 GMT"
[5] "2015-01-20 11:25:00 GMT"

Then you can use these to order the original data:

d[order(times, decreasing=TRUE)][1]
[1] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_08-01_25_2015.txt"
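
To see what the regular expression does before the conversion, it can help to look at the intermediate string for a single path. This is only an illustrative sketch: the variable names x and iso are made up here, and tz = "UTC" is an assumption so that the result does not depend on the local time zone.

```r
x <- "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-11_25-01_20_2015.txt"

# Capture groups: \\1 hour, \\2 minute, \\3 month, \\4 day, \\5 year,
# rearranged into "year-month-day hour:minute"
iso <- sub("^.+Report-([0-9]+)_([0-9]+)-([0-9]+)_([0-9]+)_([0-9]+)\\.txt$",
           "\\5-\\3-\\4 \\1:\\2", x)
iso
# [1] "2015-01-20 11:25"

# This layout is one of the formats as.POSIXct tries by default
parsed <- as.POSIXct(iso, tz = "UTC")
```

Because the rearranged string is already in a standard layout, no format argument is needed in the as.POSIXct call.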

Answer 2: (score: 2)

Try this:

# trim everything before the string 'Report-'
dateString <- gsub('^.*Report-', '', d)
# trim the '.txt' from the end
dateString <- gsub('\\.txt$', '', dateString)
# convert the date string to a date-time object (format passed by name)
dateTime <- as.POSIXct(dateString, format = '%H_%M-%m_%d_%Y')
# sort on date-time, most recent first
d <- d[order(dateTime, decreasing = TRUE)]
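
One detail worth keeping in mind with as.POSIXct is that its second positional argument is tz, not format, so the format string is safest passed by name. A small sketch on a single hypothetical timestamp string follows; tz = "UTC" is an assumption for reproducibility.

```r
stamp <- "11_25-01_20_2015"

# The second formal argument of as.POSIXct is the time zone
second_arg <- names(formals(as.POSIXct))[2]
second_arg
# [1] "tz"

# Passing the format by name parses hour_minute-month_day_year as intended
parsed <- as.POSIXct(stamp, format = "%H_%M-%m_%d_%Y", tz = "UTC")
format(parsed, "%Y-%m-%d %H:%M")
# [1] "2015-01-20 11:25"
```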

Answer 3: (score: 2)

You can extract the dates, convert them to the POSIXct class, and then use which.max to get the latest date:

library(stringi)
indx <- as.POSIXct(stri_extract_first_regex(d, "(?<=Report-).*(?=\\.txt)"), format = "%H_%M-%m_%d_%Y")
d[which.max(indx)]
# [1] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_08-01_25_2015.txt"

Or you can sort in decreasing order:

d[order(indx, decreasing = TRUE)]
# [1] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_08-01_25_2015.txt"
# [2] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-11_25-01_20_2015.txt"
# [3] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_18_2015.txt"
# [4] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_11_2015.txt"
# [5] "/SiteScope/accounts/login59/htdocs/Reports-1722992141/Report-02_06-01_04_2015.txt"
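
If installing stringi is not an option, the same lookaround pattern can probably be used with base R, via regexpr with perl = TRUE and regmatches. The sketch below runs on two hypothetical paths (d_example is a made-up name), with tz = "UTC" assumed for reproducibility. Note that regmatches drops elements that do not match, so this only lines up with the input when every path matches.

```r
d_example <- c("/x/Report-02_06-01_04_2015.txt",
               "/x/Report-02_08-01_25_2015.txt")

# perl = TRUE enables the lookbehind/lookahead used in the pattern
m <- regexpr("(?<=Report-).*(?=\\.txt)", d_example, perl = TRUE)
stamps <- regmatches(d_example, m)

# Guard against silently dropped non-matching paths
stopifnot(length(stamps) == length(d_example))

indx_base <- as.POSIXct(stamps, format = "%H_%M-%m_%d_%Y", tz = "UTC")
latest <- d_example[which.max(indx_base)]
```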