R - Converting a date format using string methods

Date: 2014-04-13 19:15:44

Tags: string r substring string-parsing

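I currently have a data frame with 15 variables and roughly 3 million rows.

One of the columns is a date column in the format yyyymmdd, and my goal is to reformat that string to yyyymm01 if dd is >= 1 and <= 14, and to yyyymm02 otherwise.

When I run my code, I get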

Error in 1:end : NA/NaN argument

I'm not sure why. My code is below.

for(i in 1:end)
{
technical.montday[i] = substr(toString(technical$datadate[i]), start = 1, stop = 6)
technical$datadate[i] =  ifelse((as.integer(substr(toString(technical$datadate[i]),start =     7, stop = 8)) >= 1) && (as.integer(substr(toString(technical$datadate[i]),start = 7, stop =  8))<=14),paste(technical.montday,"01", sep=""), paste(technical.montday,"15", sep="") )
}
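The post does not show how end is defined, so the cause of the error cannot be confirmed from the code above. One way to get this message is a loop bound that is NA. The sketch below assumes that case and uses a small hypothetical data frame in place of the real one; bad_bound, idx and days are illustrative names, not part of the original code.

```r
# Hypothetical stand-in for the poster's data frame.
technical <- data.frame(datadate = c(20140105L, 20140220L, 20131216L))

# Assumption: the loop bound was NA. 1:NA is an error in R.
bad_bound <- NA
failed <- inherits(try(1:bad_bound, silent = TRUE), "try-error")

# seq_len(nrow(...)) derives the bound from the data itself and is
# empty (rather than c(1, 0)) when the data frame has no rows.
idx <- seq_len(nrow(technical))

# Placeholder loop body: extract the day part of each row.
days <- integer(length(idx))
for (i in idx) {
  days[i] <- as.integer(substr(technical$datadate[i], 7, 8))
}
print(days)
```

Deriving the index from nrow() avoids depending on a separately maintained variable, which is why it is used here.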

2 answers:

Answer 0 (score: 0)

  

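One of the columns is a date column in the format yyyymmdd, and my goal is to reformat that string to yyyymm01 if dd is >= 1 and <= 14, and to yyyymm02 otherwise.

I don't understand your code, but what you describe can be done, for example, like this: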

# suppose DATE is the date column
dd <- as.integer(substr(DATE, 7,8))
DATE <- paste0(substr(DATE, 1, 6), ifelse(dd <= 14 & dd >= 1, "01", "02"))

The ifelse part could probably be shortened to ifelse(dd<=14, "01", "02"). If you need DATE to be numeric, wrap the result in as.numeric or as.integer.
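A minimal sketch of the shortened form with the numeric conversion applied, assuming DATE is stored as integers. The sample values and the name out are made up for illustration.

```r
# Hypothetical sample: dates stored as integers in yyyymmdd form.
DATE <- c(20140401L, 20140414L, 20140415L, 20140430L)

# substr() coerces its input to character, so this also works on integers.
dd <- as.integer(substr(DATE, 7, 8))

# Shortened condition: a valid day is never below 1, so dd <= 14 suffices.
out <- as.integer(paste0(substr(DATE, 1, 6), ifelse(dd <= 14, "01", "02")))
print(out)
```

The shortened condition relies on every row holding a valid day; if the column can contain malformed values, the two-sided test is the safer choice.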

(Edit)

Using substring replacement may be more efficient:

DATE <- as.character(DATE)
substr(DATE, 7,8) <- ifelse(substr(DATE, 7,8) > 14, "02", "01")

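(Note that substr(DATE, 7, 8) > 14 compares a character vector with a number, so R coerces 14 to "14" and compares strings. Because the day is always two zero-padded digits, the result is the same as a numeric comparison.) It works: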

> DATE <- as.character(20140401:20140430)
> substr(DATE, 7,8) <- ifelse(substr(DATE, 7,8) > 14, "02", "01")
> DATE
 [1] "20140401" "20140401" "20140401" "20140401" "20140401" "20140401"
 [7] "20140401" "20140401" "20140401" "20140401" "20140401" "20140401"
[13] "20140401" "20140401" "20140402" "20140402" "20140402" "20140402"
[19] "20140402" "20140402" "20140402" "20140402" "20140402" "20140402"
[25] "20140402" "20140402" "20140402" "20140402" "20140402" "20140402"
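A variant of the same replacement that converts the day to an integer before comparing, so the result does not depend on how R coerces the two operands. This is a sketch on the same sample vector; dd is an illustrative name.

```r
DATE <- as.character(20140401:20140430)

# Convert the day part explicitly, then compare numbers with numbers.
dd <- as.integer(substr(DATE, 7, 8))

# Replacement form of substr(): overwrite characters 7-8 in place.
substr(DATE, 7, 8) <- ifelse(dd > 14, "02", "01")

print(table(DATE))
```

With this sample, days 1 to 14 map to "01" and days 15 to 30 map to "02", matching the output shown above.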

Answer 1 (score: 0)

Maybe take a different approach:

technical <- data.frame(datadate = c("20140101", "20140203", "20131216", "20131130"), 
    stringsAsFactors = FALSE)

print(technical$datadate)
## [1] "20140101" "20140203" "20131216" "20131130"

technical$datadate <- sapply(technical$datadate, function(x) {

    year.mon <- substr(x, 1, 6)
    dd <- as.numeric(substr(x, 7, 8))

    return(paste(year.mon, ifelse((dd > 14), "02", "01"), sep = "", collapse = ""))

})

print(technical$datadate)
## [1] "20140101" "20140201" "20131202" "20131102"

Note: paste0 may be faster, which could matter in your case. I chose sapply for similar reasons.
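As a sketch of the paste0 variant mentioned in the note, the same logic can be written with vapply, which checks that each result is a single string. The helper name relabel is hypothetical. Because substr, ifelse and paste0 are all vectorised, the helper can also be called on the whole column at once, shown here for comparison.

```r
technical <- data.frame(
  datadate = c("20140101", "20140203", "20131216", "20131130"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep yyyymm, replace dd with "01" or "02".
relabel <- function(x) {
  dd <- as.numeric(substr(x, 7, 8))
  paste0(substr(x, 1, 6), ifelse(dd > 14, "02", "01"))
}

# Element-by-element, with a type check and without names on the result.
looped <- vapply(technical$datadate, relabel, character(1), USE.NAMES = FALSE)

# Whole column in one call.
vectorised <- relabel(technical$datadate)

print(identical(looped, vectorised))
```

On a column with about 3 million rows the single vectorised call is likely to be cheaper than calling the function once per element, though no timing was measured here.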