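I currently have a data frame with 15 variables and about 3 million rows.
One of the columns is a date column in the format yyyymmdd, and my goal is to reformat the string to yyyymm01 if dd is >=1 and <=14, and to yyyymm02 otherwise.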
When I run my code, I get
Error in 1:end : NA/NaN argument
I'm not sure why. My code is below.
for(i in 1:end)
{
technical.montday[i] = substr(toString(technical$datadate[i]), start = 1, stop = 6)
technical$datadate[i] = ifelse((as.integer(substr(toString(technical$datadate[i]),start = 7, stop = 8)) >= 1) && (as.integer(substr(toString(technical$datadate[i]),start = 7, stop = 8))<=14),paste(technical.montday,"01", sep=""), paste(technical.montday,"15", sep="") )
}
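Answer 0 (score: 0)
"One of the columns is a date column in the format yyyymmdd, and my goal is to reformat the string to yyyymm01 if dd is >=1 and <=14, and to yyyymm02 otherwise."
I don't understand your code, but what you describe can be done, for example, like this: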
# suppose DATE is the date column
dd <- as.integer(substr(DATE, 7,8))
DATE <- paste0(substr(DATE, 1, 6), ifelse(dd<=14 & dd>=1, "01", "02"))
The ifelse part could probably be shortened to ifelse(dd<=14, "01", "02"). If you need DATE to be numeric, wrap the result in as.numeric or as.integer.
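Putting the pieces above together, here is a minimal sketch that uses the shortened condition and returns integers. The helper name to_half_month and the sample values are made up for illustration and are not part of the original code.

```r
# Hypothetical helper: maps yyyymmdd to yyyymm01 when dd <= 14, else yyyymm02.
to_half_month <- function(date) {
  date <- as.character(date)              # works for integer or character input
  dd <- as.integer(substr(date, 7, 8))    # day of month
  as.integer(paste0(substr(date, 1, 6), ifelse(dd <= 14, "01", "02")))
}

# Made-up sample values around the day-14 boundary
to_half_month(c(20140401L, 20140414L, 20140415L, 20140430L))
```

Because every step is vectorized, the whole column can be passed in a single call, with no loop over rows.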
Using substring replacement may be more efficient:
DATE <- as.character(DATE)
substr(DATE, 7,8) <- ifelse(substr(DATE, 7,8) > 14, "02", "01")
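(Note that in this comparison R coerces 14 to the string "14" rather than converting substr(DATE,7,8) to a number. Because the day is always two zero-padded digits, string order agrees with numeric order here.) It works: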
> DATE <- as.character(20140401:20140430)
> substr(DATE, 7,8) <- ifelse(substr(DATE, 7,8) > 14, "02", "01")
> DATE
[1] "20140401" "20140401" "20140401" "20140401" "20140401" "20140401"
[7] "20140401" "20140401" "20140401" "20140401" "20140401" "20140401"
[13] "20140401" "20140401" "20140402" "20140402" "20140402" "20140402"
[19] "20140402" "20140402" "20140402" "20140402" "20140402" "20140402"
[25] "20140402" "20140402" "20140402" "20140402" "20140402" "20140402"
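As for the error itself: one likely cause, assuming end was never assigned a value in the question's session, is that 1:end picks up the function stats::end instead of a row count. The sketch below shows that, along with a safer loop bound. The small data frame is a made-up stand-in for the real one.

```r
# With no user-defined `end`, the name resolves to the function stats::end
is.function(end)

# Made-up stand-in for the real data frame
technical <- data.frame(datadate = c(20140101L, 20140203L, 20131216L))

# seq_len(nrow(...)) yields 1..n and is empty for a zero-row data frame
for (i in seq_len(nrow(technical))) {
  date_i <- as.character(technical$datadate[i])
  dd <- as.integer(substr(date_i, 7, 8))
  suffix <- if (dd <= 14) "01" else "02"
  technical$datadate[i] <- as.integer(paste0(substr(date_i, 1, 6), suffix))
}
technical$datadate
```

The loop is shown only to mirror the question. With 3 million rows, the vectorized versions are the better choice.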
Answer 1 (score: 0)
Perhaps take a different approach:
technical <- data.frame(datadate = c("20140101", "20140203", "20131216", "20131130"),
                        stringsAsFactors = FALSE)
print(technical$datadate)
## [1] "20140101" "20140203" "20131216" "20131130"
technical$datadate <- sapply(technical$datadate, function(x) {
  year.mon <- substr(x, 1, 6)
  dd <- as.numeric(substr(x, 7, 8))
  return(paste(year.mon, ifelse((dd > 14), "02", "01"), sep = "", collapse = ""))
}, USE.NAMES = FALSE)  # drop the names sapply would otherwise attach
print(technical$datadate)
## [1] "20140101" "20140201" "20131202" "20131102"
Note: paste0 may be faster, which could matter in your case. That is also why I went with sapply.
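Since substr, ifelse and paste0 are all vectorized, the same result can also be obtained without sapply. Here is a sketch on the same sample data, offered as an alternative and not as a measured speed-up:

```r
technical <- data.frame(datadate = c("20140101", "20140203", "20131216", "20131130"),
                        stringsAsFactors = FALSE)

# Operate on the whole column at once instead of element by element
dd <- as.numeric(substr(technical$datadate, 7, 8))
technical$datadate <- paste0(substr(technical$datadate, 1, 6),
                             ifelse(dd > 14, "02", "01"))
print(technical$datadate)
## [1] "20140101" "20140201" "20131202" "20131102"
```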