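I have weekly data in the form yyyy-ww, where ww is the two-digit week number. The data span 2007-01 to 2010-30. The week-counting convention is ISO 8601, under which, as described in Wikipedia's "Week number" article, a year occasionally has 53 weeks. For example, 2009 had 53 weeks under this system; see the week numbers in this ISO 8601 calendar. (Compare other years; according to the Wikipedia article, 53rd weeks are fairly rare.)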
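Basically, I want to read the week dates, convert them to Date objects, and save them to a separate column in a data.frame. As a test, I converted the Date objects back to yyyy-ww format with format([Date-object], format = "%Y-%W"), and this broke down at 2009-53: R does not interpret that week as a date and returns NA. This is very odd, because some years that do not have a 53rd week (in the ISO 8601 standard) convert fine, such as 2007-53, while other years that also lack a 53rd week (in the ISO 8601 standard) fail, such as 2008-53. The following minimal example demonstrates the issue.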
Minimal example:
dates <- c("2009-50", "2009-51", "2009-52", "2009-53", "2010-01", "2010-02")
as.Date(x = paste(dates, 1), format = "%Y-%W %w")
# [1] "2009-12-14" "2009-12-21" "2009-12-28" NA "2010-01-04"
# [6] "2010-01-11"
other.dates <- c("2007-53", "2008-53", "2009-53", "2010-53")
as.Date(x = paste(other.dates, 1), format = "%Y-%W %w")
# [1] "2007-12-31" NA NA NA
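A likely explanation, offered as a sketch rather than a definitive diagnosis: in base R, %W is not the ISO 8601 week number. It counts weeks from the first Monday of the year, and any days before that Monday fall in week 00. The ISO week number and its week-based year are %V and %G, which ?strptime describes as accepted but ignored on input, so they can be used with format() but not for parsing. The three sample dates below are chosen purely for illustration, and the example assumes your platform supports %G and %V on output.

```r
# Illustrative dates around the 2009/2010 boundary (chosen for this sketch).
d <- as.Date(c("2009-12-28", "2010-01-03", "2010-01-04"))

# %W: week of the year with Monday as the first day of the week;
# days before the year's first Monday are in week 00.
monday.week <- format(d, "%Y-%W")

# %G-%V: ISO 8601 week-based year and week number (output only).
iso.week <- format(d, "%G-%V")

data.frame(date = d, monday.week, iso.week)
#         date monday.week iso.week
# 1 2009-12-28     2009-52  2009-53
# 2 2010-01-03     2010-00  2009-53
# 3 2010-01-04     2010-01  2010-01
```

Under %W, 2009 never reaches week 53, which is consistent with the NA above, whereas 2007 starts on a Monday and so does reach it.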
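The question is: how do I get R to accept week numbers in ISO 8601 format?
Note: This question summarises a problem I have been struggling with for several hours. I have searched and found various helpful posts, such as this, but none of them solved the problem.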
Answer 0 (score: 11)
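The ISOweek package handles ISO 8601 style week numbering, converting to and from Date objects in R. See ISOweek for details. Continuing with the example dates above, we first need to massage the format a little. The dates must be in the form yyyy-Www-w rather than yyyy-ww, e.g. 2009-W53-1. The final digit identifies the day of the week used to pin down a date within the week; in this case it is Monday. The week number must be two digits.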
library(ISOweek)
dates <- c("2009-50", "2009-51", "2009-52", "2009-53", "2010-01", "2010-02")
other.dates <- c("2007-53", "2008-53", "2009-53", "2010-53")
dates <- sub("(\\d{4}-)(\\d{2})", "\\1W\\2-1", dates)
other.dates <- sub("(\\d{4}-)(\\d{2})", "\\1W\\2-1", other.dates)
## Check:
dates
# [1] "2009-W50-1" "2009-W51-1" "2009-W52-1" "2009-W53-1" "2010-W01-1"
# [6] "2010-W02-1"
(iso.date <- ISOweek2date(dates)) # week 53 of 2009 is handled correctly
# [1] "2009-12-07" "2009-12-14" "2009-12-21" "2009-12-28" "2010-01-04"
# [6] "2010-01-11"
(iso.other.date <- ISOweek2date(other.dates)) # also deals with this
# [1] "2007-12-31" "2008-12-29" "2009-12-28" "2011-01-03"
## Check that back-conversion works:
all(date2ISOweek(iso.date) == dates)
# [1] TRUE
## This does not work for the others, since the 53rd week of
## e.g. 2008 is back-converted to the first week of 2009, in
## line with the ISO 8601 standard.
date2ISOweek(iso.other.date) == other.dates
# [1] FALSE FALSE TRUE FALSE
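If installing a package is not an option, the same conversion can be sketched in base R, and the result stored in a data.frame column as the question asks. This relies on the ISO 8601 rule that 4 January always falls in week 1. The helper name iso_week_monday and the data.frame weekly are hypothetical, introduced only for this illustration. The helper does no validation, so a non-existent week such as 2008-53 simply rolls over into the next year, much as ISOweek2date does above.

```r
# Hypothetical helper (not part of ISOweek): Monday of the ISO week "yyyy-ww".
iso_week_monday <- function(x) {
  year <- as.integer(substr(x, 1, 4))
  week <- as.integer(substr(x, 6, 7))
  # 4 January is always in ISO week 1.
  jan4 <- as.Date(sprintf("%04d-01-04", year))
  # %u is the ISO weekday: 1 = Monday, ..., 7 = Sunday.
  wday <- as.integer(format(jan4, "%u"))
  week1.monday <- jan4 - (wday - 1)
  week1.monday + 7 * (week - 1)
}

# Illustrative data; the column names are assumptions.
weekly <- data.frame(week = c("2009-52", "2009-53", "2010-01"),
                     stringsAsFactors = FALSE)
weekly$date <- iso_week_monday(weekly$week)

# Back-conversion check with the ISO output formats %G and %V.
weekly$check <- format(weekly$date, "%G-%V")
weekly
#      week       date   check
# 1 2009-52 2009-12-21 2009-52
# 2 2009-53 2009-12-28 2009-53
# 3 2010-01 2010-01-04 2010-01
```

The package remains the more robust choice, since it also checks the input format; the sketch is mainly useful for seeing how the week-to-date arithmetic works.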