I have a character vector of unformatted dates representing the year covered. It looks like this:
Period of coverage
1 1/1/2011 to 31/12/2011
2 1/1/2010 to 31/12/2010
3 1/1/2012 to 31/12/2012
4 1/1/2010 to 31/12/2010
5 1/1/2011 to 31/12/2011
6 1/1/2012 to 31/12/2012
7 1/1/2010 to 31/12/2010
8 1/1/2010 to 31/12/2010
9 1/1/2009 to 31/12/2009
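I would like to know how to convert the column into the year that each observation represents. Every row has the same start and end day (1/1 and 31/12).
Answer 0: (score: 1)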
This answer assumes your data is stored in a variable named period and that every date keeps the unchanged format you describe.
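A minimal sketch of one way to do this, assuming period is a character vector in which every entry ends in a four-digit year. The sample values are copied from the question; the choice of substr and nchar is an assumption for illustration and not necessarily the method this answer originally showed.

```r
# Sample values copied from the question
period <- c("1/1/2011 to 31/12/2011",
            "1/1/2010 to 31/12/2010",
            "1/1/2009 to 31/12/2009")

# Take the last four characters of each string and convert them to numbers
n <- nchar(period)
year <- as.numeric(substr(period, n - 3, n))
year
```

Because substr is vectorized over its start and stop arguments, this also works if the strings differ in length, as long as the year comes last.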
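Answer 1: (score: 1)
Assuming DF as shown reproducibly in the Note at the end, remove everything up to and including the last slash and convert to numeric: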
transform(DF, year = as.numeric(sub(".*/", "", `Period of coverage`)), check.names = FALSE)
giving:
Period of coverage year
1 1/1/2011 to 31/12/2011 2011
2 1/1/2010 to 31/12/2010 2010
3 1/1/2012 to 31/12/2012 2012
4 1/1/2010 to 31/12/2010 2010
5 1/1/2011 to 31/12/2011 2011
6 1/1/2012 to 31/12/2012 2012
7 1/1/2010 to 31/12/2010 2010
8 1/1/2010 to 31/12/2010 2010
9 1/1/2009 to 31/12/2009 2009
Another possibility is to convert it to Date class first, noting that as.Date ignores junk at the end:
to_year <- function(x, fmt) as.numeric(format(as.Date(x, fmt), "%Y"))
transform(DF, year = to_year(`Period of coverage`, "%d/%m/%Y"), check.names = FALSE)
Note: the input DF in reproducible form:
Lines <- "Period of coverage
1/1/2011 to 31/12/2011
1/1/2010 to 31/12/2010
1/1/2012 to 31/12/2012
1/1/2010 to 31/12/2010
1/1/2011 to 31/12/2011
1/1/2012 to 31/12/2012
1/1/2010 to 31/12/2010
1/1/2010 to 31/12/2010
1/1/2009 to 31/12/2009"
DF <- read.csv(text = Lines, check.names = FALSE, as.is = TRUE)
Answer 2: (score: 1)
If your strings always have the same format, you can simply use substr and convert the result to a date:
# as.Date with only %Y fills in the current month and day
as.Date(substr("1/1/2011 to 31/12/2011", 5, 8), format = "%Y")
as.Date(substr("1/1/2011 to 31/12/2011", 19, 22), format = "%Y")
If the strings vary in length but are always split by "to", you can use strsplit, unlist the result, and then format it as a year:
a <- "1/1/2011 to 31/12/2011"
a2 <- strsplit(a, " to ", fixed = TRUE)  # split into start and end date
a3 <- unlist(a2)
a4 <- as.Date(a3, format = "%d/%m/%Y")
year <- format(a4, format = "%Y")  # character vector, one year per date
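The split-and-parse approach above handles one string. A possible extension to a whole column is sketched below, assuming a character vector named period (a hypothetical name, with values copied from the question) and assuming each entry has exactly one " to " separator. It also checks that the start and end dates fall in the same year, which holds for the data in the question.

```r
# Sample values copied from the question
period <- c("1/1/2011 to 31/12/2011",
            "1/1/2010 to 31/12/2010",
            "1/1/2009 to 31/12/2009")

# Split each entry into its start and end date; the result is a list of pairs
parts <- strsplit(period, " to ", fixed = TRUE)

# Parse both dates of every pair and keep only the year
years <- t(sapply(parts, function(p) {
  as.numeric(format(as.Date(p, format = "%d/%m/%Y"), "%Y"))
}))
colnames(years) <- c("start_year", "end_year")

# Every period in the question covers a single calendar year
stopifnot(all(years[, "start_year"] == years[, "end_year"]))
year <- years[, "start_year"]
year
```

Keeping both columns in years makes it easy to spot a row whose coverage spans more than one year before collapsing to a single value.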