I have a time series at 10-minute intervals. I want the sub-series that fall between 23:10:00 and 00:00:00. Here is the dput of the data:
df<-structure(c(994, 1019, 1381, 843, 1105, 1120, 869, 2216, 1741,
1737, 1727, 1462, 1564, 418, 281, 280, 277, 311, 242, 221, 328,
359, 410, 436, 359, 1738, 2075, 1766, 1812, 1810, 1246, 323,
250, 272, 283, 286, 252, 1671, 1695, 1687, 1646, 1257, 1632,
277, 305, 292, 261, 309, 304, 209, 210, 225, 201, 197, 247, 264,
238, 260, 254, 263, 226, 624, 1955, 1561, 1231, 976, 1213, 167,
1037, 1269, 1619, 1749, 1674, 1123, 1695, 2164, 1780, 1732, 1715,
283, 230, 291, 281, 137, 1358, 1630, 1626, 1889, 1635, 1591,
1606, 2024, 1783, 1752, 613, 301, 933, 1823, 1831, 1810, 1895,
1876, 1222, 1952, 1288, 282, 261, 296, 839, 1831, 1799, 1950,
2085, 1921, 1862, 1885, 1869, 1909, 1896, 1843), .Dim = c(120L,
1L), .Dimnames = list(NULL, "value"), index = structure(c(1430764200,
1430847600, 1430848200, 1430848800, 1430849400, 1430850000, 1430850600,
1430934000, 1430934600, 1430935200, 1430935800, 1430936400, 1430937000,
1431020400, 1431021000, 1431021600, 1431022200, 1431022800, 1431023400,
1431106800, 1431107400, 1431108000, 1431108600, 1431109200, 1431109800,
1431193200, 1431193800, 1431194400, 1431195000, 1431195600, 1431196200,
1431279600, 1431280200, 1431280800, 1431281400, 1431282000, 1431282600,
1431366000, 1431366600, 1431367200, 1431367800, 1431368400, 1431369000,
1431452400, 1431453000, 1431453600, 1431454200, 1431454800, 1431455400,
1431538800, 1431539400, 1431540000, 1431540600, 1431541200, 1431541800,
1431625200, 1431625800, 1431626400, 1431627000, 1431627600, 1431628200,
1431711600, 1431712200, 1431712800, 1431713400, 1431714000, 1431714600,
1431798000, 1431798600, 1431799200, 1431799800, 1431800400, 1431801000,
1431884400, 1431885000, 1431885600, 1431886200, 1431886800, 1431887400,
1431970800, 1431971400, 1431972000, 1431972600, 1431973200, 1431973800,
1432057200, 1432057800, 1432058400, 1432059000, 1432059600, 1432060200,
1432143600, 1432144200, 1432144800, 1432145400, 1432146000, 1432146600,
1432230000, 1432230600, 1432231200, 1432231800, 1432232400, 1432233000,
1432316400, 1432317000, 1432317600, 1432318200, 1432318800, 1432319400,
1432402800, 1432403400, 1432404000, 1432404600, 1432405200, 1432405800,
1432489200, 1432489800, 1432490400, 1432491000, 1432491600), tclass = c("POSIXct",
"POSIXt"), tzone = "Asia/Kolkata"), .indexCLASS = c("POSIXct",
"POSIXt"), .indexTZ = "Asia/Kolkata", tclass = c("POSIXct", "POSIXt"
), tzone = "Asia/Kolkata", class = c("xts", "zoo"))
Is there an existing function that can do this? I tried split.xts, but could not get the output into the required form.
Answer 0 (score: 1)
You can use unstack with base R only, or dplyr and tidyr with chained expressions. The code would look like:
# base R version
library(xts)
df2 <- unstack(data.frame(value = coredata(df), time = format(index(df), "%H:%M")),
               value ~ time)[, c(2:6, 1)]
# version using chained expressions with dplyr and tidyr
library(xts)
library(dplyr)
library(tidyr)
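df3 <- df %>% fortify.zoo() %>%
  mutate(time = format(Index, "%H:%M"), Index = format(Index, "%Y-%m-%d")) %>%
  spread(key = time, value = value) %>%
  # after spread the columns are Index, 00:00, 23:10, ..., 23:50;
  # 3:7 keeps all five 23:xx columns, then 2 puts 00:00 last
  select(c(3:7, 2))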
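Base R's unstack and tidyr's spread both take data with two columns containing key-value pairs and arrange them as a separate column of values for each unique key.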
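To make the key-value reshaping concrete, here is a minimal base R sketch on made-up data. The data frame long and its numbers are hypothetical and are not taken from the question's series; the sketch only shows how unstack turns one value column plus one key column into one column per key.

```r
# Hypothetical long-format data: two days, three time slots each.
# The numbers are invented for illustration only.
long <- data.frame(
  value = c(10, 11, 12, 20, 21, 22),
  time  = c("23:10", "23:20", "23:30", "23:10", "23:20", "23:30")
)

# value ~ time: split 'value' by each unique 'time' and bind the pieces
# side by side, one column per key, in sorted key order.
wide <- unstack(long, value ~ time)

# Two rows (one per day) and three columns (one per time slot).
# Note: the column names may be sanitized by data.frame(), so it is
# safer to address the columns by position here.
print(dim(wide))
print(wide[[1]])  # all values recorded at the first time slot
```

Because the columns come out in sorted key order, a midnight slot such as 00:00 would sort first, which is why the answer reorders the columns by index afterwards.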