Question

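I have this df, which contains a column with the combined date and time, a date column, and a time column. It also holds the CH4 observations and a calculated ratio (I have more columns, but they are not relevant to this question).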

'data.frame':   1420847 obs. of  17 variables
$ Start     : Factor w/ 1469 levels "2013-08-31 23:56:09.000",..: 2 2 2 2 2 2 2 2 2 2 ...  
$ CO2       : int  1510 1950 1190 1170 780 870 730 740 680 700 ...
$ CH4       : int  66 77 62 58 34 51 36 43 32 40 ...
$ Ratio     : num  0.0437 0.0395 0.0521 0.0496 0.0436 ...  
$ Start_time: POSIXlt, format: "2013-11-20 00:10:05" "2013-11-20 00:10:05" "2013-11-20 00:10:05" "2013-11-20 00:10:05" ...  
$ Start_date: Date, format: "2013-09-01" "2013-09-01" "2013-09-01" "2013-09-01" ...

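Now I would like to split each day into 6 blocks and give each block a number from 1 to 6. The problem is that I only have the date and time at which a measurement started (Start_date and Start_time, or the combined Start), so I think each new Start_time has to be assigned to a block. The length of the observations varies a lot, so assigning numbers by counting rows is not an option. This is what I hope to achieve: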

                  Start  Start_time    Start_date   CO2 CH4       Ratio  block
2013-09-01 00:10:05.000    00:10:05    2013-09-01  1510  66  0.04370861      1
2013-09-01 00:10:05.000    00:10:05    2013-09-01  1950  77  0.03948718      1
2013-09-01 05:16:55.000    05:16:55    2013-09-01  1190  62  0.05210084      2
2013-09-01 05:16:55.000    05:16:55    2013-09-01  1170  58  0.04957265      2
2013-09-01 05:16:55.000    05:16:55    2013-09-01   780  34  0.04358974      2
2013-09-01 12:44:33.000    12:44:33    2013-09-01   870  51  0.05862069      4
2013-09-01 12:44:33.000    12:44:33    2013-09-01   730  36  0.04931507      4
2013-09-01 22:14:23.000    22:14:23    2013-09-01   740  43  0.05810811      6
2013-09-01 22:14:23.000    22:14:23    2013-09-01   680  32  0.04705882      6
2013-09-02 08:37:05.000    08:37:05    2013-09-02   700  40  0.05714286      3
2013-09-02 08:37:05.000    08:37:05    2013-09-02   610  35  0.05737705      3
2013-09-02 17:22:33.000    17:22:33    2013-09-02   630  25  0.03968254      5
2013-09-02 17:22:33.000    17:22:33    2013-09-02   670  40  0.05970149      5
2013-09-02 23:59:44.000    23:59:44    2013-09-02   640  37  0.05781250      6
2013-09-02 23:59:44.000    23:59:44    2013-09-02   730  35  0.04794521      6

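I have searched this site and tried Google, but so far I have not found an answer. I tried the following code, which I found in an answer on this site, but with no luck.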

qaa <- split(df, cut(strptime(paste(df$Start_date, df$Start_time), format = "%Y-%m-%d %H:%M"),"4 hours"))

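Earlier I had split the observations into groups of a fixed number of minutes, so I tried to adapt that code. To be honest, I have no idea what I am doing (as you can probably tell).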

lst<- split(df, df$Start_date)
nobs <- "4 hours" 
List <- unlist(lapply(lst, function(x) {
  x$grp <- rep(1:(nrow(x)/nobs+1), each = nobs)[1:nrow(x)] 
  split(x, x$grp)}), recursive = FALSE)
b <- as.matrix(do.call("rbind", List))

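Once again, I am a NOOB when it comes to R, so it takes me a lot of time to figure everything out. I know very little about the language, but I am doing my best to make it work. I really enjoy working with it! If a question like this already exists on this site, please let me know so that I can delete this one. I have not found it yet, though.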

Thank you for taking the time to read my question and for considering an answer!

Answer 1

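If you can extract the start hour from the start time (see: Dealing with timestamps in R) into a column such as start_hour, you can assign the correct block number with the following: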

df$block[df$start_hour>=0 & df$start_hour<4]<-1
df$block[df$start_hour>=4 & df$start_hour<8]<-2
df$block[df$start_hour>=8 & df$start_hour<12]<-3
df$block[df$start_hour>=12 & df$start_hour<16]<-4
df$block[df$start_hour>=16 & df$start_hour<20]<-5
df$block[df$start_hour>=20 & df$start_hour<24]<-6
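A minimal sketch of how the `start_hour` column used above might be derived, assuming `Start` holds timestamps formatted like `2013-09-01 00:10:05.000`. The toy data frame is made up for illustration from the question's example rows. Because every block is four hours wide, integer division should give the same numbers as the six assignments:

```r
# Toy data frame standing in for the real df (timestamps taken from the question's example)
df <- data.frame(Start = factor(c("2013-09-01 00:10:05.000",
                                  "2013-09-01 05:16:55.000",
                                  "2013-09-01 12:44:33.000",
                                  "2013-09-02 23:59:44.000")))

# Convert the factor to character first, then parse; the trailing ".000" is ignored
start <- as.POSIXlt(as.character(df$Start),
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# POSIXlt exposes the hour of day (0-23) as its $hour component
df$start_hour <- start$hour

# Hours 0-3 -> block 1, 4-7 -> block 2, ..., 20-23 -> block 6
df$block <- df$start_hour %/% 4 + 1
```

The time zone is set to UTC here only to keep the parsing independent of the local machine; whether that matches the real measurements is an assumption.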

Answer 2

Installing lubridate in particular will help, since it has useful functions such as hour. cut2 from Hmisc then lets you assign your hours to simple brackets.

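library("lubridate")  # hour()
library("Hmisc")      # cut2()
example <- as.factor('2013-09-01 00:10:05.000')
# Pass the format by name: the second positional argument of as.POSIXct() is tz, not format
start <- as.POSIXct(as.character(example), format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
example <- data.frame(example, timeslot = cut2(hour(start), cuts = seq(0, 24, 4)))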
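If plain block numbers 1 to 6 are wanted rather than interval labels, base R's `cut()` can do the same binning without extra packages. This is a sketch that assumes the hours have already been extracted as numbers from 0 to 23; the `hours` vector is illustrative and mirrors the start times in the question's desired output:

```r
# Hours of day for the example start times in the question
hours <- c(0, 5, 12, 22, 8, 17, 23)

# right = FALSE gives the intervals [0,4), [4,8), ..., [20,24);
# labels = FALSE returns integer codes instead of a factor
block <- cut(hours, breaks = seq(0, 24, 4), right = FALSE, labels = FALSE)
```

With `right = FALSE`, an hour of exactly 4 falls into block 2 rather than block 1, which matches the `>=` and `<` comparisons in Answer 1.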

How to assign the start time of an observation to hour blocks in R
