R: cut a data frame by factors

Date: 2016-03-12 22:12:00

Tags: r cut

Let's say I have this:

+-------+-----+------+
| Month | Day | Hour |
+-------+-----+------+
|     1 |   1 |    1 |
|     1 |   1 |    2 |
|     1 |   1 |    3 |
|     1 |   1 |    4 |
|     1 |   2 |    1 |
|     1 |   2 |    2 |
|     1 |   2 |    3 |
|     1 |   2 |    4 |
|     2 |   1 |    1 |
|     2 |   1 |    2 |
|     2 |   1 |    3 |
|     2 |   1 |    4 |
+-------+-----+------+

I would like to use cut, grouped by the Month and Day factors, to get this:

+-------+-----+------+-------+
| Month | Day | Hour | Block |
+-------+-----+------+-------+
|     1 |   1 |    1 | [1,2] |
|     1 |   1 |    2 | [1,2] |
|     1 |   1 |    3 | [3,4] |
|     1 |   1 |    4 | [3,4] |
|     1 |   2 |    1 | [1,2] |
|     1 |   2 |    2 | [1,2] |
|     1 |   2 |    3 | [3,4] |
|     1 |   2 |    4 | [3,4] |
|     2 |   1 |    1 | [1,2] |
|     2 |   1 |    2 | [1,2] |
|     2 |   1 |    3 | [3,4] |
|     2 |   1 |    4 | [3,4] |
+-------+-----+------+-------+

I think using by or tapply might be one way, but I can't figure out how.

1 answer:

Answer 0 (score: 0)

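We can use cut to bin the hours of each day into two-hour blocks, then replace the opening parenthesis in each label with a square bracket. Note that cut names the second bin (2,4], so after the replacement it prints as [2,4] rather than the requested [3,4]; for whole-number hours it still contains exactly hours 3 and 4.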

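# Bin Hour at the break points 1, 2, 4, ..., 24; include.lowest=TRUE closes the first bin as [1,2]
df1$Block <- cut(df1$Hour, c(1,seq(2,24, by=2)), include.lowest=TRUE)
# Turn labels such as "(2,4]" into "[2,4]" (fixed=TRUE treats "(" literally, not as a regex)
df1$Block <- sub("(", "[", df1$Block, fixed=TRUE)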
df1
#    Month Day Hour Block
# 1      1   1    1 [1,2]
# 2      1   1    2 [1,2]
# 3      1   1    3 [2,4]
# 4      1   1    4 [2,4]
# 5      1   2    1 [1,2]
# 6      1   2    2 [1,2]
# 7      1   2    3 [2,4]
# 8      1   2    4 [2,4]
# 9      2   1    1 [1,2]
# 10     2   1    2 [1,2]
# 11     2   1    3 [2,4]
# 12     2   1    4 [2,4]
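
If the labels should read exactly [1,2], [3,4], ... as in the question, one option is to pass explicit labels to cut instead of editing the default ones. The following is only a sketch, assuming Hour holds whole numbers from 1 to 24; the names breaks and block_labels are made up for illustration, and df1 is rebuilt from the question's table so the snippet runs on its own. Because the bins depend only on Hour, no grouping by Month and Day (by, tapply) should be needed.

```r
# Rebuild the example data from the question (df1 is the name used in the answer).
df1 <- data.frame(
  Month = rep(c(1L, 1L, 2L), each = 4),
  Day   = rep(c(1L, 2L, 1L), each = 4),
  Hour  = rep(1:4, times = 3)
)

# Break points 0, 2, 4, ..., 24 give the right-closed bins (0,2], (2,4], ...
# Assumption: hours are whole numbers in 1..24 (an Hour of 0 would become NA).
breaks <- seq(0L, 24L, by = 2L)

# Hypothetical labels: name each bin by the whole-number hours it contains,
# e.g. (0,2] becomes "[1,2]" and (2,4] becomes "[3,4]".
block_labels <- sprintf("[%d,%d]", breaks[-length(breaks)] + 1L, breaks[-1])

# cut() needs one label per bin: 13 break points produce 12 bins.
df1$Block <- cut(df1$Hour, breaks = breaks, labels = block_labels)
df1
```

Block is a factor here; wrap the call in as.character() if a plain character column is preferred.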