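How do I compute the maximum over every 5 rows of a data frame?

Time: 2018-05-30 20:53:11

Tags: arrays r dataframe vector statistics

I have been trying to work out how to compute the maximum of a data frame column over the previous 5 rows and store the result in a new column.

What I have so far: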

a.1$StoHH <- 0
for (i in nrow(a.1)) {
    a.1$StoHH[i] <- max(a.1$H[i:StocRan+i])
}

This gives the output:

                O        H        L        C    StoHH
2007-01-01 9.236891 9.243043 9.157491 9.253262 0.000000
2007-01-02 9.243480 9.242502 9.105448 9.169322 0.000000
2007-01-03 9.169322 9.231852 9.071093 9.179970 0.000000
2007-01-04 9.184184 9.292451 9.153519 9.291148 0.000000
2007-01-05 9.287510 9.431302 9.218613 9.398015 9.431302
2007-01-08 9.411260 9.453809 9.330134 9.397029 9.453809
2007-01-09 9.406297 9.521521 9.296135 9.466344 9.521521
2007-01-10 9.459888 9.558271 9.432981 9.432194 9.558271
2007-01-11 9.425640 9.419360 9.364269 9.391869 9.419360
2007-01-12 9.397367 9.379746 9.270332 9.348296 0.000000
2007-01-15 9.326662 9.321749 9.244863 9.307797 0.000000
2007-01-16 9.307486 9.356671 9.292282 9.329578 0.000000
2007-01-17 9.329319 9.387363 9.213934 9.267339 0.000000
2007-01-18 9.266175 9.243464 9.194254 9.275096 0.000000
2007-01-19 9.275817 9.261676 9.212665 9.238850 0.000000
2007-01-22 9.269539 9.245211 9.130079 9.240870 0.000000
2007-01-23 9.240870 9.234062 9.186320 9.241538 0.000000
2007-01-24 9.242912 9.291265 9.198295 9.228057 0.000000
2007-01-25 9.227160 9.399910 9.184509 9.388069 0.000000
2007-01-26 9.388518 9.422363 9.298804 9.409148 0.000000

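As you can see, the code neither takes the 'max' nor iterates along the vector correctly. I also tried seq_along, with similar results.

Any help is much appreciated.

2 answers:

Answer 0: (score: 1)

You can use group_by from dplyr to create a group_id for each block of 5 rows and compute the maximum within each group: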

library(tidyverse)

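df %>%
  rownames_to_column("Date") %>%                          # keep the dates, which are stored as row names
  group_by(group_id = (seq_len(nrow(df)) - 1) %/% 5) %>%  # rows 1-5 -> 0, rows 6-10 -> 1, ...
  mutate(StoHH = max(H))                                  # maximum of H within each block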

Result:

# A tibble: 20 x 7
# Groups:   group_id [4]
         Date        O        H        L        C group_id    StoHH
        <chr>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
 1 2007-01-01 9.236891 9.243043 9.157491 9.253262        0 9.431302
 2 2007-01-02 9.243480 9.242502 9.105448 9.169322        0 9.431302
 3 2007-01-03 9.169322 9.231852 9.071093 9.179970        0 9.431302
 4 2007-01-04 9.184184 9.292451 9.153519 9.291148        0 9.431302
 5 2007-01-05 9.287510 9.431302 9.218613 9.398015        0 9.431302
 6 2007-01-08 9.411260 9.453809 9.330134 9.397029        1 9.558271
 7 2007-01-09 9.406297 9.521521 9.296135 9.466344        1 9.558271
 8 2007-01-10 9.459888 9.558271 9.432981 9.432194        1 9.558271
 9 2007-01-11 9.425640 9.419360 9.364269 9.391869        1 9.558271
10 2007-01-12 9.397367 9.379746 9.270332 9.348296        1 9.558271
11 2007-01-15 9.326662 9.321749 9.244863 9.307797        2 9.387363
12 2007-01-16 9.307486 9.356671 9.292282 9.329578        2 9.387363
13 2007-01-17 9.329319 9.387363 9.213934 9.267339        2 9.387363
14 2007-01-18 9.266175 9.243464 9.194254 9.275096        2 9.387363
15 2007-01-19 9.275817 9.261676 9.212665 9.238850        2 9.387363
16 2007-01-22 9.269539 9.245211 9.130079 9.240870        3 9.422363
17 2007-01-23 9.240870 9.234062 9.186320 9.241538        3 9.422363
18 2007-01-24 9.242912 9.291265 9.198295 9.228057        3 9.422363
19 2007-01-25 9.227160 9.399910 9.184509 9.388069        3 9.422363
20 2007-01-26 9.388518 9.422363 9.298804 9.409148        3 9.422363
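
The same block-wise maximum can also be sketched in base R without dplyr. This is only an illustration, not part of the original answer: it uses the first ten H values from the question's data, and the names H, block_id and block_max are chosen here for the example.

```r
# First ten values of the H column from the question's data.
H <- c(9.243043, 9.242502, 9.231852, 9.292451, 9.431302,
       9.453809, 9.521521, 9.558271, 9.419360, 9.379746)

# Block id: rows 1-5 -> 0, rows 6-10 -> 1.
block_id <- (seq_along(H) - 1) %/% 5

# ave() applies FUN within each block and returns a vector as long as H.
block_max <- ave(H, block_id, FUN = max)

print(data.frame(H, block_id, block_max))
```

Rows 1 to 5 receive 9.431302 and rows 6 to 10 receive 9.558271, matching the StoHH column in the result above.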

If you are instead looking for a rolling maximum with a window of width 5, you can use rollapply from zoo:

library(zoo)

df$StoHH = rollapply(df$H, width = 5, max, fill = 0, align = "right")

Result:

                  O        H        L        C    StoHH
2007-01-01 9.236891 9.243043 9.157491 9.253262 0.000000
2007-01-02 9.243480 9.242502 9.105448 9.169322 0.000000
2007-01-03 9.169322 9.231852 9.071093 9.179970 0.000000
2007-01-04 9.184184 9.292451 9.153519 9.291148 0.000000
2007-01-05 9.287510 9.431302 9.218613 9.398015 9.431302
2007-01-08 9.411260 9.453809 9.330134 9.397029 9.453809
2007-01-09 9.406297 9.521521 9.296135 9.466344 9.521521
2007-01-10 9.459888 9.558271 9.432981 9.432194 9.558271
2007-01-11 9.425640 9.419360 9.364269 9.391869 9.558271
2007-01-12 9.397367 9.379746 9.270332 9.348296 9.558271
2007-01-15 9.326662 9.321749 9.244863 9.307797 9.558271
2007-01-16 9.307486 9.356671 9.292282 9.329578 9.558271
2007-01-17 9.329319 9.387363 9.213934 9.267339 9.419360
2007-01-18 9.266175 9.243464 9.194254 9.275096 9.387363
2007-01-19 9.275817 9.261676 9.212665 9.238850 9.387363
2007-01-22 9.269539 9.245211 9.130079 9.240870 9.387363
2007-01-23 9.240870 9.234062 9.186320 9.241538 9.387363
2007-01-24 9.242912 9.291265 9.198295 9.228057 9.291265
2007-01-25 9.227160 9.399910 9.184509 9.388069 9.399910
2007-01-26 9.388518 9.422363 9.298804 9.409148 9.422363
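
The rolling result above can also be reproduced in base R with a corrected version of the loop from the question. Two things go wrong in the original loop: `for (i in nrow(a.1))` visits only the last row, and `i:StocRan+i` parses as `(i:StocRan) + i` because `:` binds more tightly than `+`. The sketch below assumes StocRan is the window width (5) and uses the first ten H values; it is one possible fix, not necessarily what the asker intended.

```r
# First ten values of the H column from the question's data.
H <- c(9.243043, 9.242502, 9.231852, 9.292451, 9.431302,
       9.453809, 9.521521, 9.558271, 9.419360, 9.379746)

StocRan <- 5                 # assumed window width
StoHH <- rep(0, length(H))   # stays 0 where no full window is available yet

# seq_along(H) visits every row; for (i in nrow(x)) would visit only the last one.
for (i in seq_along(H)) {
  if (i >= StocRan) {
    # Explicit parentheses: the window is the current row and the 4 before it.
    StoHH[i] <- max(H[(i - StocRan + 1):i])
  }
}

print(StoHH)
```

The first four entries stay 0, and from row 5 onward each entry is the maximum of the current row and the four rows before it.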

Data:

df = structure(list(O = c(9.236891, 9.24348, 9.169322, 9.184184, 9.28751, 
9.41126, 9.406297, 9.459888, 9.42564, 9.397367, 9.326662, 9.307486, 
9.329319, 9.266175, 9.275817, 9.269539, 9.24087, 9.242912, 9.22716, 
9.388518), H = c(9.243043, 9.242502, 9.231852, 9.292451, 9.431302, 
9.453809, 9.521521, 9.558271, 9.41936, 9.379746, 9.321749, 9.356671, 
9.387363, 9.243464, 9.261676, 9.245211, 9.234062, 9.291265, 9.39991, 
9.422363), L = c(9.157491, 9.105448, 9.071093, 9.153519, 9.218613, 
9.330134, 9.296135, 9.432981, 9.364269, 9.270332, 9.244863, 9.292282, 
9.213934, 9.194254, 9.212665, 9.130079, 9.18632, 9.198295, 9.184509, 
9.298804), C = c(9.253262, 9.169322, 9.17997, 9.291148, 9.398015, 
9.397029, 9.466344, 9.432194, 9.391869, 9.348296, 9.307797, 9.329578, 
9.267339, 9.275096, 9.23885, 9.24087, 9.241538, 9.228057, 9.388069, 
9.409148)), class = "data.frame", row.names = c("2007-01-01", 
"2007-01-02", "2007-01-03", "2007-01-04", "2007-01-05", "2007-01-08", 
"2007-01-09", "2007-01-10", "2007-01-11", "2007-01-12", "2007-01-15", 
"2007-01-16", "2007-01-17", "2007-01-18", "2007-01-19", "2007-01-22", 
"2007-01-23", "2007-01-24", "2007-01-25", "2007-01-26"), .Names = c("O", 
"H", "L", "C"))

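Answer 1: (score: 1)

I assume the OP is looking for a rolling maximum over all columns of the data frame. dplyr::mutate_all can be combined with zoo::rollapply to compute the rolling maximum over 5 rows (including the current row). The solution is: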

library(dplyr)
library(zoo)

df %>% mutate_all(funs(max = rollapply(.,5, max, align="right", partial = TRUE)))

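Or, following @r2evans's suggestion, use zoo::rollmax directly. Note that rollmax takes no function argument and, as far as I can tell, has no partial option, so fill = NA is needed to keep the column length and the first four rows come out as NA; the result shown below is from the rollapply version:

df %>% mutate_all(funs(max = rollmax(., k = 5, fill = NA, align = "right")))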

Result:

#           O        H        L        C    O_max    H_max    L_max    C_max
# 1  9.236891 9.243043 9.157491 9.253262 9.236891 9.243043 9.157491 9.253262
# 2  9.243480 9.242502 9.105448 9.169322 9.243480 9.243043 9.157491 9.253262
# 3  9.169322 9.231852 9.071093 9.179970 9.243480 9.243043 9.157491 9.253262
# 4  9.184184 9.292451 9.153519 9.291148 9.243480 9.292451 9.157491 9.291148
# 5  9.287510 9.431302 9.218613 9.398015 9.287510 9.431302 9.218613 9.398015
# 6  9.411260 9.453809 9.330134 9.397029 9.411260 9.453809 9.330134 9.398015
# 7  9.406297 9.521521 9.296135 9.466344 9.411260 9.521521 9.330134 9.466344
# 8  9.459888 9.558271 9.432981 9.432194 9.459888 9.558271 9.432981 9.466344
# 9  9.425640 9.419360 9.364269 9.391869 9.459888 9.558271 9.432981 9.466344
# 10 9.397367 9.379746 9.270332 9.348296 9.459888 9.558271 9.432981 9.466344
# 11 9.326662 9.321749 9.244863 9.307797 9.459888 9.558271 9.432981 9.466344
# 12 9.307486 9.356671 9.292282 9.329578 9.459888 9.558271 9.432981 9.432194
# 13 9.329319 9.387363 9.213934 9.267339 9.425640 9.419360 9.364269 9.391869
# 14 9.266175 9.243464 9.194254 9.275096 9.397367 9.387363 9.292282 9.348296
# 15 9.275817 9.261676 9.212665 9.238850 9.329319 9.387363 9.292282 9.329578
# 16 9.269539 9.245211 9.130079 9.240870 9.329319 9.387363 9.292282 9.329578
# 17 9.240870 9.234062 9.186320 9.241538 9.329319 9.387363 9.213934 9.275096
# 18 9.242912 9.291265 9.198295 9.228057 9.275817 9.291265 9.212665 9.275096
# 19 9.227160 9.399910 9.184509 9.388069 9.275817 9.399910 9.212665 9.388069
# 20 9.388518 9.422363 9.298804 9.409148 9.388518 9.422363 9.298804 9.409148
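
In dplyr 1.0.0 and later, funs() and mutate_all() are superseded by across(). A minimal sketch of the equivalent call, assuming a recent dplyr and using only the first six rows of the O and H columns (df_small and res are names made up for this example):

```r
library(dplyr)
library(zoo)

# First six rows of the O and H columns from the question's data.
df_small <- data.frame(
  O = c(9.236891, 9.243480, 9.169322, 9.184184, 9.287510, 9.411260),
  H = c(9.243043, 9.242502, 9.231852, 9.292451, 9.431302, 9.453809)
)

res <- df_small %>%
  mutate(across(
    everything(),
    # Right-aligned window of 5 rows; partial = TRUE uses shorter windows at the start.
    ~ rollapply(.x, width = 5, FUN = max, align = "right", partial = TRUE),
    .names = "{.col}_max"
  ))

print(res)
```

The O_max and H_max columns agree with the first six rows of the result above.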