R: apply an operation to each set of rows that share the same date

Time: 2018-11-15 13:13:34

Tags: r

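I have a data set in which each day is repeated several times, so there are several rows for the same date. (By the way, I am using the lubridate package.)

What I want to do is:

Create columns T1 and T2 for the first lowest and the first highest price. T1 would return blank/NA cells, except in the row where the first highest and lowest prices are found. However, and this is where I am stuck, I want it to take the duplicates into account. So it works like a loop: for the first group of duplicated dates, find T1 and T2, then move on to the second group of duplicated dates, and so on...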

newdf4 <- Data %>%
  mutate(T1 = max(which(settle < 120))) %>%
  mutate(T2 = min(which(settle > 120)))

This is what my data looks like:

  Date       Echeance            Settle  Open  Haut   Bas Close Vol_Q Bloc_Q Trades
  <date>     <dttm>               <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <lgl>  <lgl>
1 2002-01-02 2002-01-10 00:00:00    118   125   125  125.  125.    55 NA     NA
2 2002-01-02 2002-03-11 00:00:00    125    NA    NA   NA    NA      0 NA     NA
3 2002-01-02 2002-05-10 00:00:00    128    NA    NA   NA    NA      0 NA     NA
4 2002-01-02 2002-07-10 00:00:00    127    NA    NA   NA    NA      0 NA     NA
5 2002-01-02 2002-09-10 00:00:00    130    NA    NA   NA    NA      0 NA     NA
6 2002-01-02 2002-11-11 00:00:00    180   120   120  120   120      5 NA     NA

Thank you very much.

Edit:

 dput(head(Data))
 structure(list(Date = structure(c(11689, 11689, 11689, 11689, 
 11689, 11689), class = "Date"), Echeance = structure(c(1010620800, 
 1015804800, 1020988800, 1026259200, 1031616000, 1036972800), class =     
 c("POSIXct", "POSIXt"), tzone = "UTC"), Settle = c(118, 125, 128, 127, 
 130, 180), Open = c(125, NA, NA, NA, NA, 120), Haut = c(125, 
 NA, NA, NA, NA, 120), Bas = c(124.75, NA, NA, NA, NA, 120), Close =     
 c(124.75, NA, NA, NA, NA, 120), Vol_Q = c(55, 0, 0, 0, 0, 5), Bloc_Q = c(NA, 
 NA, NA, NA, NA, NA), Trades = c(NA, NA, NA, NA, NA, NA), `Vol_€` =     
 c(343062.5, 
 0, 0, 0, 0, 30000), O.I. = c(908, 3645, 1603, 100, 157, 1210)), row.names =          
 c(NA,-6L), class = c("tbl_df", "tbl", "data.frame"))

1 answer:

Answer 0 (score: 1)

I would do it like this. Adjust the select to include whichever columns you need.

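library(dplyr)

Data %>%
  group_by(Date) %>%   # everything below is computed separately for each date
  mutate(t1_cond = Settle < 120,
         t2_cond = Settle > 120,
         # which.max() gives the position of the first TRUE; the extra
         # `& t1_cond` keeps every row NA when no row meets the condition
         T1 = if_else(row_number() == which.max(t1_cond) & t1_cond, Settle, NA_real_),
         T2 = if_else(row_number() == which.max(t2_cond) & t2_cond, Settle, NA_real_)) %>%
  select(Date, T1, T2)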

# # A tibble: 6 x 3
# # Groups:   Date [1]
#   Date          T1    T2
#   <date>     <dbl> <dbl>
# 1 2002-01-02   118    NA
# 2 2002-01-02    NA   125
# 3 2002-01-02    NA    NA
# 4 2002-01-02    NA    NA
# 5 2002-01-02    NA    NA
# 6 2002-01-02    NA    NA
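
The sample from the question contains a single date, so the output above does not show the per-date behaviour. The sketch below is one way to check it, under some assumptions: `Data2` is a made-up data frame (the first six `Settle` values come from the question, the second day is invented for illustration), and `first_where()` is a hypothetical helper, not part of dplyr, that plays the same role as the `which.max()` expressions above.

```r
library(dplyr)

# Hypothetical two-day sample: day 1 reuses the Settle values from the
# question, day 2 is invented purely for illustration.
Data2 <- data.frame(
  Date   = as.Date(c(rep("2002-01-02", 6), rep("2002-01-03", 4))),
  Settle = c(118, 125, 128, 127, 130, 180, 121, 119, 117, 126)
)

# Hypothetical helper: keep x at the first position where cond is TRUE and
# return NA everywhere else. match(TRUE, cond) is NA when cond is never TRUE.
first_where <- function(x, cond) {
  out <- rep(NA_real_, length(x))
  idx <- match(TRUE, cond)
  if (!is.na(idx)) out[idx] <- x[idx]
  out
}

result <- Data2 %>%
  group_by(Date) %>%   # restart the search for every date
  mutate(T1 = first_where(Settle, Settle < 120),
         T2 = first_where(Settle, Settle > 120)) %>%
  ungroup()

print(result)
```

With this sample, each date should get its own T1 and T2 row (118 and 125 on the first day, 119 and 121 on the second), which is the "loop over groups of duplicates" the question describes. The 120 threshold is taken from the question's attempt; replace it with whatever rule defines the low and high price in your data.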