I have a tick dataset with one row per minute for specific dates. It looks like this:
Date Time Open High Low Close Volume Tick.Count Time2 Date2 date_time
1997-09-10 00:01 0 0 0 0 0 0 00:01 1997/09/10 1997-09-10 00:01:00
1997-09-10 00:02 0 0 0 0 0 0 00:02 1997/09/10 1997-09-10 00:02:00
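For convenience I took only the first rows, which contain no real prices. I want to delete an entire trading day if its total Volume over the whole day is below 100.
Does anyone know how to do this?
Reproducible code (5 rows):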
df <- structure(list(Date = structure(c(10114, 10114, 10114, 10114,
10114), class = "Date"), Time = c("00:01", "00:02", "00:03",
"00:04", "00:05"), Open = c(0, 0, 0, 0, 0), High = c(0, 0, 0,
0, 0), Low = c(0, 0, 0, 0, 0), Close = c(0, 0, 0, 0, 0), Volume = c(0L,
0L, 0L, 0L, 0L), Tick.Count = c(0L, 0L, 0L, 0L, 0L), Time2 = c("00:01",
"00:02", "00:03", "00:04", "00:05"), Date2 = c("1997/09/10",
"1997/09/10", "1997/09/10", "1997/09/10", "1997/09/10"), date_time = structure(list(
sec = c(0, 0, 0, 0, 0), min = 1:5, hour = c(0L, 0L, 0L, 0L,
0L), mday = c(10L, 10L, 10L, 10L, 10L), mon = c(8L, 8L, 8L,
8L, 8L), year = c(97L, 97L, 97L, 97L, 97L), wday = c(3L,
3L, 3L, 3L, 3L), yday = c(252L, 252L, 252L, 252L, 252L),
isdst = c(1L, 1L, 1L, 1L, 1L), zone = c("CEST", "CEST", "CEST",
"CEST", "CEST"), gmtoff = c(NA_integer_, NA_integer_, NA_integer_,
NA_integer_, NA_integer_)), class = c("POSIXlt", "POSIXt"
))), row.names = c(NA, 5L), class = "data.frame")
Thanks in advance. Kind regards, Jürgen
Answer 0 (score: 4)
You can use ave to compute the sum of Volume for each Date, check whether it is >= 100, and use the result to subset df with [.
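The code for this answer is missing from the text above. The following is a minimal sketch of the approach it describes, not necessarily the answerer's exact code. The `toy` data frame and its values are made up for illustration (two trading days, three minutes each), and `na.rm = TRUE` is an assumption borrowed from the other answers.

```r
# Toy data (hypothetical values): two trading days, three minutes each.
toy <- data.frame(
  Date   = as.Date(c("1997-09-10", "1997-09-10", "1997-09-10",
                     "1997-09-11", "1997-09-11", "1997-09-11")),
  Time   = c("00:01", "00:02", "00:03", "00:01", "00:02", "00:03"),
  Volume = c(0L, 0L, 0L, 40L, 50L, 30L)
)

# ave() returns a vector as long as its input: every row receives the
# total Volume of the trading day it belongs to.
daily_volume <- ave(toy$Volume, as.character(toy$Date),
                    FUN = function(v) sum(v, na.rm = TRUE))

# Keep only the rows whose trading day reaches a total Volume of at least 100.
kept <- toy[daily_volume >= 100, ]
```

Applied to the question's data, the same pattern would be `df[ave(df$Volume, df$Date, FUN = sum) >= 100, ]`. With the five sample rows shown in the question, every row would be removed because the day's total Volume is 0.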
Answer 1 (score: 0)
Here are dplyr and data.table alternatives:
#1. dplyr
library(dplyr)
df %>% group_by(Date) %>% filter(sum(Volume, na.rm = TRUE) >= 100) %>% ungroup
#2. data.table
library(data.table)
setDT(df)[, .SD[sum(Volume, na.rm = TRUE) >= 100], Date]
Answer 2 (score: 0)
Using dplyr:
library(dplyr)
df %>%
group_by(Date) %>%
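# which() on a length-1 logical would keep only the first row of each day; return all row indices instead
slice(if (sum(Volume, na.rm = TRUE) >= 100) seq_len(n()) else integer(0)) %>%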
ungroup
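Before deleting anything, it can help to look at the per-day totals and see which dates would be dropped. A small base R sketch, assuming the same column names as in the question. The `toy` data frame and its values are hypothetical.

```r
# Toy data (hypothetical values): two trading days, three minutes each.
toy <- data.frame(
  Date   = as.Date(c("1997-09-10", "1997-09-10", "1997-09-10",
                     "1997-09-11", "1997-09-11", "1997-09-11")),
  Volume = c(0L, 0L, 0L, 40L, 50L, 30L)
)

# One total per trading day; the names of the result are the dates as text.
daily_total <- tapply(toy$Volume, as.character(toy$Date), sum, na.rm = TRUE)

# Dates whose total Volume is below the threshold and would be removed.
dropped_days <- names(daily_total)[daily_total < 100]
```

Here `dropped_days` lists the dates that any of the approaches above should remove, which gives a quick way to cross-check the filtered result.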