I have a file named example.csv that contains the following data:
day,number,price,pr
2010-01-01 00:01:00,1,0.4,2
2010-01-01 00:02:00,1,1.2,4
2010-01-01 00:03:00,1,2.5,6
2010-01-01 00:04:00,1,9.1,2
2010-01-01 00:05:00,2,3.4,7
2010-01-01 00:06:00,2,6.9,9
2010-01-01 00:07:00,2,8.9,2
2010-01-01 00:08:00,3,9.1,5
2010-01-01 00:09:00,3,4.2,9
2010-01-01 00:10:00,3,11.2,2
2010-01-01 00:11:00,4,53.12,4
2010-01-01 00:12:00,4,45.21,1
2010-01-01 00:12:00,4,1.1,5
2010-01-01 00:13:00,4,3.43,2
2010-01-01 00:14:00,4,21.42,4
Loading the data:
example = read.csv(file="path/example.csv", header=TRUE, sep=",")
Building an xts object ordered by the day column:
ddx <- xts(x = example[, c("number", "price", "pr" )], order.by = as.POSIXct(example[, "day"], tz = "GMT", format = "%Y-%m-%d %H:%M:%S"))
Applying period.apply to it, the output I get has the day and price columns:
period.apply(ddx$number, endpoints(ddx, on = "minutes", k = 3), sum)
Answer 0 (score: 1)
The way you create the xts object is rather complicated. Try the following.
txt <- 'day,number,price
2010-01-01 00:01:00,1,0.4
2010-01-01 00:02:00,1,1.2
2010-01-01 00:03:00,1,2.5
2010-01-01 00:04:00,2,9.1
2010-01-01 00:05:00,2,3.4
2010-01-01 00:06:00,2,6.9
2010-01-01 00:07:00,3,8.9
2010-01-01 00:08:00,3,9.1
2010-01-01 00:09:00,3,4.2
2010-01-01 00:10:00,4,11.2
2010-01-01 00:11:00,4,53.12
2010-01-01 00:12:00,4,45.21
2010-01-01 00:12:00,4,1.1
2010-01-01 00:13:00,4,3.43
2010-01-01 00:14:00,4,21.42'
DD <- read.csv(text = txt, stringsAsFactors = FALSE)
# DD is already a data frame
DD
## day number price
## 1 2010-01-01 00:01:00 1 0.40
## 2 2010-01-01 00:02:00 1 1.20
## 3 2010-01-01 00:03:00 1 2.50
## 4 2010-01-01 00:04:00 2 9.10
## 5 2010-01-01 00:05:00 2 3.40
## 6 2010-01-01 00:06:00 2 6.90
## 7 2010-01-01 00:07:00 3 8.90
## 8 2010-01-01 00:08:00 3 9.10
## 9 2010-01-01 00:09:00 3 4.20
## 10 2010-01-01 00:10:00 4 11.20
## 11 2010-01-01 00:11:00 4 53.12
## 12 2010-01-01 00:12:00 4 45.21
## 13 2010-01-01 00:12:00 4 1.10
## 14 2010-01-01 00:13:00 4 3.43
## 15 2010-01-01 00:14:00 4 21.42
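library(xts)  # provides xts(), endpoints() and period.apply()
# Use the parsed day column as the time index; keep number and price as data
ddx <- xts(x = DD[, c("number", "price")], order.by = as.POSIXct(DD[, "day"], tz = "GMT", format = "%Y-%m-%d %H:%M:%S"))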
ddx
## number price
## 2010-01-01 00:01:00 1 0.40
## 2010-01-01 00:02:00 1 1.20
## 2010-01-01 00:03:00 1 2.50
## 2010-01-01 00:04:00 2 9.10
## 2010-01-01 00:05:00 2 3.40
## 2010-01-01 00:06:00 2 6.90
## 2010-01-01 00:07:00 3 8.90
## 2010-01-01 00:08:00 3 9.10
## 2010-01-01 00:09:00 3 4.20
## 2010-01-01 00:10:00 4 11.20
## 2010-01-01 00:11:00 4 53.12
## 2010-01-01 00:12:00 4 45.21
## 2010-01-01 00:12:00 4 1.10
## 2010-01-01 00:13:00 4 3.43
## 2010-01-01 00:14:00 4 21.42
To use period.apply on the number column, just pass ddx$number instead of ddx:
period.apply(ddx$number, endpoints(ddx, on = "minutes", k = 3), sum)
## number
## 2010-01-01 00:02:00 2
## 2010-01-01 00:05:00 5
## 2010-01-01 00:08:00 8
## 2010-01-01 00:11:00 11
## 2010-01-01 00:14:00 16
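To see where those five rows come from, it can help to look at what endpoints() returns on its own. The following is only a sketch: it rebuilds the same 15-row sample programmatically (the names minutes, idx, ep, num_sum and price_mean are mine, not from the question) and then reuses the same endpoints for a different column and function.

```r
library(xts)

# Rebuild the sample series from above: minutes 1..14, with 00:12 duplicated
minutes <- c(1:12, 12:14)
idx <- as.POSIXct("2010-01-01 00:00:00", tz = "GMT") + 60 * minutes
ddx <- xts(cbind(number = rep(1:4, c(3, 3, 3, 6)),
                 price  = c(0.4, 1.2, 2.5, 9.1, 3.4, 6.9, 8.9, 9.1, 4.2,
                            11.2, 53.12, 45.21, 1.1, 3.43, 21.42)),
           order.by = idx)

# endpoints() gives the row positions that close each 3-minute bucket;
# the vector starts with 0 and ends with nrow(ddx)
ep <- endpoints(ddx, on = "minutes", k = 3)
print(ep)

# Sum of the number column within each bucket (same call as above)
num_sum <- period.apply(ddx$number, ep, sum)
print(num_sum)

# The same endpoints can be reused with another column and function,
# here the mean price per bucket
price_mean <- period.apply(ddx$price, ep, mean)
print(price_mean)
```

Each result row is stamped with the last timestamp in its bucket, which is why the first row is 00:02 and holds only two observations.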