Merging bid and ask data on matching timestamps in historical tick data downloaded from Bloomberg

Time: 2015-07-28 05:16:59

Tags: r vba bloomberg

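Sample files downloaded from Bloomberg are shown below as (a) and (b). The resulting file should look like (c). Please help with some R, Excel, or VBA code. PS: if two rows share the same timestamp, the one with the highest price should be taken, with size as the tie-breaker.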

(a) TCS IN Equity       
01-04-2015 09:00:00 BID 2515    1
01-04-2015 09:00:04 BID 2553.95 133
01-04-2015 09:00:04 BID 2553.95 168
01-04-2015 09:00:06 BID 2515    1
01-04-2015 09:00:14 BID 2520    5
01-04-2015 09:00:24 BID 2525    3

(b) TCS IN Equity
01-04-2015 09:00:00 ASK 2594    5
01-04-2015 09:00:04 ASK 2565    1
01-04-2015 09:00:05 ASK 2594    5
01-04-2015 09:00:14 ASK 2570    10
01-04-2015 09:05:28 ASK 2560    5

(c)
TCS IN Equity       BID BID_SIZ OFR OFR_SIZ
01-04-2015 09:00:00 2515    1   2594    5
01-04-2015 09:00:04 2553.95 168 2565    1
01-04-2015 09:00:14 2520    5   2570    10

1 answer:

Answer 0 (score: 0)

Data

A <- structure(list(date = structure(c(1L, 1L, 1L, 1L, 1L, 1L), 
                           .Label = "01-04-2015", class = "factor"), 
                    time = structure(c(1L, 2L, 2L, 3L, 4L, 5L), 
                           .Label = c("09:00:00", 
                                      "09:00:04", "09:00:06", "09:00:14", 
                                      "09:00:24"), class = "factor"), 
                    type = structure(c(1L, 1L, 1L, 1L, 1L, 1L), 
                           .Label = "BID", class = "factor"), 
                    BID = c(2515, 2553.95, 2553.95, 2515, 2520, 2525),
                    BID_SIZ = c(1L, 133L, 168L, 1L, 5L, 3L)), 
                    .Names = c("date", "time", "type", "BID", "BID_SIZ"),
                    class = "data.frame", row.names = c(NA, -6L))
B <- structure(list(date = structure(c(1L, 1L, 1L, 1L, 1L), 
                           .Label = "01-04-2015", class = "factor"), 
                    time = structure(1:5, 
                           .Label = c("09:00:00", "09:00:04", 
                                      "09:00:05", "09:00:14", "09:05:28"),
                            class = "factor"), 
                    type = structure(c(1L, 1L, 1L, 1L, 1L), 
                           .Label = "ASK", class = "factor"), 
                    OFR = c(2594L, 2565L, 2594L, 2570L, 2560L), 
                    OFR_SIZ = c(5L, 1L, 5L, 10L, 5L)), 
                    .Names = c("date", "time", "type", "OFR", "OFR_SIZ"),
                    class = "data.frame", row.names = c(NA, -5L))

Code

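library(dplyr)
# Keep only the timestamps that appear in both the bid and the ask data.
inner_join(A, B, by = c("date", "time")) %>% 
     group_by(date, time) %>% 
     # Sort ascending so the highest bid (then the largest size) ends up last.
     arrange(BID, BID_SIZ)  %>% 
     # Take the last row of each date/time group.
     # Note: summarise_each() and funs() are deprecated in later dplyr releases.
     summarise_each(funs(last)) %>%
     select(-type.x, -type.y)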
# Source: local data frame [3 x 6]
# Groups: date
# 
#         date     time     BID BID_SIZ  OFR OFR_SIZ
# 1 01-04-2015 09:00:00 2515.00       1 2594       5
# 2 01-04-2015 09:00:04 2553.95     168 2565       1
# 3 01-04-2015 09:00:14 2520.00       5 2570      10

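Explanation

First, inner_join joins the two datasets on date and time, so only timestamps present in both are kept. The rows are then sorted by BID and BID_SIZ, and the last row of each date/time group is selected, which is the one with the highest bid and, among equal bids, the largest size.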
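
For anyone who would rather not depend on dplyr, the same three steps (join, sort, keep the last row per timestamp) can be sketched in base R. This is only a minimal illustration, not the method from the answer above: the names bids, asks, joined, keep and result are made up for this example, the data is retyped from the question as plain character columns rather than factors, and the type column is left out.

```r
# Sample bid and ask ticks retyped from the question (timestamps kept as text).
bids <- data.frame(
  date    = "01-04-2015",
  time    = c("09:00:00", "09:00:04", "09:00:04", "09:00:06", "09:00:14", "09:00:24"),
  BID     = c(2515, 2553.95, 2553.95, 2515, 2520, 2525),
  BID_SIZ = c(1L, 133L, 168L, 1L, 5L, 3L),
  stringsAsFactors = FALSE
)
asks <- data.frame(
  date    = "01-04-2015",
  time    = c("09:00:00", "09:00:04", "09:00:05", "09:00:14", "09:05:28"),
  OFR     = c(2594, 2565, 2594, 2570, 2560),
  OFR_SIZ = c(5L, 1L, 5L, 10L, 5L),
  stringsAsFactors = FALSE
)

# Inner join: merge() keeps only date/time pairs present on both sides.
joined <- merge(bids, asks, by = c("date", "time"))

# Sort so that, within each timestamp, the highest bid
# (and among equal bids the largest size) comes last.
# The date/time strings are only used for grouping here, not for chronology.
joined <- joined[order(joined$date, joined$time, joined$BID, joined$BID_SIZ), ]

# Keep the last row of each date/time group.
keep <- !duplicated(joined[c("date", "time")], fromLast = TRUE)
result <- joined[keep, ]
rownames(result) <- NULL

print(result)
```

Like the dplyr version, this only resolves duplicate timestamps by looking at the bid side. If the ask file could also contain repeated timestamps, a rule for choosing among them would have to be decided separately, since the question does not specify one.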