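I want to merge two data frames that contain time series for individual stocks, where each column holds the information for one stock. Data frame 1 holds stock prices and data frame 2 holds P/E ratios. My goal is to prepare a data frame that I can use with the backtest package, which needs a data frame in this format: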
library('backtest')
data(starmine)
Its structure is as follows:
date PRICE symbol
date1 4.2 AAPL
date1 6.3 MSFT
date1 2.2 GE
date2 4.1 AAPL
date2 6.3 MSFT
date2 2.5 GE
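So the dataset is grouped by month. My data consists of several data frames, each containing one variable of interest (e.g. price, P/E ratio, etc.) for all stocks and all dates. An example: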
dates <- seq(as.Date("1995/1/1"), by = "month", length.out = 10)
a = sample(0:1,10,rep=TRUE)
b = sample(0:1,10,rep=TRUE)
c = sample(0:1,10,rep=TRUE)
prices = data.frame(dates,a,b,c)
a = sample(0:1,10,rep=TRUE)
b = sample(0:1,10,rep=TRUE)
c = sample(0:1,10,rep=TRUE)
pe = data.frame(dates,a,b,c)
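Can anyone tell me how to merge df1 and df2 (prices and pe in the example above) to get the same structure as starmine? I was thinking of something like this: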
> total <- merge(df1,df2,by=colnames)
Error in as.vector(x, mode) :
cannot coerce type 'closure' to vector of type 'any'
This is the structure I would like to obtain:
date price pe symbol
1995/1/1 4.2 0.5 a
1995/1/1 6.3 0.4 b
1995/1/1 2.2 0.3 c
1995/2/1 4.1 0.4 a
1995/2/1 6.3 0.2 b
1995/2/1 2.5 0.1 c
1995/3/1 4.2 0.5 a
1995/3/1 6.3 0.4 b
1995/3/1 2.2 0.3 c
1995/4/1 4.1 0.4 a
1995/4/1 6.3 0.2 b
1995/4/1 2.5 0.1 c
Answer 0 (score: 1)
# example data
dates <- seq(as.Date("1995/1/1"), by = "month", length.out = 10)
a = sample(0:1,10,rep=TRUE)
b = sample(0:1,10,rep=TRUE)
c = sample(0:1,10,rep=TRUE)
prices = data.frame(dates,a,b,c)
a = sample(0:1,10,rep=TRUE)
b = sample(0:1,10,rep=TRUE)
c = sample(0:1,10,rep=TRUE)
pe = data.frame(dates,a,b,c)
library(dplyr)
library(tidyr)
# add dataset name as a column
prices$name = "price"
pe$name = "pe"
tbl_df(rbind(prices, pe)) %>%
gather(symbol, value, -dates, -name) %>%
spread(name, value)
# # A tibble: 30 x 4
# dates symbol pe price
# * <date> <chr> <int> <int>
# 1 1995-01-01 a 1 0
# 2 1995-01-01 b 0 1
# 3 1995-01-01 c 0 0
# 4 1995-02-01 a 0 0
# 5 1995-02-01 b 0 1
# 6 1995-02-01 c 0 1
# 7 1995-03-01 a 0 0
# 8 1995-03-01 b 1 0
# 9 1995-03-01 c 0 0
# 10 1995-04-01 a 0 1
# # ... with 20 more rows
I only used tbl_df(rbind(prices, pe)) for display purposes. You don't really need tbl_df(), so you can use rbind(prices, pe) instead.
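A note on the error in the question: `colnames` without parentheses is the function itself, which is why `merge()` complains about a 'closure'. Even with `by = "dates"`, `merge()` would return a wide data frame (with columns such as `a.x` and `a.y`) rather than the long layout the question asks for. On tidyr 1.0.0 or later, where `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`, the same reshape might be sketched as below. The `set.seed()` call and the names `prices_long`, `pe_long`, and `total` are assumptions added for illustration.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed so the example data is reproducible

# Example data in the same wide layout as the question
dates  <- seq(as.Date("1995/1/1"), by = "month", length.out = 10)
prices <- data.frame(dates,
                     a = sample(0:1, 10, replace = TRUE),
                     b = sample(0:1, 10, replace = TRUE),
                     c = sample(0:1, 10, replace = TRUE))
pe     <- data.frame(dates,
                     a = sample(0:1, 10, replace = TRUE),
                     b = sample(0:1, 10, replace = TRUE),
                     c = sample(0:1, 10, replace = TRUE))

# Reshape each wide data frame to long form: one row per date and symbol
prices_long <- pivot_longer(prices, cols = -dates,
                            names_to = "symbol", values_to = "price")
pe_long     <- pivot_longer(pe, cols = -dates,
                            names_to = "symbol", values_to = "pe")

# Join on the shared keys, then order by date and symbol
total <- inner_join(prices_long, pe_long, by = c("dates", "symbol")) %>%
  arrange(dates, symbol)

total
```

Here `total` has one row per date and symbol, with the columns `dates`, `symbol`, `price`, and `pe`. If the downstream function expects the column to be called `date`, it can be renamed with `rename(total, date = dates)`.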