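The code below gives me time-series results for a stock and sorts every observation into a "buy" or "sell" category (based on whether the closing price is above or below the opening price).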
library(dplyr)
library(data.table)
library(quantmod)
library(zoo)
# enter tickers to download time-series data
e <- new.env()
getSymbols("SBUX", env = e)
pframe <- do.call(merge, as.list(e))
#head(pframe)
# get a subset of data
df = pframe$SBUX.Close
colnames(df)[1] <- "Close"
head(df)
# Assign groupings
addGrps <- transform(df,Group = ifelse(Close < lead(Close), "S", "B"))
# create subsets
buys <- addGrps[addGrps$Group == 'B',]
sells <- addGrps[addGrps$Group == 'S',]
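Now I am trying to group the results by daily profit (Diff) and loss, so that I can get the cumulative sum of each (profits and losses).
I think it should look like this, but something is wrong and I am not sure what.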
# find daily differences
df <- df %>%
  mutate(Diff = addGrps$Close - lead(addGrps$Close))
# get up and down price movements
ups <- filter(df, Diff > 0 )
downs <- filter(df, Diff <= 0 )
# cumulative sums of longs and shorts
longs<-cumsum(ups$Diff)
shorts<-cumsum(downs$Diff)
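Answer 0 (score: 1)
I'm not sure I completely follow your question, and some of the code seems unnecessary. For example, not all of those packages are needed (at least not yet), and it is unclear why you need two separate sub data frames for buys and sells. At the very least, the following cleans up some of what you have done so far and gets the data into an easy-to-use data frame. With some clarification, perhaps this is a start.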
library(quantmod)
library(tidyverse) # rather than just dplyr
# pull the SBUX data as a data frame and create the necessary new columns:
df <- data.frame(getSymbols(Symbols = 'SBUX', env = NULL)) %>% # pull the raw data
  rownames_to_column('date') %>% # convert the row index to a column
  select(date, close = SBUX.Close) %>% # select only the SBUX.Close column and rename it
  mutate(group = ifelse(close < lead(close), 's', 'b')) %>% # assign the sell or buy group
  mutate(diff = close - lead(close)) %>% # create the diff calculation (today's close minus the next close)
  mutate(movement = ifelse(diff > 0, 'up', 'down')) %>% # create the movement classification
  as_tibble() # convert to a tibble; tbl_df() is deprecated in current dplyr
# just to view the new data frame:
df %>% head(5)
# A tibble: 5 x 5
  date       close group    diff movement
  <chr>      <dbl> <chr>   <dbl> <chr>
1 2007-01-03  17.6 s     -0.0200 down
2 2007-01-04  17.6 b      0.0750 up
3 2007-01-05  17.6 b      0.0650 up
4 2007-01-08  17.5 b      0.0750 up
5 2007-01-09  17.4 b      0.0550 up
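One thing worth noting about the output above: `diff` is the current close minus the next close, so a row labelled `up` is a day that was followed by a lower close. If the intent is the next-day change instead, the sign flips. A minimal base R sketch of the two conventions, using made-up prices (the `close` values and the variable names below are hypothetical, not SBUX data):

```r
# Hypothetical closing prices, for illustration only
close <- c(17.60, 17.62, 17.55, 17.48)

# Next day's close; the last day has no successor, hence NA
next_close <- c(close[-1], NA)

# Convention used in the answer: today's close minus the next close
diff_today_minus_next <- close - next_close

# Alternative convention: the next close minus today's close
next_day_change <- next_close - close

# Show both side by side
data.frame(close, next_close, diff_today_minus_next, next_day_change)
```

Which convention is right depends on what "profit" should mean here, which the question leaves open.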
# calculate the sums of the diff by the movement up or down:
df %>%
  filter(!is.na(movement)) %>% # this removes the last date from the data - it cannot have a lead closing price
  group_by(movement) %>%
  summarize(cum_sum = sum(diff))
# A tibble: 2 x 2
  movement cum_sum
  <chr>      <dbl>
1 down       -489.
2 up          455.
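The summary above gives one total per group. If a running (cumulative) total is wanted instead, as the question's use of `cumsum()` suggests, one option is to swap `summarize()` for a grouped `mutate()`. This is a sketch under that assumption, on a small hypothetical data set (`prices` and its values are invented so the example runs without a download):

```r
library(dplyr)

# Hypothetical closing prices standing in for the SBUX data
prices <- tibble(
  date  = as.Date("2007-01-03") + 0:5,
  close = c(17.60, 17.62, 17.55, 17.48, 17.50, 17.45)
)

running <- prices %>%
  mutate(diff = close - lead(close),                   # same convention as the answer
         movement = ifelse(diff > 0, "up", "down")) %>%
  filter(!is.na(movement)) %>%                         # drop the last day (no lead close)
  group_by(movement) %>%
  mutate(cum_sum = cumsum(diff)) %>%                   # running total within each group
  ungroup()

running
```

Each row keeps its date, so the running totals for the `up` and `down` groups can be plotted or compared over time; the last value in each group matches the figure that `summarize(sum(diff))` would give on the same data.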