来自http://chart.yahoo.com的调整后的关闭数据存在问题

时间:2015-03-01 01:52:41

标签: r yahoo-finance

我有一个R投资组合构造代码,它使用来自雅虎的每日调整后的收盘价数据。我对NA值有一些问题,但代码已经工作了一段时间。直到本周末(例如,2015年2月28日)。

现在,当我使用tseries函数get.hist.quote()时,雅虎数据源似乎完全被破坏了。破碎我的意思是它不会正确返回VTV和其他一些ETF的数据。我不知道Yahoo时间序列源是否已关闭或是什么。

有一篇文章(https://stackoverflow.com/a/3507948/2341077)建议将get.hist.quote()中的URL从chart.yahoo.com更改为ichart.yahoo.com可以解决问题。但这对我没有任何改变。我还确保安装了最新版本的tseries。

有没有其他人在雅虎的收盘时间序列中遇到问题?我一直想知道是否应该更改我的代码以使用quantmod函数getSymbols,显然,它可以使用Google财务作为数据源。

编写下面的代码是为了读取数百个ETF符号并返回包含ETF时间序列数据的矩阵。尝试按日期对齐数据。

即使雅虎似乎在提供数据,仍然缺少值,这就是fillHoles()函数要解决的问题。

<pre>

#
# Fill "NA" holes in the time series.
#
fillHoles = function(ts.zoo) {
  v_approx = na.approx(ts.zoo, maxgap=4, na.rm=FALSE)
  v_fill = na.fill(v_approx, fill="extend")
  return( v_fill)
}
<i>
#
# The yahoo market data has problems (at least when it's fetched with get.hist.quote()) when the compression
# argument is used to fetch weekly adjusted close data.
#
# Two time series are shown below, for VXF and MINT. The weekly boundaries appear on different dates.
# 
#              VXF
# 2007-04-04 48.55
# 2007-04-09 48.98
# 2007-04-16 49.52 &lt;==
# 2007-04-23 49.70
# 2007-04-30 50.03
# 2007-05-07 50.04 &lt;==
# 
#            MINT
# 2007-04-04 8.03
# 2007-04-09 8.03
# 2007-04-17 7.88 &lt;==
# 2007-04-23 8.11
# 2007-04-30 8.92
# 2007-05-08 9.14 &lt;==
#   
# If the two time series are merged via a cbind NA values
# end up being inserted where the time series don't line up:'
# 
#              VXF MINT
# 2007-04-04 48.55 8.03
# 2007-04-09 48.98 8.03
# 2007-04-16 49.52   NA
# 2007-04-23 49.70 8.11
# 2007-04-30 50.03 8.92
# 2007-05-07 50.04   NA
#
# To avoid this problem of data alignment, the function fetches daily adjusted close that can then be converted
# into weekly adjusted close.
#
# Given a vector of symbols, this function will fetch the daily adjusted close price data from 
# Yahoo. The data is aligned since not all time series will have exactly the same start and end
# dates (although with daily data, as noted above, this should be less of an issue)
#
</i>
getDailyCloseData = function(symbols, startDate, endDate )
{
  closeData.z = c()
  firstTime = TRUE
  minDate = c()
  maxDate = c()
  fetchedSyms = c()
  startDate.ch = as.character( findMarketDate(as.Date(startDate)))
  endDate.ch = as.character( findMarketDate(as.Date(endDate)))
  for (i in 1:length(symbols)) {
    sym = symbols[i]
    print(sym)
    symClose.z = NULL
    timeOut = 1
    tsEndDate.ch = endDate.ch
    while ((timeOut < 7) && is.null(symClose.z)) {
      try(
        (symClose.z = get.hist.quote(instrument=sym, start=startDate.ch, end=tsEndDate.ch, quote="AdjClose",
                                     provider="yahoo", compression="d", retclass="zoo", quiet=T)),
        silent = TRUE)
      tsEndDate.ch = as.character( findMarketDate( (as.Date(tsEndDate.ch) - 1)))
      timeOut = timeOut + 1
    }
    if (! is.null(symClose.z)) {
      fetchedSyms = c(fetchedSyms, sym)
      dateIx = index(symClose.z)
      if (firstTime) {
        closeData.z = symClose.z
        firstTime = FALSE
        minDate = min(dateIx)
        maxDate = max(dateIx)
      } else {
        minDate = max(minDate, min(dateIx))
        maxDate = min(maxDate, max(dateIx))
        matIx = index(closeData.z)
        repeat {
          startIx = which(matIx == minDate)
          if (length(startIx) > 0 && startIx > 0) {
            break()
          } else {
            minDate = minDate + 1
          }
        } # repeat
        repeat {
           endIx = which(matIx == maxDate)
           if (length(endIx) > 0 && endIx > 0) {
             break()
           } else {
             maxDate = maxDate - 1
           }
        }
        matIxAdj = matIx[startIx:endIx]
        closeData.z = cbind(closeData.z[matIxAdj,], symClose.z[matIxAdj])
      }
    } # if (! is.null(symClose.z))
  } # for
  if (length(closeData.z) > 0) {
    dateIx = index(closeData.z)
     # fill any NA "holes" created by daily date alignment
     closeData.mat = apply(closeData.z, 2, FUN=fillHoles)
     rownames(closeData.mat) = as.character(dateIx)
     colnames(closeData.mat) = fetchedSyms
  }
  return( closeData.mat )
} # getDailyCloseData
</pre>

2 个答案:

答案 0 :(得分:0)

一些观察和问题。您正在使用get.history.quote返回动物园时间序列。您是否尝试过使用zoo包中的merge.zoo来合并来自不同资产的时间历史记录。这应该与日期保持一致,没有任何问题。其次,谷歌和雅虎以不同的方式纠正历史价格,因此两者的价格不同。雅虎提供开盘价,最高价,最低价和收盘价的历史价格,然后调整价格,调整价格以进行拆分,分红和分红。谷歌调整所有价格,但仅限于拆分,忽略股息和分配。您可以使用VXF查看2007年数据的这种差异。

我通过访问Yahoo没有问题 quantmod的getSymbols因此您可以使用它而不是切换到Google。最后,根据Pimco的说法,MINT的开始日期是11/16/2009,所以我不明白你们2007年的数据如何。

xts包是动物园的扩展,我发现它有一些有用的附加功能,例如to.weekly,它在下面使用。下面的代码是使用quantmod和xts包为您的ETF提供每日和每周价格的示例。请注意,MINT数据直到2009年11月17日才开始,与Pimco的开始日期一致。

library(quantmod)
library(xts)

 getDailyCloseData = function(symbols, startDate, endDate ) {
  close_daily  <- getSymbols(symbols[1], src="yahoo", from=startDate, to=endDate, auto.assign=FALSE)[,6] 
  for(sym in symbols[-1]) {
     close_daily <- merge(close_daily, getSymbols(sym, src="yahoo", from=startDate, to=endDate, auto.assign=FALSE)[,6])
   }
    colnames(close_daily) <- symbols
    return(close_daily)
  }

 symbols <- c("VXF","MINT")
 startDate <- "2007-03-15"
 endDate <- Sys.Date()
 close_daily <- getDailyCloseData(symbols, startDate, endDate)
 close_weekly <- to.weekly(close_daily[,1], OHLC=FALSE)
 for(sym in symbols[-1]) {
   close_weekly <- merge(close_weekly, to.weekly(close_daily[,sym], OHLC=FALSE))
  }

答案 1 :(得分:0)

我已切换到使用quantmod()getSymbols函数。 Yahoo数据的问题不一致,因此很难知道这是否是一个完整的解决方案。但是代码比我上面发布的更清晰。

事实是,如果你投资真钱而不仅仅是做量化金融作业,你应该购买专业级数据。

&#13;
&#13;
#
# Find the nearest market date (moving backward in time)
#
findMarketDate = function( date )
{
  while(! isBizday(x = as.timeDate(date), holidays=holidayNYSE(as.numeric(format(date, "%Y"))))) {
    date = date - 1
  }
  return(date)
}

#
# Fill "NA" holes in the time series.
#
fillHoles = function(ts.zoo) {
  v_approx = na.approx(ts.zoo, maxgap=4, na.rm=FALSE)
  v_fill = na.fill(v_approx, fill="extend")
  return( v_fill)
}


#
# Get daily equity market prices (e.g., stocks, ETFs). This code is designed to work
# with both Yahoo and Google. Yahoo is preferred because they have adjusted prices. An adjusted
# price is adjusted for splits and dividends. As a result, an ETF that doesn't move that much in price
# may still move in dividend adjusted price. Using these prices avoids omitting high divident assets.
#
getDailyPriceData = function(symbols, startDate, endDate, dataSource = "yahoo" )
{
  closeData.z = c()
  firstTime = TRUE
  fetchedSyms = c()
  startDate.d = findMarketDate(as.Date(startDate))
  endDate.d = findMarketDate(as.Date(endDate))
  for (i in 1:length(symbols)) {
    sym = symbols[i]
    print(sym)
    close.m = NULL
    timeOut = 1
    while ((timeOut < 7) && is.null(close.m)) {
      try(
        (close.m = getSymbols(Symbols=sym,src=dataSource, auto.assign=getOption('loadSymbols.auto.assign', FALSE),
                              warnings=FALSE)),
        silent = TRUE)
      timeOut = timeOut + 1
    } # while
    if (! is.null(close.m)) {
      dateIx = index(close.m)
      startIx = which(startDate.d == dateIx)
      endIx = which(endDate.d == dateIx)
      if ((length(startIx) > 0 && startIx > 0) && (length(endIx) > 0 && endIx > 0)) {
        fetchedSyms = c(fetchedSyms, sym)
        closeAdj.m = close.m[startIx:endIx,]

        price.z = NULL
        if (dataSource == "yahoo") {
           yahooAdjCol = paste(sym, "Adjusted", sep=".")
           price.z = closeAdj.m[, yahooAdjCol]
        } else {
           highCol = paste(sym, "High", sep=".")
           lowCol = highIx = paste(sym, "Low", sep=".")
           price.z = (closeAdj.m[,highCol] + closeAdj.m[,lowCol])/2
        }
        if (firstTime) {
          closeData.z = price.z
          firstTime = FALSE
        } else {
          closeData.z = cbind(closeData.z, price.z)
        }
      } # if (! is.null(symClose.z))
    } # if not null
  } # for
  closeData.m = c()
  if (length(closeData.z) > 0) {
    dateIx = index(closeData.z)
    closeData.m = coredata(closeData.z)
    numHoles = sum(is.na(closeData.m))
    if (numHoles > 0) {
      # fill any NA "holes" created by daily date alignment
      closeData.m = apply(closeData.m, 2, FUN=fillHoles)
    }
    rownames(closeData.m) = as.character(dateIx)
    colnames(closeData.m) = fetchedSyms
  }
  return( closeData.m )
} # getDailyPriceData
&#13;
&#13;
&#13;