Question

My data.frame looks like this:

ACCOUNT  POSTING_DT  WA  Amount
  10019   1/10/2006  19    99.1
  10019   6/18/2007  15   318.5
  10019    7/2/2007  12 23005.1
  10019   3/25/2008  15 16866.3
  10019   9/22/2008  -1 16902.3
  10121   4/18/2006   1 28029.9
  10121   5/28/2006   3   16528
  10121   3/20/2007   1 41730.1

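Each account has different posting dates, and the dates are not consecutive. For each row, I want to compute sum(WA*Amount)/sum(Amount) using the entries of the same account that fall within the 365 days before the current posting date.

E.g., for the 3/25/2008 entry of account 10019, I want to apply the calculation using the 6/18/2007 and 7/2/2007 entries, which gives (15*318.5+12*23005.1)/(318.5+23005.1).

Is there a function for this in R?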
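To make the target number concrete, the arithmetic for that one row can be checked by hand. This is only a sketch of the single example above; the vector names `wa` and `amount` are made up for illustration. `weighted.mean()` from base R is shown as an equivalent way of writing the same ratio.

```r
# The two entries of account 10019 that fall in the 365 days before 3/25/2008
wa     <- c(15, 12)           # WA on 6/18/2007 and 7/2/2007
amount <- c(318.5, 23005.1)   # Amount on the same two dates

# Weighted average written out as in the question
by_hand <- sum(wa * amount) / sum(amount)

# Same quantity via weighted.mean(x, w) = sum(x * w) / sum(w)
by_wm <- weighted.mean(wa, amount)

print(by_hand)   # roughly 12.04
```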

Answer 1

I don't know of a function that does what you want; frankly, I'd be a little surprised if there were one, since it's a bit "niche", but it's possible.

Your data:

txt <- 'ACCOUNT  POSTING_DT  WA  Amount
  10019   1/10/2006  19    99.1
  10019   6/18/2007  15   318.5
  10019    7/2/2007  12 23005.1
  10019   3/25/2008  15 16866.3
  10019   9/22/2008  -1 16902.3
  10121   4/18/2006   1 28029.9
  10121   5/28/2006   3   16528
  10121   3/20/2007   1 41730.1'
dat <- read.table(text=txt, header=TRUE, stringsAsFactors=FALSE)

The date column really should be a Date object rather than a character string, so...

dat$POSTING_DT <- as.Date(dat$POSTING_DT, format='%m/%d/%Y')

Here it is:

dat$NewAmount <- sapply(1:nrow(dat), function(r) {
    ## days between the current row's date and every other row's date
    d <- (dat$POSTING_DT[r] - dat$POSTING_DT)
    ## I use both (d>=0) and idx[r] <- FALSE so that if there are multiple
    ## instances on a day, the other ones will still be included
    idx <- (dat$ACCOUNT == dat$ACCOUNT[r]) & (d >= 0) & (d <= 365)
    idx[r] <- FALSE
    ## the crux of this function ("with" is not required but it reads well)
    with(dat, sum(WA[idx] * Amount[idx]) / sum(Amount[idx]))
})
dat
##   ACCOUNT POSTING_DT WA  Amount NewAmount
## 1   10019 2006-01-10 19    99.1       NaN
## 2   10019 2007-06-18 15   318.5       NaN
## 3   10019 2007-07-02 12 23005.1 15.000000
## 4   10019 2008-03-25 15 16866.3 12.040967
## 5   10019 2008-09-22 -1 16902.3 15.000000
## 6   10121 2006-04-18  1 28029.9       NaN
## 7   10121 2006-05-28  3 16528.0  1.000000
## 8   10121 2007-03-20  1 41730.1  1.741866

You didn't say what should happen with empty sets (rows with no earlier entries in the window, which come out as NaN). If you need those set to zero, you can do the following. Note that pmax() will also clamp any negative weighted average to zero.

dat$NewAmount <- pmax(dat$NewAmount, 0, na.rm=TRUE)
dat
##   ACCOUNT POSTING_DT WA  Amount NewAmount
## 1   10019 2006-01-10 19    99.1  0.000000
## 2   10019 2007-06-18 15   318.5  0.000000
## 3   10019 2007-07-02 12 23005.1 15.000000
## 4   10019 2008-03-25 15 16866.3 12.040967
## 5   10019 2008-09-22 -1 16902.3 15.000000
## 6   10121 2006-04-18  1 28029.9  0.000000
## 7   10121 2006-05-28  3 16528.0  1.000000
## 8   10121 2007-03-20  1 41730.1  1.741866

Edit: untested

With a lot of rows (157k, as you mentioned), and assuming each account has a reasonable number of rows, you may be better off grouping things first. This can be done with base functions (split?), but I'll demonstrate dplyr:

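library(dplyr)
dat %>%
    group_by(ACCOUNT) %>%
    mutate(NewAmount = sapply(seq_len(n()), function(r) {
        ## inside mutate(), POSTING_DT, WA and Amount refer to the current
        ## group only, so r indexes rows within the account and no
        ## ACCOUNT comparison is needed
        d <- as.numeric(POSTING_DT[r] - POSTING_DT)
        idx <- (d >= 0) & (d <= 365)
        idx[r] <- FALSE
        sum(WA[idx] * Amount[idx]) / sum(Amount[idx])
    })) %>%
    ungroup()

There may be a more elegant dplyr-esque way to do this, but operating row by row within a group doesn't exactly scream "simple efficiency" to me.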

Edit 2:

Lacking dplyr (or even devtools!?!), try this base version:
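The base-R code for this step is not reproduced here. What follows is a minimal sketch of how a grouped base-R version could look, not the answerer's original code. It assumes the same column names as the question, and the helper name `rolling_wa` is made up for illustration. It splits the data by ACCOUNT, applies the same 365-day window logic within each piece, and uses `unsplit()` to put the results back in the original row order.

```r
# Rebuild the example data so the sketch is self-contained
txt <- 'ACCOUNT POSTING_DT WA Amount
10019 1/10/2006 19 99.1
10019 6/18/2007 15 318.5
10019 7/2/2007 12 23005.1
10019 3/25/2008 15 16866.3
10019 9/22/2008 -1 16902.3
10121 4/18/2006 1 28029.9
10121 5/28/2006 3 16528
10121 3/20/2007 1 41730.1'
dat <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)
dat$POSTING_DT <- as.Date(dat$POSTING_DT, format = '%m/%d/%Y')

# Hypothetical helper: for one account's rows, compute the weighted
# average over the other rows dated 0 to 365 days before each row
rolling_wa <- function(g) {
    sapply(seq_len(nrow(g)), function(r) {
        d <- as.numeric(g$POSTING_DT[r] - g$POSTING_DT)
        idx <- (d >= 0) & (d <= 365)
        idx[r] <- FALSE   # exclude the current row itself
        sum(g$WA[idx] * g$Amount[idx]) / sum(g$Amount[idx])
    })
}

# split by account, compute per group, then restore the original order
pieces <- lapply(split(dat, dat$ACCOUNT), rolling_wa)
dat$NewAmount <- unsplit(pieces, dat$ACCOUNT)
dat
```

Because each comparison only touches the rows of one account, the work per row scales with the account's size rather than with the whole table. Rows with an empty window still come out as NaN, as in the ungrouped version.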

I haven't done any benchmarking or much testing, so caveat emptor.

Apply a calculation referencing data from the most recent 365 days before the current date
