Question

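R does not recognize my data frame as a panel. I have several decades of closing prices and total-return prices, but some months are missing in between, so a simple return calculation with lagged values fails for two reasons: I don't want a return computed from a lagged value that is more than one month old, and the returns have to be computed within each company rather than across all observations as a single time series. My workaround is this: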

df1 <- df %>%
  group_by(seriesid) %>%
  mutate(totret <- ifelse(month(date)-month(lag(date))>1,NA,totalreturn/lag(totalreturn)-1))

names(df1) <- c("date","company","totalreturn","close", "seriesid", "ticker","totret") 

df1 <- df1 %>%
group_by(seriesid) %>%
  mutate(closeret <- ifelse(month(date)-month(lag(date))>1,NA,close/lag(close)-1))

names(df1) <- c("date","company","totalreturn","close", "seriesid", "ticker","totret", "closeret")

It isn't fancy, but R won't let me use a more elegant solution because it doesn't recognize the new column. My data looks like this:

date company returnprice close seriesid 
1 1888-01-31 x 2.500 2.500 0005 
2 1888-02-04 x 2.750 2.750 0005
3 1888-04-20 x 3.350 3.350 0005 
4 1895-01-30 y 7.500 4.350 0001
5 1895-02-26 y 7.800 4.650 0001

This is what I can now get:

date company totalreturn close seriesid totret closeret 
1 1888-01-31 x 2.500 2.500 0005 NA NA
2 1888-02-04 x 2.750 2.750 0005 0.1 0.1
3 1888-04-20 x 3.350 3.350 0005 NA NA
4 1895-01-30 y 7.500 4.350 0001 NA NA
5 1895-02-26 y 7.800 4.650 0001 0.04 0.06897

Answer 1

library(dplyr)
library(lubridate)

# Use `=` inside mutate() so the new column is named totret directly;
# no names() call is needed afterwards.
df1 <- df %>%
  group_by(seriesid) %>%
  mutate(totret = ifelse(month(date) - month(lag(date)) > 1, NA, totalreturn / lag(totalreturn) - 1))

# Same pattern for the close-price return.
df1 <- df1 %>%
  group_by(seriesid) %>%
  mutate(closeret = ifelse(month(date) - month(lag(date)) > 1, NA, close / lag(close) - 1))
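A minimal sketch of why the names() calls in the question's workaround were needed, assuming dplyr is installed. The toy data frame and the names `bad` and `good` are made up for illustration. Inside mutate(), `<-` leaves the argument unnamed, so the new column is labelled with the text of the whole expression, while `=` uses the argument name as the column name.

```r
library(dplyr)

# Toy data; the values are placeholders for this sketch.
df <- data.frame(seriesid = c("a", "a", "a"),
                 totalreturn = c(2.5, 2.75, 3.35))

# `<-` inside mutate(): the argument is unnamed, so the new column is
# labelled with the expression text rather than "totret".
bad <- df %>% mutate(totret <- totalreturn / lag(totalreturn) - 1)

# `=` inside mutate(): the argument name becomes the column name.
good <- df %>% mutate(totret = totalreturn / lag(totalreturn) - 1)

print(names(bad))
print(names(good))
```

With the column named properly, later steps can refer to `totret` without renaming anything first.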

Answer 2

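Following your example, I added a few more dates just to see what happens when NA applies to more than 3 rows, and your code works. However, you will run into a problem at the turn of the year, because "December" > "January": month(date) - month(lag(date)) is negative there, so the gap is never flagged. In the output below, row 8 (1889-03-28) gets a return even though the previous observation is three months earlier.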

data2 <- data %>% mutate(totret = ifelse(month(date)-month(lag(date))>1,NA,totalreturn/lag(totalreturn)-1),
                               closeret = ifelse(month(date)-month(lag(date))>1,NA,close/lag(close)-1))



        date totalreturn close    totret   closeret
1 1888-01-28         2.5   2.5        NA         NA
2 1888-02-28         2.7   2.7 0.0800000 0.08000000
3 1888-03-28         3.0   3.3 0.1111111 0.22222222
4 1888-05-28         3.5   3.5        NA         NA
5 1888-08-28         2.8   4.0        NA         NA
6 1888-10-28         3.0   4.3        NA         NA
7 1888-12-28         3.2   4.5        NA         NA
8 1889-03-28         3.6   4.6 0.1250000 0.02222222
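The year-boundary problem can be seen in isolation with the dates from the table above. The sketch below uses base R only, and `month_index` is a hypothetical helper introduced here, not something from the original answer: it counts months on a single running scale (year times 12 plus month), so the difference stays positive from December into the next year.

```r
# Dates from the example above (monthly observations with gaps).
dates <- as.Date(c("1888-01-28", "1888-02-28", "1888-03-28", "1888-05-28",
                   "1888-08-28", "1888-10-28", "1888-12-28", "1889-03-28"))

# Hypothetical helper: a running month count, so the value keeps
# increasing across a year boundary.
month_index <- function(d) {
  lt <- as.POSIXlt(d)
  (lt$year + 1900) * 12 + lt$mon
}

# Month-of-year difference: goes negative from December to March.
naive_gap <- c(NA, diff(as.POSIXlt(dates)$mon + 1))

# Year-aware difference: calendar months between consecutive observations.
month_gap <- c(NA, diff(month_index(dates)))

print(data.frame(dates, naive_gap, month_gap))
```

For the last row, `naive_gap` is -9 while `month_gap` is 3, so a test of the form `month_gap > 1` would flag it.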

I suggest using difftime() instead and assigning NA when the difference is greater than 31 days.

data3 <- data %>% mutate(totret = ifelse(difftime(date, lag(date), units = 'days')>31, NA, totalreturn/lag(totalreturn)-1),
                               closeret = ifelse(difftime(date, lag(date), units = 'days')>31, NA, close/lag(close)-1))

        date totalreturn close    totret  closeret
1 1888-01-28         2.5   2.5        NA        NA
2 1888-02-28         2.7   2.7 0.0800000 0.0800000
3 1888-03-28         3.0   3.3 0.1111111 0.2222222
4 1888-05-28         3.5   3.5        NA        NA
5 1888-08-28         2.8   4.0        NA        NA
6 1888-10-28         3.0   4.3        NA        NA
7 1888-12-28         3.2   4.5        NA        NA
8 1889-03-28         3.6   4.6        NA        NA
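The snippets in this answer work on a single series. As a sketch of how the same difftime() test might be combined with the question's group_by(seriesid), here it is applied to the five sample rows from the question. It assumes dplyr is installed and that rows are already sorted by date within each series; `gap_days` is a helper column introduced here for readability.

```r
library(dplyr)

# Sample rows from the question.
df <- data.frame(
  date = as.Date(c("1888-01-31", "1888-02-04", "1888-04-20",
                   "1895-01-30", "1895-02-26")),
  company = c("x", "x", "x", "y", "y"),
  totalreturn = c(2.5, 2.75, 3.35, 7.5, 7.8),
  close = c(2.5, 2.75, 3.35, 4.35, 4.65),
  seriesid = c("0005", "0005", "0005", "0001", "0001"),
  stringsAsFactors = FALSE
)

df1 <- df %>%
  group_by(seriesid) %>%   # lag() now stays within each series
  mutate(
    # Days since the previous observation of the same series (NA for the first).
    gap_days = as.numeric(difftime(date, lag(date), units = "days")),
    totret   = ifelse(gap_days > 31, NA, totalreturn / lag(totalreturn) - 1),
    closeret = ifelse(gap_days > 31, NA, close / lag(close) - 1)
  ) %>%
  ungroup()

print(df1)
```

The first row of each series has no lagged value, so its gap and both returns are NA; only rows 2 and 5 receive a return, which matches the table in the question.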

I also tried difftime(dates[2], dates[1], units = 'secs') > duration(1, units = 'month'), but because a "month is 30.41667 days", it does not work for the gaps of 31 days.
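One way to avoid a fixed month length altogether, offered as a sketch rather than as part of the original answer, is lubridate's `%m+%` operator, which adds whole calendar months to a date and rolls back to the last valid day of the month when needed. The vector names `prev`, `curr`, and `too_far` are made up for this example.

```r
library(lubridate)

dates <- as.Date(c("1888-01-28", "1888-02-28", "1888-03-28",
                   "1888-05-28", "1888-12-28", "1889-03-28"))

prev <- head(dates, -1)   # previous observation of each pair
curr <- tail(dates, -1)   # current observation of each pair

# TRUE when the current date lies more than one calendar month after the
# previous one; the first observation has no predecessor, hence NA.
too_far <- c(NA, curr > prev %m+% months(1))

print(data.frame(dates, too_far))
```

Rows where `too_far` is TRUE are the ones whose return would be set to NA.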
