Creating a date vector from an existing date vector

Time: 2016-07-15 05:51:58

Tags: r

          Date    Price      
       2006-01-03 12.02  
       2006-01-04 11.84  
       2006-01-05 11.83 
       ...  

      EXPIRATION DATES
       2006-01-18
       2006-02-15
       2006-03-22
       ...

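Hello, I have a data frame of daily futures prices with their corresponding dates. I also have a vector of all the relevant contract expiration dates for those futures prices.

The Price column is the price of the contract that expires in the nearest month (12-month expiration cycle). For example, the contract priced at 12.02 on 2006-01-03 expires on 2006-01-18. I want to create a column that lists the relevant expiration date for each futures price, so that I can compute the number of days to expiration for each daily price. The logic is:

Every date between 2006-01-03 and 2006-01-18 gets 2006-01-18 in the new expiration date column, and likewise for all 127 of my expiration dates.

I have tried playing with mutate() and subset(), but I have had no luck. I think this will be tedious, but I just need someone to help me get started.

Thanks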

2 answers:

Answer 0: (score: 1)

Assuming the two data.frames are called df and df2 and the dates are already formatted as Date, using dplyr,

# add a row with a different expiration date to make sure it's working
df[4,] <- list(as.Date('2006-02-04'), 12)

library(dplyr)

df %>% rowwise() %>% 
    mutate(days_left = min(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > Date] - Date))
## Source: local data frame [4 x 3]
## Groups: <by row>
##   
## # A tibble: 4 x 3
##         Date Price      days_left
##       <date> <dbl> <S3: difftime>
## 1 2006-01-03 12.02        15 days
## 2 2006-01-04 11.84        14 days
## 3 2006-01-05 11.83        13 days
## 4 2006-02-04 12.00        11 days

Or in base R,

# sapply() returns a plain numeric vector (days); lapply() would create a list column
df$days_left <- sapply(df$Date, function(x){
    min(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > x] - x)
})

df
##         Date Price days_left
## 1 2006-01-03 12.02        15
## 2 2006-01-04 11.84        14
## 3 2006-01-05 11.83        13
## 4 2006-02-04 12.00        11
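
The question also asks for the expiration date itself as a column, not only the day count. A minimal base R sketch of that, assuming df and df2 look like the sample data in the question (the df and df2 built here, and the column name expiration, are made up for illustration):

```r
# Illustrative data modeled on the question's sample; not the asker's real data
df <- data.frame(
  Date  = as.Date(c("2006-01-03", "2006-01-04", "2006-01-05", "2006-02-04")),
  Price = c(12.02, 11.84, 11.83, 12.00)
)
df2 <- data.frame(
  EXPIRATION.DATES = as.Date(c("2006-01-18", "2006-02-15", "2006-03-22"))
)

# For each date, take the earliest expiration date strictly after it.
# do.call(c, ...) glues the list of single Dates back into one Date vector.
df$expiration <- do.call(c, lapply(df$Date, function(x) {
  min(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > x])
}))

# Days to expiration as a plain number
df$days_left <- as.numeric(difftime(df$expiration, df$Date, units = "days"))

df
```

Like the code above, this uses a strict `>`, so a price dated on an expiration day is matched to the next expiration. Switch to `>=` if the expiration day should map to itself.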

Subtracting dates calls difftime under the hood, so it may be worth calling it explicitly, which lets you specify the units:

# dplyr
df %>% rowwise() %>% 
    mutate(days_left = df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > Date] %>% 
               difftime(Date, units = 'days') %>% 
               min())
# base (sapply() returns the day counts as a plain numeric vector)
df$days_left <- sapply(df$Date, function(x){
    min(difftime(df2$EXPIRATION.DATES[df2$EXPIRATION.DATES > x], x, units = 'days'))
})

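Depending on your data it may make no difference, but it is more robust than plain subtraction: with date-times, subtraction picks the units automatically, whereas an explicit difftime call fixes them.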
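
To illustrate the point about units, here is a small sketch with made-up values (d1, d2, t1 and t2 are not from the question). Subtracting two Date objects reports days, while subtracting two POSIXct date-times picks a unit based on the size of the gap:

```r
# Date - Date is reported in days
d1 <- as.Date("2006-01-03")
d2 <- as.Date("2006-01-18")
date_units <- units(d2 - d1)

# POSIXct - POSIXct chooses the unit automatically (hours for a 6-hour gap)
t1 <- as.POSIXct("2006-01-03 09:00:00", tz = "UTC")
t2 <- as.POSIXct("2006-01-03 15:00:00", tz = "UTC")
auto_units <- units(t2 - t1)

# An explicit difftime() call pins the unit regardless of the gap
gap_in_days <- as.numeric(difftime(t2, t1, units = "days"))

print(c(date_units, auto_units))
print(gap_in_days)
```

So if the date columns ever turn out to be date-times rather than Dates, the explicit call keeps days_left in days.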

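Answer 1: (score: 0)

Disclaimer: I don't like pipes (I have my reasons), and when I can find a good "base R" solution, I go for that first. So here is my old-fart solution.

I added more data to make sure it really works as expected.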

# Create main dataframe
df1 <- read.table(text=
"Date Price
2006-01-03 12.02
2006-01-18 12.04
2006-01-22 12.05
2006-02-01 11.99
2006-02-16 11.84
2006-03-21 11.83
2006-03-22 11.90
2006-03-29 12.00
", header = TRUE, stringsAsFactors = FALSE)

# Convert Date column to a proper Date-classed column
df1$Date <- as.Date(df1$Date)

# Generate an expiration dates vector
exp_dates <- as.Date(c("2006-01-18", "2006-02-15", "2006-03-22", "2006-04-18"))

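# Initialize df1$exp_date as an all-NA Date column
df1$exp_date <- as.Date(NA)

# Loop over rows and pick the first expiration date on or after each date.
# which.max() on a logical vector returns the index of the first TRUE, so
# this relies on exp_dates being sorted in ascending order.
for (i in seq_len(nrow(df1)))
  df1$exp_date[i] <- exp_dates[which.max((df1$Date[i] - exp_dates) <= 0)]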

(Yes, I loop too, and I even like it! :^p)

df1

        Date Price   exp_date
1 2006-01-03 12.02 2006-01-18
2 2006-01-18 12.04 2006-01-18
3 2006-01-22 12.05 2006-02-15
4 2006-02-01 11.99 2006-02-15
5 2006-02-16 11.84 2006-03-22
6 2006-03-21 11.83 2006-03-22
7 2006-03-22 11.90 2006-03-22
8 2006-03-29 12.00 2006-04-18
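
For a long price history, the same lookup can probably be done without a loop. The sketch below is not part of either answer. It swaps in base R's findInterval(), and it assumes exp_dates is sorted in ascending order. The names dates, idx and days_left are illustrative.

```r
# Same sample data as in the answer above
dates <- as.Date(c("2006-01-03", "2006-01-18", "2006-01-22", "2006-02-01",
                   "2006-02-16", "2006-03-21", "2006-03-22", "2006-03-29"))
exp_dates <- as.Date(c("2006-01-18", "2006-02-15", "2006-03-22", "2006-04-18"))

# With left.open = TRUE, findInterval() counts the expiration dates strictly
# before each date; adding 1 gives the index of the first one on or after it.
idx <- findInterval(as.numeric(dates), as.numeric(exp_dates), left.open = TRUE) + 1

# Dates past the last expiration index beyond the vector and come back as NA
exp_date <- exp_dates[idx]

# Days to expiration, which is what the question ultimately wants
days_left <- as.numeric(difftime(exp_date, dates, units = "days"))

data.frame(Date = dates, exp_date = exp_date, days_left = days_left)
```

For this sample the exp_date values match the loop's output. One difference: a date later than the last expiration yields NA here, whereas which.max() in the loop would fall back to the first expiration date.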