R: Matching by row and column

Time: 2017-03-09 21:27:29

Tags: r function dataframe matching

Solved

Suppose we are given:

#Defining sample variables    

    set.seed(1) ##Note I didn't set seed for the values below so your numbers will be different

    date <- as.Date(c('2015-1-1', '2015-1-1', '2015-1-3', '2015-1-3', '2015-1-5', '2015-1-5'))
    variable1 <- runif(6, max=1, min=0)
    date2 <- as.Date(c('2015-1-1', '2015-1-3', '2015-1-5'))
    variable2 <- runif(3, max=2, min=1)
    variable3 <- runif(3, max=5, min=4)
    df1 <- data.frame(date, variable1)
    df2 <- data.frame(date2, variable2, variable3)

#Sample dataframes

    #df1
            Date variable1
    1 2015-01-01 0.2655087
    2 2015-01-01 0.3721239
    3 2015-01-03 0.5728534
    4 2015-01-03 0.9082078
    5 2015-01-05 0.2016819
    6 2015-01-05 0.8983897

    #df2
            Date variable2 variable3
    1 2015-01-01  1.646115  4.706171
    2 2015-01-03  1.457847  4.549162
    3 2015-01-05  1.015068  4.735463

I want to define a function that returns a value from either variable2 or variable3, depending on the value of variable1.

What I have so far:

    SomeVariable <- function(x){
            if (x < 0.5) 
                    df2$variable2
            else
                    df2$variable3
    }

    SomeVariable(df1$variable1[1])
    [1] 1.646115 1.457847 1.015068

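But this returns the whole column. Somehow I need the function to also match the values of variable1, variable2 and variable3 by date.

For example, SomeVariable for the first entry should return only 1.646, and SomeVariable for the last entry should return only 4.735.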

    #Final output should be:
         Date  SomeVariable 
    1 2015-01-01 1.646115
    2 2015-01-01 1.646115
    3 2015-01-03 4.549162
    4 2015-01-03 4.549162
    5 2015-01-05 1.015068
    6 2015-01-05 4.735463

1 Answer:

Answer 0: (score: 0)

Maybe I have completely misunderstood what you want, but I don't think you need a complicated function.

Use set.seed() to get reproducible data:

    set.seed(123)
    date <- as.Date(c('2015-1-1', '2015-1-1', '2015-1-3', '2015-1-3', '2015-1-5', '2015-1-5'))
    variable1 <- runif(6, max=1, min=0)
    date2 <- as.Date(c('2015-1-1', '2015-1-3', '2015-1-5'))
    variable2 <- runif(3, max=2, min=1)
    variable3 <- runif(3, max=5, min=4)
    df1 <- data.frame(date, variable1)
    # df2 has one row per unique date, so it is built from date2 (stored in a column named date)
    df2 <- data.frame(date = date2, variable2, variable3)

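Use the match() function to add the column variable1 to df2 by date:

    # For each date in df2, match() gives the position of its first occurrence in df1
    Indices <- match(df2$date, df1$date)
    df2$variable1 <- df1$variable1[Indices]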
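To make the behaviour of match() concrete, here is a small sketch on toy vectors. The names `lookup_keys` and `table_keys` are hypothetical and appear neither in the question nor in the answer. It shows that match() returns only the first position of each value and NA where there is no match, so the direction of the lookup matters when one side has repeated dates.

```r
# Toy vectors, for illustration only (not part of the original data)
lookup_keys <- c("a", "b", "c", "z")
table_keys  <- c("b", "b", "c", "a")

# For each element of lookup_keys: position of its FIRST occurrence in table_keys
idx <- match(lookup_keys, table_keys)
idx
# "a" is at position 4, "b" first appears at 1, "c" at 3, "z" is absent (NA)
```

Because "b" occurs twice in `table_keys` but only position 1 is returned, looking up the three unique dates in the six-row df1 picks one df1 row per date.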

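Add the column SomeVar based on the value of variable1:

    # >= in the second line mirrors the else branch of the function in the question
    df2$SomeVar[df2$variable1 < 0.5] <- df2$variable2[df2$variable1 < 0.5]
    df2$SomeVar[df2$variable1 >= 0.5] <- df2$variable3[df2$variable1 >= 0.5]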
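If the goal is the six-row output shown in the question, one option is to run the lookup in the other direction: for each row of df1, find the row of df2 with the same date, then choose between variable2 and variable3 with ifelse(). The following is a minimal sketch of that idea rather than the method used above. It hard-codes the sample values printed in the question, and it assumes that every date in df1 occurs exactly once in df2.

```r
# Sample values copied from the data frames printed in the question
df1 <- data.frame(
  date = as.Date(c("2015-01-01", "2015-01-01", "2015-01-03",
                   "2015-01-03", "2015-01-05", "2015-01-05")),
  variable1 = c(0.2655087, 0.3721239, 0.5728534,
                0.9082078, 0.2016819, 0.8983897)
)
df2 <- data.frame(
  date2 = as.Date(c("2015-01-01", "2015-01-03", "2015-01-05")),
  variable2 = c(1.646115, 1.457847, 1.015068),
  variable3 = c(4.706171, 4.549162, 4.735463)
)

# For each row of df1, the row of df2 that carries the same date
idx <- match(df1$date, df2$date2)

# Vectorised form of the if/else from the question:
# variable2 when variable1 < 0.5, otherwise variable3
result <- data.frame(
  date = df1$date,
  SomeVariable = ifelse(df1$variable1 < 0.5,
                        df2$variable2[idx],
                        df2$variable3[idx])
)
result
```

With these sample values, `result$SomeVariable` reproduces the "Final output" column from the question. If some date in df1 were missing from df2, `idx` would contain NA and the corresponding SomeVariable would be NA as well.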