Operating on multiple zoo objects using a data frame

Time: 2015-01-19 12:42:20

Tags: r

I have two time series (zoo) objects and a data frame.
Z1

z1 <- structure(c(400L, 125L, 125L, 125L, 120L,400L, 125L, 125L, 125L, 120L,400L, 125L, 125L, 125L, 120L
,400L, 125L, 125L, 125L, 120L), .Dim = c(5L, 4L), .Dimnames = list(NULL, c("T1", "T2", "T3", "T6"
)), index = structure(c(15723, 15725, 15726, 15727, 15728), class = "Date"), 
class = "zoo")



            T1  T2  T3  T6
2013-01-18 400 400 400 400
2013-01-20 125 125 125 125
2013-01-21 125 125 125 125
2013-01-22 125 125 125 125
2013-01-23 120 120 120 120

Z2

z2 <- structure(c(40L, 12L, 25L, 15L, 10L,40L, 25L, 15L, 123L, 190L,150L, 115L, 155L, 105L, 80L
,40L, 425L, 225L, 115L, 20L), .Dim = c(5L, 4L), .Dimnames = list(NULL, c("T1", "T2", "T3", "T6"
)), index = structure(c(15723, 15725, 15726, 15727, 15728), class = "Date"), 
class = "zoo")


            T1  T2  T3  T6
2013-01-18 40  40 150  40
2013-01-20 12  25 115 425
2013-01-21 25  15 155 225
2013-01-22 15 123 105 115
2013-01-23 10 190  80  20

DF

l <- "Name, DOB, TypeOfApply, House
        T1, 2008-12-16, sync,44
        T2, 2008-12-15, sync,54
        T3, 2008-12-19, async,34
        T4, 2008-12-18, async,84
        T5, 2008-12-11, sync,94"

df <- read.csv(text = l)

I want to apply a formula (a function I wrote, called "calc") to the columns whose TypeOfApply == "sync". z1 and z2 will always have the same rows and columns.

calc(z1,z2,df$DOB-2013-01-18,df$House)
                    T1                                  T2                                           T3  T6
2013-01-18 calc(400,40,((2008-12-16)-(2013-01-18)),44) calc(400,40,((2008-12-15)-(2013-01-18)),54)  400 400
2013-01-20 calc(125,12,((2008-12-16)-(2013-01-20)),44) calc(125,25,((2008-12-15)-(2013-01-20)),54)  125 125
2013-01-21 calc(125,25,((2008-12-16)-(2013-01-21)),44) calc(125,15,((2008-12-15)-(2013-01-21)),54)  125 125
2013-01-22 calc(125,15,((2008-12-16)-(2013-01-22)),44) calc(125,123,((2008-12-15)-(2013-01-22)),54) 125 125
2013-01-23 calc(120,10,((2008-12-16)-(2013-01-23)),44) calc(120,190,((2008-12-15)-(2013-01-23)),54) 120 120

So in this example the formula is applied to T1 and T2, but not to the others:
T3 - its TypeOfApply is async
T5 - not present in z1 and z2
T6 - not present in df

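Update
The order of the names in df may differ, so it could be something like T2, T1, T3, T5, T4.

A sample calc function would look something like this: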

 calc <- function(x,y,z,v)
 {
   val <- x+y+(z/365)+v
   return(val)
 }

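1 answer:

Answer 0: (score: 1)

Here I use str_trim because the columns of "df" contain leading/trailing spaces. The factor column "DOB" is converted to the "Date" class, and "indx" is created from the names whose "TypeOfApply" is "sync" and which also appear among the column names of "z1". This "indx" is used to subset "df" as well as "z1" and "z2". "Map" then walks over the corresponding columns of "z1" and "z2" together with the elements of "df1$DOB" and "df1$House", and passes them as inputs to the "calc" function.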

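library(stringr)
# Names flagged "sync" in df that also exist as columns of z1
indx <- intersect(with(df,str_trim(Name[str_trim(TypeOfApply)=='sync'])),
              colnames(z1))

# Keep only DOB and House for those names, and parse DOB as Date
df1 <- df[str_trim(as.character(df$Name)) %in% indx,c(2,4)]
df1$DOB <- as.Date(str_trim(df1$DOB))
# Convert the reference date with as.Date() so the subtraction is Date minus Date
Map(function(u,v,x,y) calc(u,v, as.numeric(x-as.Date('2013-01-18')), y),
 as.data.frame(z1[,indx]), as.data.frame(z2[,indx]), df1$DOB, df1$House)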
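As a possible alternative to trimming with str_trim afterwards, the whitespace can be dropped while the text is read. The sketch below is not part of the original answer. It assumes the same input string as in the question and uses the `strip.white` argument of `read.csv`. The name `sync_names` is made up for illustration.

```r
# Same text as in the question, with padded fields
l <- "Name, DOB, TypeOfApply, House
        T1, 2008-12-16, sync,44
        T2, 2008-12-15, sync,54
        T3, 2008-12-19, async,34
        T4, 2008-12-18, async,84
        T5, 2008-12-11, sync,94"

# strip.white = TRUE removes leading/trailing blanks from unquoted fields;
# stringsAsFactors = FALSE keeps the text columns as character in any R version
df <- read.csv(text = l, strip.white = TRUE, stringsAsFactors = FALSE)
df$DOB <- as.Date(df$DOB)

# Hypothetical helper variable: names whose TypeOfApply is "sync"
sync_names <- df$Name[df$TypeOfApply == "sync"]
sync_names
```

With the fields already clean, the comparisons against 'sync' and the column names of "z1" no longer need str_trim.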

Update

Using the calc function from the OP's post:

z3 <- z1[,indx]
index <- as.Date('2013-01-18')
z3[] <- mapply(calc, as.data.frame(z1[,indx]), 
    as.data.frame(z2[,indx]), df1$DOB-index, df1$House)
z3
#               T1       T2
#2013-01-18 479.9068 489.9041
#2013-01-20 176.9068 199.9041
#2013-01-21 189.9068 189.9041
#2013-01-22 179.9068 297.9041
#2013-01-23 169.9068 359.9041
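The first cell of this output can be checked by hand. The snippet below is only a sanity check under the same assumption the answer makes, namely that the third argument is DOB minus the fixed date 2013-01-18, expressed in days. The names `days` and `first_cell` are made up for illustration.

```r
# calc as defined in the question
calc <- function(x, y, z, v) {
  x + y + (z / 365) + v
}

# T1: DOB 2008-12-16, reference date 2013-01-18, House 44
days <- as.numeric(as.Date("2008-12-16") - as.Date("2013-01-18"))
days                      # -1494

# z1 = 400 and z2 = 40 on 2013-01-18
first_cell <- calc(400, 40, days, 44)
round(first_cell, 4)      # 479.9068
```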

Suppose I change the order of the rows of "df":

set.seed(24)
df <- df[sample(1:nrow(df)),]

Then the list elements returned by "Map" come out in the same order as "indx", for example:

indx
#[1] "T2" "T1"
df1
#         DOB House
#2 2008-12-15    54
#1 2008-12-16    44

 Map(function(u,v,x,y)  u, as.data.frame(z1[,indx]), 
   as.data.frame(z2[,indx]), df1$DOB, df1$House)
 #$T2
 #[1] 400 125 125 125 120

 #$T1
 #[1] 400 125 125 125 120
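The table in the question subtracts each row's own date from DOB, whereas the code above uses the fixed date 2013-01-18. If the per-row reading is what was intended, a sketch along the following lines might work. It is an assumption about the intent, not the answer's method. It looks rows up by name with `match`, so the row order of "df" does not matter, and it assumes "df" has already been cleaned (no padded strings, DOB of class Date). The function name `apply_sync` is made up for illustration.

```r
library(zoo)

calc <- function(x, y, z, v) x + y + (z / 365) + v

# Rebuild the example data; df is deliberately in shuffled order
dates <- as.Date(c("2013-01-18", "2013-01-20", "2013-01-21",
                   "2013-01-22", "2013-01-23"))
cols <- c("T1", "T2", "T3", "T6")
z1 <- zoo(matrix(rep(c(400, 125, 125, 125, 120), 4), ncol = 4,
                 dimnames = list(NULL, cols)), dates)
z2 <- zoo(matrix(c(40, 12, 25, 15, 10,   40, 25, 15, 123, 190,
                   150, 115, 155, 105, 80,   40, 425, 225, 115, 20),
                 ncol = 4, dimnames = list(NULL, cols)), dates)
df <- data.frame(
  Name        = c("T2", "T1", "T3", "T5", "T4"),
  DOB         = as.Date(c("2008-12-15", "2008-12-16", "2008-12-19",
                          "2008-12-11", "2008-12-18")),
  TypeOfApply = c("sync", "sync", "async", "sync", "async"),
  House       = c(54, 44, 34, 94, 84),
  stringsAsFactors = FALSE
)

# Hypothetical helper: apply fun to the "sync" columns present in z1,
# leaving every other column of z1 untouched
apply_sync <- function(z1, z2, df, fun = calc) {
  target <- intersect(df$Name[df$TypeOfApply == "sync"], colnames(z1))
  m1 <- coredata(z1)
  m2 <- coredata(z2)
  storage.mode(m1) <- "double"        # results are not integers
  for (col in target) {
    i <- match(col, df$Name)          # row of df for this column, any order
    # DOB minus each row's date, in days (one value per row of z1)
    age_days <- as.numeric(df$DOB[i] - index(z1))
    m1[, col] <- fun(m1[, col], m2[, col], age_days, df$House[i])
  }
  zoo(m1, index(z1))
}

res <- apply_sync(z1, z2, df)
res
```

On 2013-01-18 the result agrees with the fixed-date version, because that row's date equals the reference date. Later rows differ slightly, since the day count grows with each row.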