I have two time series (zoo) objects and a data frame.
z1
z1 <- structure(c(400L, 125L, 125L, 125L, 120L,400L, 125L, 125L, 125L, 120L,400L, 125L, 125L, 125L, 120L
,400L, 125L, 125L, 125L, 120L), .Dim = c(5L, 4L), .Dimnames = list(NULL, c("T1", "T2", "T3", "T6"
)), index = structure(c(15723, 15725, 15726, 15727, 15728), class = "Date"),
class = "zoo")
T1 T2 T3 T6
2013-01-18 400 400 400 400
2013-01-20 125 125 125 125
2013-01-21 125 125 125 125
2013-01-22 125 125 125 125
2013-01-23 120 120 120 120
z2
z2 <- structure(c(40L, 12L, 25L, 15L, 10L,40L, 25L, 15L, 123L, 190L,150L, 115L, 155L, 105L, 80L
,40L, 425L, 225L, 115L, 20L), .Dim = c(5L, 4L), .Dimnames = list(NULL, c("T1", "T2", "T3", "T6"
)), index = structure(c(15723, 15725, 15726, 15727, 15728), class = "Date"),
class = "zoo")
T1 T2 T3 T6
2013-01-18 40 40 150 40
2013-01-20 12 25 115 425
2013-01-21 25 15 155 225
2013-01-22 15 123 105 115
2013-01-23 10 190 80 20
df
l <- "Name, DOB, TypeOfApply, House
T1, 2008-12-16, sync,44
T2, 2008-12-15, sync,54
T3, 2008-12-19, async,34
T4, 2008-12-18, async,84
T5, 2008-12-11, sync,94"
df <- read.csv(text = l)
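I want to apply a formula (a function I wrote, called "calc") only to the columns where TypeOfApply == "sync". z1 and z2 will always have the same rows and columns. Roughly, the call and the expected result look like this: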
calc(z1,z2,df$DOB-2013-01-18,df$House)
T1 T2 T3 T6
2013-01-18 calc(400,40,((2008-12-16)-(2013-01-18)),44) calc(400,40,((2008-12-15)-(2013-01-18)),54) 400 400
2013-01-20 calc(125,12,((2008-12-16)-(2013-01-20)),44) calc(125,25,((2008-12-15)-(2013-01-20)),54) 125 125
2013-01-21 calc(125,25,((2008-12-16)-(2013-01-21)),44) calc(125,15,((2008-12-15)-(2013-01-21)),54) 125 125
2013-01-22 calc(125,15,((2008-12-16)-(2013-01-22)),44) calc(125,123,((2008-12-15)-(2013-01-22)),54) 125 125
2013-01-23 calc(120,10,((2008-12-16)-(2013-01-23)),44) calc(120,190,((2008-12-15)-(2013-01-23)),54) 120 120
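So in this example the formula is applied to T1 and T2, but not to the others:
T3 - TypeOfApply is "async"
T5 - not present in z1 and z2
T6 - not present in df
Update
The order of the names in df may differ, so it could be something like T2, T1, T3, T5, T4.
A sample calc function looks like this: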
calc <- function(x,y,z,v)
{
val <- x+y+(z/365)+v
return(val)
}
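As a quick check of how the four arguments combine, the first cell of the expected table (T1 on 2013-01-18) can be worked out by hand. This is only a sketch using the sample "calc" above; the variable name "days" is introduced here for illustration.

```r
# Sample formula from the question
calc <- function(x, y, z, v) {
  val <- x + y + (z / 365) + v
  return(val)
}

# DOB of T1 minus the first index date, as a plain number of days (negative here)
days <- as.numeric(as.Date("2008-12-16") - as.Date("2013-01-18"))
days
# [1] -1494

# z1 value 400, z2 value 40, House 44
round(calc(400, 40, days, 44), 4)
# [1] 479.9068
```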
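Answer 0 (score: 1)
Here I use str_trim because the columns of "df" contain leading/trailing spaces. The factor column "DOB" is converted to class "Date". "indx" is built from the "Name" elements whose "TypeOfApply" is "sync" and which also appear among the column names of "z1". This "indx" is then used to subset "df" as well as "z1" and "z2". Finally, "Map" walks over the corresponding columns of "z1" and "z2" together with the elements of "df1$DOB" and "df1$House", which serve as the inputs to the "calc" function.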
library(stringr)
indx <- intersect(with(df,str_trim(Name[str_trim(TypeOfApply)=='sync'])),
colnames(z1))
df1 <- df[str_trim(as.character(df$Name)) %in% indx,c(2,4)]
df1$DOB <- as.Date(str_trim(df1$DOB))
# Subtract a Date (not a character string) so the day difference is numeric
Map(calc, as.data.frame(z1[,indx]), as.data.frame(z2[,indx]),
as.numeric(df1$DOB - as.Date('2013-01-18')), df1$House)
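The selection step described above can also be sketched in base R, with trimws() in place of stringr::str_trim(). This is a minimal illustration, not the answer's own code: "zcols" is a hypothetical stand-in for colnames(z1), so the snippet runs without zoo.

```r
# Same sample data as in the question; read.csv keeps the space after each comma
l <- "Name, DOB, TypeOfApply, House
T1, 2008-12-16, sync,44
T2, 2008-12-15, sync,54
T3, 2008-12-19, async,34
T4, 2008-12-18, async,84
T5, 2008-12-11, sync,94"
df <- read.csv(text = l)

# Stand-in for colnames(z1) (assumption made so the sketch is self-contained)
zcols <- c("T1", "T2", "T3", "T6")

# Trim stray whitespace once, then keep names that are "sync" AND present in zcols
nm   <- trimws(as.character(df$Name))
type <- trimws(as.character(df$TypeOfApply))
indx <- intersect(nm[type == "sync"], zcols)
indx
# [1] "T1" "T2"
```

T5 is "sync" but drops out because it is not a column of z1, and T6 drops out because it never appears in df. Passing strip.white = TRUE to read.csv would be another way to avoid the trimming.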
Using the calc function from the OP's post:
z3 <- z1[,indx]
index <- as.Date('2013-01-18')
z3[] <- mapply(calc, as.data.frame(z1[,indx]),
as.data.frame(z2[,indx]), df1$DOB-index, df1$House)
z3
# T1 T2
#2013-01-18 479.9068 489.9041
#2013-01-20 176.9068 199.9041
#2013-01-21 189.9068 189.9041
#2013-01-22 179.9068 297.9041
#2013-01-23 169.9068 359.9041
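The result above subtracts one fixed reference date (2013-01-18) in every row, whereas the expected table in the question subtracts each row's own date. If the per-row behaviour is what is wanted, a sketch along the following lines might work. It rebuilds only the T1 and T2 columns, and "dob", "house" and "z4" are hypothetical names standing in for the filtered "df1" columns and the result.

```r
library(zoo)

calc <- function(x, y, z, v) x + y + (z / 365) + v   # sample formula from the question

dates <- as.Date(c("2013-01-18", "2013-01-20", "2013-01-21",
                   "2013-01-22", "2013-01-23"))
z1 <- zoo(cbind(T1 = c(400, 125, 125, 125, 120),
                T2 = c(400, 125, 125, 125, 120)), order.by = dates)
z2 <- zoo(cbind(T1 = c(40, 12, 25, 15, 10),
                T2 = c(40, 25, 15, 123, 190)), order.by = dates)

# Assumed to be aligned with the columns of z1 (T1 first, then T2)
dob   <- as.Date(c("2008-12-16", "2008-12-15"))
house <- c(44, 54)

# For each column, subtract every row's own index date from that column's DOB
res <- sapply(seq_len(ncol(z1)), function(j) {
  days <- as.numeric(dob[j] - index(z1))   # one day count per row
  calc(coredata(z1)[, j], coredata(z2)[, j], days, house[j])
})
colnames(res) <- colnames(z1)
z4 <- zoo(res, order.by = index(z1))
z4
```

Only the first row agrees with z3, because that is the one row whose date equals the fixed reference date.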
Suppose I change the order of the rows of "df":
set.seed(24)
df <- df[sample(1:nrow(df)),]
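Then, after rebuilding "indx" and "df1" as above, the elements of the "Map" list come out in the same order as "indx", for example: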
indx
#[1] "T2" "T1"
df1
# DOB House
#2 2008-12-15 54
#1 2008-12-16 44
Map(function(u,v,x,y) u, as.data.frame(z1[,indx]),
as.data.frame(z2[,indx]), df1$DOB, df1$House)
#$T2
#[1] 400 125 125 125 120
#$T1
#[1] 400 125 125 125 120
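If relying on row order feels fragile, the alignment can be made explicit by looking names up with match(). The following is a base-R sketch under assumptions: the data frame is typed in already trimmed and shuffled, and "zcols" is a hypothetical stand-in for colnames(z1). Unlike the answer, it takes the column order from z1 rather than from df.

```r
# Shuffled, already-trimmed version of the sample data (assumption for brevity)
df <- data.frame(
  Name        = c("T2", "T5", "T1", "T4", "T3"),
  DOB         = as.Date(c("2008-12-15", "2008-12-11", "2008-12-16",
                          "2008-12-18", "2008-12-19")),
  TypeOfApply = c("sync", "sync", "sync", "async", "async"),
  House       = c(54, 94, 44, 84, 34)
)
zcols <- c("T1", "T2", "T3", "T6")   # stand-in for colnames(z1)

# intersect() keeps the order of its first argument, so indx follows zcols
indx <- intersect(zcols, df$Name[df$TypeOfApply == "sync"])

# match() returns, for each name in indx, the row of df that holds it
df1 <- df[match(indx, df$Name), c("DOB", "House")]
rownames(df1) <- indx
df1
#           DOB House
# T1 2008-12-16    44
# T2 2008-12-15    54
```

With this lookup, df1 lines up with z1[, indx] and z2[, indx] no matter how the rows of df are ordered.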