涉及数据框中相互依赖列的计算

时间:2019-03-23 07:51:22

标签: r dataframe

Data <- data.frame(
  A = rnorm(1:10),
  B = rnorm(1:10),
  C = rnorm(1:10),
  D = rnorm(1:10),
  X = sample(c(1,0), size=10, replace = TRUE)
)

在“数据”数据框中,我需要在Y和Z两列中创建

首先观察到的Y等于一个常量,比如说10

在这种情况下,从第2行到第10行的Y的第二次和后续观察将等于= 10 + Z [先前的观察]

示例

if Z[1] = 10 then Y[2] = 10 + Z[1] = 20

if Z[2] = 20 then Y[3] = 10 + Z[2] = 30

if Z[3] = 50 then Y[4] = 10 + Z[3] = 60

依次类推,直到Data数据帧的长度为止。


现在就Z而言,其计算如下:

First observation of Z is IF(X1>0,0,Y1-SUM(A1+B1+C1+D1))
Second observation of Z is IF(X2>0,0,Y2-SUM(A2+B2+C2+D2))
Third observation of Z is IF(X3>0,0,Y3-SUM(A3+B3+C3+D3))

即检查同一行的X观察值是否> 0,然后该行的Z为0,否则为Y1-同一行的sum(A:D)

依此类推,直到数据帧的长度。

1 个答案:

答案 0 :(得分:3)

for循环中,上一行的每一行都存在相互依赖关系

Data$Y <- 10
Data$Z <- 0

for (i in seq_len(nrow(Data))) {
   if(i > 1)
    Data$Y[i] <- Data$Y[1] + Data$Z[i-1]

   if(Data$X[i] > 0) 
      Data$Z[i] <- 0 
   else
      Data$Z[i] <- with(Data, Y[i] - (A[i]+B[i]+C[i]+D[i]))
}

Data
#            A           B          C          D X        Y         Z
#1   0.6607697  0.05785971  0.2053518 -0.3366816 0 10.00000  9.412700
#2  -2.0529830  1.10397550  1.0162833 -0.6592560 0 19.41270 20.004681
#3  -1.4992061 -0.02561697  0.4089990 -0.1007670 1 30.00468  0.000000
#4   1.4712331  0.51484639 -0.7052355  0.7039078 0 10.00000  8.015248
#5   1.4591385  0.99005668  0.2356143  0.4380549 0 18.01525 14.892384
#6   0.1401390  0.30345432  0.3451300  1.4971479 0 24.89238 22.606512
#7   0.2091844 -0.93007223 -1.0197996  0.2162444 1 32.60651  0.000000
#8  -3.0360898  0.08403068 -1.4118223 -1.4953378 0 10.00000 15.859219
#9  -0.4869341  0.52677963 -1.3638423  0.1249940 1 25.85922  0.000000
#10 -1.0878673  0.01586862  0.8731049 -1.0380945 0 10.00000 11.236988

数据

set.seed(234)
Data <- data.frame(
   A = rnorm(1:10),
   B = rnorm(1:10),
   C = rnorm(1:10),
   D = rnorm(1:10),
   X = sample(c(1,0), size=10, replace = TRUE)
)