I have a table with dates in column 1 (a column titled "Date") and values in columns 2 through 5 (columns titled "A" to "D").
Date A B C D
1/1/16 12 75 38 88
1/2/16 32 76 44 34
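etc.
I need to create an additional 6th column, E, that contains the following:
For each row:
If the values satisfy A > B > C, then column E = X
If the values satisfy A < B < C, then column E = Y
In any other case, column E = Z
What would be the best way to do this?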
Answer 0 (score: 1)
# Here I'm simulating your original dataset
df <- data.frame(Date=seq(Sys.Date(),Sys.Date()+9,by=1), A = seq(1,20,2),
                 B = rep(10, 10), C = abs(rnorm(10)), D = rnorm(10))
# Create E
df$E <- NA
df$E[df$A > df$B & df$B > df$C] <- "X"
df$E[df$A < df$B & df$B < df$C] <- "Y"
df$E[is.na(df$E)] <- "Z"
df
Date A B C D E
1 2016-06-29 1 10 0.5833273005 -0.25244803522 Z
2 2016-06-30 3 10 0.4291374487 0.01669504752 Z
3 2016-07-01 5 10 1.7079045597 1.28413741595 Z
4 2016-07-02 7 10 0.2286708311 1.16421926818 Z
5 2016-07-03 9 10 0.6216853471 1.08934300378 Z
6 2016-07-04 11 10 1.4662821456 -0.58322427720 X
7 2016-07-05 13 10 0.8255102263 0.65217873906 X
8 2016-07-06 15 10 1.6185672627 0.04195996408 X
9 2016-07-07 17 10 0.6752993011 -2.31746231694 X
10 2016-07-08 19 10 0.2901133125 0.97969860678 X
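# Create E only for a subset of rows, like 6:10
# Note: `& 6:10` would not restrict the rows, because non-zero numbers are
# coerced to TRUE; build an explicit logical row mask instead
rows <- seq_len(nrow(df)) %in% 6:10
df$E <- NA
df$E[!rows] <- "nothing applied to this row"
df$E[rows & df$A > df$B & df$B > df$C] <- "X"
df$E[rows & df$A < df$B & df$B < df$C] <- "Y"
df$E[rows & is.na(df$E)] <- "Z"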
df
Date A B C D E
1 2016-06-29 1 10 0.5833273005 -0.25244803522 nothing applied to this row
2 2016-06-30 3 10 0.4291374487 0.01669504752 nothing applied to this row
3 2016-07-01 5 10 1.7079045597 1.28413741595 nothing applied to this row
4 2016-07-02 7 10 0.2286708311 1.16421926818 nothing applied to this row
5 2016-07-03 9 10 0.6216853471 1.08934300378 nothing applied to this row
6 2016-07-04 11 10 1.4662821456 -0.58322427720 X
7 2016-07-05 13 10 0.8255102263 0.65217873906 X
8 2016-07-06 15 10 1.6185672627 0.04195996408 X
9 2016-07-07 17 10 0.6752993011 -2.31746231694 X
10 2016-07-08 19 10 0.2901133125 0.97969860678 X
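To apply the rules to only some rows, the row restriction has to be a logical vector with one entry per row of the frame. The following is a minimal sketch on a small made-up frame; the name toy, the mask name rows, and all values are assumptions for illustration, not the asker's data.

```r
# Hypothetical data, chosen so that each rule fires once within rows 3 to 5
toy <- data.frame(A = c(3, 1, 3, 1, 2),
                  B = c(2, 2, 2, 2, 2),
                  C = c(1, 3, 1, 3, 2))

# Logical mask: TRUE only for rows 3 to 5
rows <- seq_len(nrow(toy)) %in% 3:5

toy$E <- NA_character_
toy$E[rows & toy$A > toy$B & toy$B > toy$C] <- "X"   # A > B > C
toy$E[rows & toy$A < toy$B & toy$B < toy$C] <- "Y"   # A < B < C
toy$E[rows & is.na(toy$E)] <- "Z"                    # everything else in the subset

toy$E   # rows 1 and 2 stay NA because the mask excludes them
```

Rows 1 and 2 would match the X and Y rules, but they are left untouched because the mask is FALSE for them.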
Answer 1 (score: 1)
I think this should work:
set.seed(1)
myframe = data.frame(date=1:10, a=sample(1:10), b=sample(1:10), c=sample(1:10), d=sample(1:10), e=NA)
myframe[myframe$a > myframe$b & myframe$b > myframe$c, "e"] = "x"
myframe[myframe$a < myframe$b & myframe$b < myframe$c, "e"] = "y"
myframe[is.na(myframe$e), "e"] = "z"
myframe
which gives
   date  a  b  c  d e
1     1  3  3 10  5 z
2     2  4  2  2  6 z
3     3  5  6  6  4 z
4     4  7 10  1  2 z
5     5  2  5  9 10 y
6     6  8  7  8  8 z
7     7  9  8  7  9 x
8     8  6  4  5  1 z
9     9 10  1  3  7 z
10   10  1  9  4  3 z
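If x <- 1:4 gives 1 2 3 4, then x <- 1:4 < 3 gives TRUE TRUE FALSE FALSE. Thus someFrame[x, "someCol"] selects someCol from the rows where x is TRUE, i.e. the first and second rows. The same applies to vectors, so c("a", "b", "c", "d")[x] returns a b. For what it's worth, I have heard this called "logical indexing".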
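The logical indexing described above can be tried out directly. This is a small sketch; someFrame and someCol are the placeholder names from the explanation, and the values in the frame are made up for illustration.

```r
x <- 1:4
keep <- x < 3                      # logical vector: TRUE TRUE FALSE FALSE

letters4 <- c("a", "b", "c", "d")
picked <- letters4[keep]           # keeps only the elements where keep is TRUE

# Hypothetical frame with made-up values
someFrame <- data.frame(someCol = c(10, 20, 30, 40))
someFrame[keep, "someCol"] <- 0    # assigns only in the rows where keep is TRUE

picked
someFrame
```

The same mechanism drives the answer: each comparison such as myframe$a > myframe$b yields one TRUE or FALSE per row, and the assignment touches only the TRUE rows.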