Suppose I have a df containing an ID, gender, and several numeric variables. See below:
set.seed(123)
ID <- c(1,2,3,4,5,6,7,8,9,10)
gender <- c("m", "m", "m", "f", "f", "m", "m", "f", "f", "m")
x1 <- rnorm(10, 0, 1)
x2 <- rnorm(10, 0, 1)
x3 <- rnorm(10, 0, 1)
x4 <- rnorm(10, 0, 1)
x5 <- rnorm(10, 0, 1)
df <- data.frame(ID, gender, x1, x2, x3, x4, x5)
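The goal is to create two columns, MAX1 and MAX2, where
MAX1 is the name of the variable holding the largest value among (x1, x2, x3, x4, x5).
MAX2 is the name of the variable holding the second-largest value among (x1, x2, x3, x4, x5).
So I need to find MAX1 and MAX2 for each row of df.
Example: for ID = 1, MAX1 = "x2" and MAX2 = "x4".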
Answer 0 (score: 1)
Here is a simple solution:
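# For each row, sort the five numeric columns in decreasing order
# and keep the names of the two largest values.
maxes <- t(sapply(seq_len(nrow(df)), function(i) {
  # unlist() turns the one-row data frame into a named numeric vector
  names(sort(unlist(df[i, 3:7]), decreasing = TRUE))[1:2]
}))
colnames(maxes) <- c("MAX1", "MAX2")
df <- cbind(df, maxes)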
   ID gender          x1         x2         x3          x4          x5
1   1      m -0.56047565  1.2240818 -1.0678237  0.42646422 -0.69470698
2   2      m -0.23017749  0.3598138 -0.2179749 -0.29507148 -0.20791728
3   3      m  1.55870831  0.4007715 -1.0260044  0.89512566 -1.26539635
4   4      f  0.07050839  0.1106827 -0.7288912  0.87813349  2.16895597
5   5      f  0.12928774 -0.5558411 -0.6250393  0.82158108  1.20796200
6   6      m  1.71506499  1.7869131 -1.6866933  0.68864025 -1.12310858
7   7      m  0.46091621  0.4978505  0.8377870  0.55391765 -0.40288484
8   8      f -1.26506123 -1.9666172  0.1533731 -0.06191171 -0.46665535
9   9      f -0.68685285  0.7013559 -1.1381369 -0.30596266  0.77996512
10 10      m -0.44566197 -0.4727914  1.2538149 -0.38047100 -0.08336907
        MAX1        MAX2 MAX1 MAX2
1   1.224082   0.4264642   x2   x4
2  0.3598138  -0.2079173   x2   x5
3   1.558708   0.8951257   x1   x4
4   2.168956   0.8781335   x5   x4
5   1.207962   0.8215811   x5   x4
6   1.786913    1.715065   x2   x1
7   0.837787   0.5539177   x3   x4
8  0.1533731 -0.06191171   x3   x4
9  0.7799651   0.7013559   x5   x2
10  1.253815 -0.08336907   x3   x5
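The same row-wise ranking can also be sketched with apply() and order() instead of sapply() and sort(). This is only one possible variant, not the answer's own method: it selects the numeric columns by name rather than by position, and the names value_cols, top2, and result are illustrative assumptions that do not appear in the original post.

```r
# Rebuild the example data exactly as in the question.
set.seed(123)
ID <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
gender <- c("m", "m", "m", "f", "f", "m", "m", "f", "f", "m")
x1 <- rnorm(10, 0, 1)
x2 <- rnorm(10, 0, 1)
x3 <- rnorm(10, 0, 1)
x4 <- rnorm(10, 0, 1)
x5 <- rnorm(10, 0, 1)
df <- data.frame(ID, gender, x1, x2, x3, x4, x5)

# Hypothetical helper: the columns to rank, selected by name.
value_cols <- c("x1", "x2", "x3", "x4", "x5")

# apply() hands each row to the function as a numeric vector;
# order(..., decreasing = TRUE) gives the column positions from
# largest to smallest, and the first two index into value_cols.
top2 <- t(apply(df[value_cols], 1, function(row) {
  value_cols[order(row, decreasing = TRUE)[1:2]]
}))
colnames(top2) <- c("MAX1", "MAX2")

# Bind the two name columns to a copy of the data.
result <- cbind(df, top2)

result[1, c("MAX1", "MAX2")]  # ID 1: MAX1 = "x2", MAX2 = "x4"
```

If two columns tie within a row, order() keeps them in their original column order, so the leftmost tied column is reported first. Rows with missing values would need separate handling, which this sketch does not attempt.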