Question

I have a data frame D with roughly 100 rows, each set up to represent a different lottery, like this:

     pA0   pA1   pA2 A0 A1 A2 
1  0.625 0.000 0.375  1 20 41
2  0.375 0.625 0.000  1 20 41
3  0.000 1.000 0.000  1 20 41
4  0.125 0.750 0.125  1 20 41
5  0.500 0.375 0.125  1 20 41
6  0.250 0.750 0.000  1 20 41
7  0.250 0.625 0.125  1 20 41
8  0.250 0.250 0.500  1 20 41
9  0.125 0.375 0.500  1 20 41
10 0.125 0.250 0.625  1 20 41
...

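The p variables give the probability that the outcome with the same suffix occurs in the lottery. So, for lottery 1, there is a 62.5% chance (pA0) that lottery A results in an outcome of 1 (A0), a 0% chance (pA1) that it results in an outcome of 20 (A1), and a 37.5% chance (pA2) that it results in an outcome of 41 (A2). The same applies to all the other lotteries.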

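What I want to do is create a new data frame, say E, that takes the lotteries from D but reorders each one so that suffix 2 represents the highest outcome with positive probability, suffix 1 the second-highest outcome with positive probability, and suffix 0 the lowest outcome with positive probability. For example, row 1 would now be: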

     pA0   pA1    pA2 A0 A1 A2
1  0.000 0.625  0.375 20  1 41

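If a lottery has one outcome with probability 0, that outcome needs to be ranked last (pA0, A0). If it has several outcomes with probability 0, it doesn't matter which of them is ranked above the other, as long as the outcome with positive probability has rank 2.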

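I'm pretty sure I could do this with a pile of nested if or ifelse statements, but I'd really like to find a solution that doesn't need them. Bonus points if it generalizes to n outcomes per lottery.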

Answer 1

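We use grepl to create a logical index of the column names that start with 'p'. Looping over the rows, we multiply the p columns by the non-p columns, take the order of those products, and use it to rearrange the values in each row.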

E <- D
i1 <- grepl('^p', names(D))
E[] <- t(apply(D, 1, function(x) {i2 <- order(x[i1]*x[!i1])
                                  c(x[i1][i2], x[!i1][i2])}))
head(E,2)
#  pA0   pA1   pA2 A0 A1 A2
#1   0 0.625 0.375 20  1 41
#2   0 0.375 0.625 41  1 20
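
To see what the sort key does, here is a minimal sketch that walks through two rows of D by hand. The names o, p1, key1, ord1, p5, key5 and ord5 are introduced only for this illustration and are not part of the answer. Note that the key is probability times outcome, so in a row where every probability is positive the resulting order can differ from the order of the outcomes themselves, as row 5 shows.

```r
o <- c(A0 = 1, A1 = 20, A2 = 41)            # outcomes, identical in every row of D

# Row 1: the zero-probability outcome gets key 0 and sorts first
p1   <- c(pA0 = 0.625, pA1 = 0, pA2 = 0.375)
key1 <- p1 * o                              # 0.625, 0, 15.375
ord1 <- order(key1)                         # 2 1 3
o[ord1]                                     # 20, 1, 41 (matches row 1 of E)

# Row 5: all probabilities positive, yet 41 lands before 20
p5   <- c(pA0 = 0.5, pA1 = 0.375, pA2 = 0.125)
key5 <- p5 * o                              # 0.5, 7.5, 5.125
ord5 <- order(key5)                         # 1 3 2
o[ord5]                                     # 1, 41, 20
```

If the outcomes must be ranked strictly by value among those with positive probability, the key needs to be built differently.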

Data

D <- structure(list(pA0 = c(0.625, 0.375, 0, 0.125, 0.5, 0.25, 0.25, 
0.25, 0.125, 0.125), pA1 = c(0, 0.625, 1, 0.75, 0.375, 0.75, 
0.625, 0.25, 0.375, 0.25), pA2 = c(0.375, 0, 0, 0.125, 0.125, 
0, 0.125, 0.5, 0.5, 0.625), A0 = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), A1 = c(20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 20L, 
20L), A2 = c(41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L, 41L
)), .Names = c("pA0", "pA1", "pA2", "A0", "A1", "A2"), 
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10"))

Answer 2

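This borrows @akrun's idea of using the apply function, but orders the outcomes with nonzero probability by their value rather than by their expected value.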

E <-  D

# The number of columns divided by 2 is the number of outcomes
n <- ncol(E) / 2

E[] <- t(apply(E, 1, function(x) {

            # x is the row , first n elements are probs, the second n
            # elements are the corresponding outcomes

            uo <- c()   # vector for unordered outcomes
            up <- c()   # vector for unordered probabilities
            oo <- c()   # vector for ordered outcomes
            op <- c()   # vector for ordered probabilities

            for (i in 1:n){             # Loop through probabilities
                if( x[i] != 0){         # if probability isn't 0, it needs to be ordered
                    op <- c(op, x[i])   # add the probability to the vector
                    oo <- c(oo, x[i+n]) # add the outcome to the vector
                }
                else{                   # if the probability is 0, it isn't ordered
                    up <- c(up, x[i] )  
                    uo <- c(uo, x[i+n] )
                }
            }

            r <- order(oo)  # Order the elements of the outcomes vector that need to be ordered

            p <- c(up, op[r]) # probabilities: zeros first (lowest suffixes), then ordered ones
            o <- c(uo, oo[r]) # outcomes: zero-probability ones first, then ordered by value

            c(p,o)

        }))

Data:

head(D,10)
     pA0   pA1   pA2 A0 A1 A2
1  0.625 0.000 0.375  1 20 41
2  0.375 0.625 0.000  1 20 41
3  0.000 1.000 0.000  1 20 41
4  0.125 0.750 0.125  1 20 41
5  0.500 0.375 0.125  1 20 41
6  0.250 0.750 0.000  1 20 41
7  0.250 0.625 0.125  1 20 41
8  0.250 0.250 0.500  1 20 41
9  0.125 0.375 0.500  1 20 41
10 0.125 0.250 0.625  1 20 41

head(E,10)
     pA0   pA1   pA2 A0 A1 A2
1  0.000 0.625 0.375 20  1 41
2  0.000 0.375 0.625 41  1 20
3  0.000 0.000 1.000  1 41 20
4  0.125 0.750 0.125  1 20 41
5  0.500 0.375 0.125  1 20 41
6  0.000 0.250 0.750 41  1 20
7  0.250 0.625 0.125  1 20 41
8  0.250 0.250 0.500  1 20 41
9  0.125 0.375 0.500  1 20 41
10 0.125 0.250 0.625  1 20 41
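
The same ordering can be sketched more compactly by handing order() two keys: first whether the probability is positive, then the outcome value. This is only an illustration under stated assumptions: reorder_lotteries, D_small and E_small are hypothetical names, the probability columns are assumed to be exactly those whose names start with "p", and they are assumed to appear in the same order as their outcome columns. Because nothing depends on the number of columns, the sketch should carry over to n outcomes per lottery.

```r
# Hypothetical helper, not taken from the answers above
reorder_lotteries <- function(D) {
  is_p <- grepl("^p", names(D))       # TRUE for probability columns
  E <- D
  E[] <- t(apply(D, 1, function(x) {
    p <- x[is_p]                      # probabilities of this lottery
    o <- x[!is_p]                     # matching outcomes
    idx <- order(p > 0, o)            # zero-probability first, then by outcome value
    c(p[idx], o[idx])
  }))
  E
}

# First three rows of the example data
D_small <- data.frame(pA0 = c(0.625, 0.375, 0),
                      pA1 = c(0, 0.625, 1),
                      pA2 = c(0.375, 0, 0),
                      A0 = 1, A1 = 20, A2 = 41)
E_small <- reorder_lotteries(D_small)
E_small
```

Among several zero-probability outcomes, ties are broken by outcome value, which the question says is acceptable.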

How do I rearrange the elements within the rows of a data frame in R based on a condition?
