Question

I'm trying to compute a 3-game rolling average across the columns of some NFL data. Here are the data and the resulting data frame:

Data:

Player <- c("Player1", "Player2", "Player3", "Player4", "Player5")
Week1 <- c(10, 5, 6, 8, 7)
Week2 <- c(12, 9, 4, 2, 8)
Week3 <- c(4, 5, 4, 3, 12)
Week4 <- c(15, 7, 12, NA, 5)
Week5 <- c(NA, 5, 8, 11, 6)
q <- data.frame(Player, Week1, Week2, Week3, Week4, Week5)

Data frame:

   Player Week1 Week2 Week3 Week4 Week5
1 Player1    10    12     4    15    NA
2 Player2     5     9     5     7     5
3 Player3     6     4     4    12     8
4 Player4     8     2     3    NA    11
5 Player5     7     8    12     5     6

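What I want to do is take a 3-game rolling average across the weeks, starting with Week1. So for each player it would average Week1, Week2, Week3 and give me the value in a new column, then average Week2, Week3, Week4 and give me that value in another new column, and so on.

In this case, the new data frame should look like this: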

   Player Week1 Week2 Week3 Week4 Week5    Avg1    Avg2    Avg3
1 Player1    10    12     4    15    NA     8.7    10.3     NA
2 Player2     5     9     5     7     5     6.3     7.0     5.7
3 Player3     6     4     4    12     8     4.7     6.7     8.0
4 Player4     8     2     3    NA    11     4.3     4.3     5.3
5 Player5     7     8    12     5     6     9.0     8.3     7.7

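Note that for Player4 there is an NA in Week4 that is ignored. This would be a week the player did not play for some reason, so for Avg3 I use the two games before it and the game after it.

I need these new columns because I will run a regression to see whether the 3-game average predicts the next value. Everything I could find on rolling averages works on a single column, and I'm very inexperienced, so any help with formatting the data for a problem like this is appreciated. Thanks in advance for your help!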

Answer 1

We can use rollmean from the zoo package:

library(zoo)
t(apply(q[-1], 1, function(x) rollmean(x, 3)))

#       Week2     Week3    Week4
#[1,] 8.666667 10.333333       NA
#[2,] 6.333333  7.000000 5.666667
#[3,] 4.666667  6.666667 8.000000
#[4,] 4.333333        NA       NA
#[5,] 9.000000  8.333333 7.666667

Finally, to get the combined data frame, use cbind:

cbind(q, t(apply(q[-1], 1, function(x) rollmean(x, 3))))

#   Player Week1 Week2 Week3 Week4 Week5    Week2     Week3    Week4
#1 Player1    10    12     4    15    NA 8.666667 10.333333       NA
#2 Player2     5     9     5     7     5 6.333333  7.000000 5.666667
#3 Player3     6     4     4    12     8 4.666667  6.666667 8.000000
#4 Player4     8     2     3    NA    11 4.333333        NA       NA
#5 Player5     7     8    12     5     6 9.000000  8.333333 7.666667

If you are particular about the column names, you can change them first:

temp <- t(apply(q[-1], 1, function(x) rollmean(x, 3)))
colnames(temp) <- c("avg1", "avg2", "avg3")

and then use cbind on temp, i.e. cbind(q, temp).

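Edit

To answer some of the OP's questions:

If you have more than one column at the start that you want to drop, just select or deselect the columns by index number.

For example,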

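To deselect the first two columns you can use q[-c(1:2)], which is equivalent to q[3:6] for the six-column q above; either way you pass a range of column indices to deselect or select.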
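As a quick check of the column selection described here, the sketch below rebuilds q from the question and compares negative and positive indexing. The names dropped and selected are illustrative only.

```r
# Data from the question
Player <- c("Player1", "Player2", "Player3", "Player4", "Player5")
Week1 <- c(10, 5, 6, 8, 7)
Week2 <- c(12, 9, 4, 2, 8)
Week3 <- c(4, 5, 4, 3, 12)
Week4 <- c(15, 7, 12, NA, 5)
Week5 <- c(NA, 5, 8, 11, 6)
q <- data.frame(Player, Week1, Week2, Week3, Week4, Week5)

# Negative indices drop columns; positive indices keep them
dropped  <- q[-c(1:2)]      # everything except columns 1 and 2
selected <- q[3:ncol(q)]    # columns 3 through the last one

# Both expressions should describe the same data frame
print(identical(dropped, selected))
print(names(dropped))
```

Using ncol(q) instead of a hard-coded upper bound keeps the selection valid if more week columns are added later.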

function(x) is called an anonymous function; you can use it to apply your own function to every row of the data frame.
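To illustrate what an anonymous function does inside apply, the sketch below computes a per-row mean twice, once with a named helper and once with an inline function(x). The helper name row_mean is hypothetical, and a plain mean is used here only to keep the example free of package dependencies.

```r
# Data from the question
Player <- c("Player1", "Player2", "Player3", "Player4", "Player5")
Week1 <- c(10, 5, 6, 8, 7)
Week2 <- c(12, 9, 4, 2, 8)
Week3 <- c(4, 5, 4, 3, 12)
Week4 <- c(15, 7, 12, NA, 5)
Week5 <- c(NA, 5, 8, 11, 6)
q <- data.frame(Player, Week1, Week2, Week3, Week4, Week5)

# A named helper (hypothetical name): mean of one row, skipping NAs
row_mean <- function(x) mean(x, na.rm = TRUE)

# MARGIN = 1 means "apply over rows"; q[-1] drops the Player column
named <- apply(q[-1], 1, row_mean)

# The same computation with the function written inline (anonymous)
anon <- apply(q[-1], 1, function(x) mean(x, na.rm = TRUE))

print(identical(named, anon))
```

The anonymous form is convenient when the function is used only once, as in the rollmean call in this answer.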

rollmean cannot handle NA values. From the rollmean documentation:

The default method of rollmean does not handle inputs that contain NAs. In such cases, use rollapply instead.
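A minimal sketch of the rollapply route, assuming it is acceptable to average only the non-missing values in each 3-week window by passing na.rm = TRUE through to mean. That is an assumption rather than something stated in the question: it does not reproduce the Avg2 and Avg3 values the OP listed for Player4, which reach outside the window for the nearest played games. The column names avg1 to avg3 follow the answer; avg and result are illustrative names.

```r
library(zoo)

# Data from the question
Player <- c("Player1", "Player2", "Player3", "Player4", "Player5")
Week1 <- c(10, 5, 6, 8, 7)
Week2 <- c(12, 9, 4, 2, 8)
Week3 <- c(4, 5, 4, 3, 12)
Week4 <- c(15, 7, 12, NA, 5)
Week5 <- c(NA, 5, 8, 11, 6)
q <- data.frame(Player, Week1, Week2, Week3, Week4, Week5)

# rollapply forwards na.rm = TRUE to mean(), so each window of width 3
# is averaged over whatever values are present in that window
avg <- t(apply(q[-1], 1, function(x) rollapply(x, 3, mean, na.rm = TRUE)))
colnames(avg) <- c("avg1", "avg2", "avg3")

# Attach the rolling averages to the original data frame
result <- cbind(q, avg)
print(result)
```

With this choice, Player4's three windows are (8, 2, 3), (2, 3) and (3, 11), so the row averages are about 4.33, 2.5 and 7. If the regression needs "the last three games actually played", the NA weeks would have to be removed per player before rolling.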
