I have a csv file in the following format:
Player Sports Win Loss
Brian Football 5 3
Brian Basketball 4 1
Brian Bowling 7 0
Chris Football 3 3
Chris Basketball 3 4
. . . .
. . . .
I want to change the format to the following:
Name&Sports Win Loss Total
Brian 16 4 20
Football 5 3 8
Basketball 4 1 5
Bowling 7 0 7
Chris 6 7 13
Football 3 3 6
Basketball 3 4 7
. . . .
. . . .
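Basically, in the new format we first write the player's name along with the total wins, losses, and games played across all the sports that person played. On the following lines we list each sport that person played, with the wins, losses, and total games for that particular sport. Once everything has been written for that person, we move on to the next person and do the same.
Is there a simple way to do this in R?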
Answer 0 (score: 3)
df <- read.table(text = "Player Sports Win Loss
Brian Football 5 3
Brian Basketball 4 1
Brian Bowling 7 0
Chris Football 3 3
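Chris Basketball 3 4", header = TRUE, stringsAsFactors = FALSE)
# Sum wins and losses per player
tmp <- aggregate(df$Win, by = list(df$Player), sum)
tmp <- cbind(tmp, aggregate(df$Loss, by = list(df$Player), sum)[2])
# Reuse the Sports/Win/Loss column names so the totals can be row-bound
names(tmp) <- colnames(df)[2:4]
df <- rbind(df[, 2:ncol(df)], tmp)
df$Total <- df$Loss + df$Win
df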
      Sports Win Loss Total
1   Football   5    3     8
2 Basketball   4    1     5
3    Bowling   7    0     7
4   Football   3    3     6
5 Basketball   3    4     7
6      Brian  16    4    20
7      Chris   6    7    13
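Or, if matching the row order in the example matters, use this in place of the rbind step above (starting from the original df):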
df <- rbind(tmp[1,], df[1:3,2:ncol(df)],
tmp[2,], df[4:nrow(df),2:ncol(df)]) # could easily be made more programmatic
df$Total <- df$Loss + df$Win
df
       Sports Win Loss Total
1       Brian  16    4    20
2    Football   5    3     8
3  Basketball   4    1     5
4     Bowling   7    0     7
21      Chris   6    7    13
41   Football   3    3     6
5  Basketball   3    4     7
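The code comment above notes that the interleaving could be made more programmatic. One way to sketch that in base R is below: split the data by player, build a summary row for each piece, and stack the pieces. This is only an illustration, not the answerer's code; the names `blocks`, `summary_row`, and `result` are made up for the example, and the players are kept in the order they first appear in the data.

```r
df <- read.table(text = "Player Sports Win Loss
Brian Football 5 3
Brian Basketball 4 1
Brian Bowling 7 0
Chris Football 3 3
Chris Basketball 3 4", header = TRUE, stringsAsFactors = FALSE)

# Keep players in order of first appearance instead of alphabetical order
player_groups <- split(df, factor(df$Player, levels = unique(df$Player)))

# For each player: one summary row followed by that player's sport rows
blocks <- lapply(player_groups, function(d) {
  summary_row <- data.frame(Sports = d$Player[1],
                            Win = sum(d$Win),
                            Loss = sum(d$Loss),
                            stringsAsFactors = FALSE)
  rbind(summary_row, d[, c("Sports", "Win", "Loss")])
})

# Stack the per-player blocks and tidy up
result <- do.call(rbind, blocks)
rownames(result) <- NULL
result$Total <- result$Win + result$Loss
names(result)[1] <- "Name&Sports"
result
```

Because the rows are assembled per player, this version does not rely on hard-coded row indices such as 1:3, so it should carry over to files with any number of players and sports.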
Answer 1 (score: 2)
A solution using tidyverse. dt_final is the final output.
# Create example data frame
dt <- read.table(text = "Player Sports Win Loss
Brian Football 5 3
Brian Basketball 4 1
Brian Bowling 7 0
Chris Football 3 3
Chris Basketball 3 4",
header = TRUE, stringsAsFactors = FALSE)
# Load package
library(tidyverse)
# Split data frame by players
dt_list <- split(dt, f = dt$Player)
# Define a function to process the data for one player
sum_fun <- function(dt){
  playername <- unique(dt$Player)
  # Per-sport rows with a Total column
  dt1 <- dt %>%
    mutate(Total = Win + Loss) %>%
    select(-Player)
  # One summary row for the player (tibble() replaces the deprecated data_frame())
  dt2 <- tibble(Sports = playername,
                Win = sum(dt1$Win),
                Loss = sum(dt1$Loss),
                Total = sum(dt1$Total))
  dt3 <- bind_rows(dt2, dt1)
  return(dt3)
}
# Apply the function to each player and row-bind the results
dt_final <- dt_list %>%
  map_df(sum_fun) %>%
  rename(`Name&Sports` = Sports)
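Since the question starts from a csv file rather than inline text, here is a hedged sketch of the read and write steps in base R. It assumes the file is comma-separated with a header row; the temporary paths and the small stand-in data frame `out` are hypothetical and exist only so the example runs on its own. Note that a column name like `Name&Sports` is not a syntactic R name, so `check.names = FALSE` is needed to keep it intact.

```r
# Hypothetical file paths; tempfile() is used so the sketch runs anywhere
in_path  <- tempfile(fileext = ".csv")
out_path <- tempfile(fileext = ".csv")

# Stand-in input file (assumed to be comma-separated)
writeLines(c("Player,Sports,Win,Loss",
             "Brian,Football,5,3",
             "Brian,Basketball,4,1"), in_path)

dt <- read.csv(in_path, stringsAsFactors = FALSE)

# Stand-in for the reshaped result: one summary row plus the sport rows
out <- data.frame(`Name&Sports` = c(dt$Player[1], dt$Sports),
                  Win  = c(sum(dt$Win),  dt$Win),
                  Loss = c(sum(dt$Loss), dt$Loss),
                  check.names = FALSE, stringsAsFactors = FALSE)
out$Total <- out$Win + out$Loss

# row.names = FALSE keeps R's row numbers out of the file
write.csv(out, out_path, row.names = FALSE)

# Reading back with check.names = FALSE preserves the "Name&Sports" header
back <- read.csv(out_path, check.names = FALSE, stringsAsFactors = FALSE)
```

In practice the reshaped data frame from either answer would take the place of `out` in the `write.csv()` call.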