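I have a data frame (df) with the columns Date, Winner, Loser, WRank and LRank. WRank is the rank of the player in the Winner column, and LRank is the rank of the player in the Loser column. I want a new data frame with Date, Name and Rank. The problem is that the name I am interested in can appear in either the Winner or the Loser column. If the name is in the Winner column I want WRank, but if it is in the Loser column I want LRank. How can I do this?
df looks like this: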
Date Winner Loser WRank LRank
1 2000-01-03 Federer R. Knippschild J. 65 87
2 2000-01-03 Enqvist T. Federer R. 5 65
3 2000-01-10 Ferrero J.C. Federer R. 45 61
4 2000-01-17 Federer R. Chang M. 62 38
5 2000-01-17 Federer R. Kroslak J. 62 104
6 2000-01-17 Clement A. Federer R. 54 62
The format I want is:
Date Name Rank
1 2000-01-03 Federer R. 65
2 2000-01-03 Federer R. 65
3 2000-01-10 Federer R. 61
4 2000-01-17 Federer R. 62
5 2000-01-17 Federer R. 62
6 2000-01-17 Federer R. 62
Answer 0 (score: 1)
We can use functions from the tidyverse packages:
library(tidyverse)
dat %>%
# create single winner and loser columns,
# concatenating name and rank together
unite(Winner, Winner, WRank, sep = "-") %>%
unite(Loser, Loser, LRank, sep = "-") %>%
# pivot to be "tall"
pivot_longer(cols = c("Winner", "Loser")) %>%
select(-name) %>%
# reverse the concatenation
separate(value, into = c("Name", "Rank"), sep = "-")
# Date Name Rank
# 1 2000-01-03 Federer R. 65
# 2 2000-01-03 Knippschild J. 87
# 3 2000-01-03 Enqvist T. 5
# 4 2000-01-03 Federer R. 65
# 5 2000-01-10 Ferrero J.C. 45
# 6 2000-01-10 Federer R. 61
# 7 2000-01-17 Federer R. 62
# 8 2000-01-17 Chang M. 38
# 9 2000-01-17 Federer R. 62
#10 2000-01-17 Kroslak J. 104
#11 2000-01-17 Clement A. 54
#12 2000-01-17 Federer R. 62
One thing to note is that this converts Rank to a character value. You can convert it back to a number with the as.numeric function.
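As a rough sketch of that last point, the conversion can also happen inside the pipeline: separate() accepts convert = TRUE, which should turn Rank back into an integer column. The filter() step at the end is an addition for illustration (the answer's pipeline returns every player), and the two-row dat below is a made-up subset of the question's data.

```r
library(dplyr)
library(tidyr)

# Small illustrative subset of the question's data (assumed here, not the full df)
dat <- data.frame(
  Date   = c("2000-01-03", "2000-01-03"),
  Winner = c("Federer R.", "Enqvist T."),
  Loser  = c("Knippschild J.", "Federer R."),
  WRank  = c(65L, 5L),
  LRank  = c(87L, 65L)
)

res <- dat %>%
  unite(Winner, Winner, WRank, sep = "-") %>%
  unite(Loser, Loser, LRank, sep = "-") %>%
  pivot_longer(cols = c("Winner", "Loser")) %>%
  select(-name) %>%
  # convert = TRUE lets separate() re-guess column types, so Rank is not left as character
  separate(value, into = c("Name", "Rank"), sep = "-", convert = TRUE) %>%
  # added for illustration: keep only the player of interest
  filter(Name == "Federer R.")

res
```

Note that sep = "-" assumes no player name contains a hyphen; a separator that cannot occur in a name would be safer on real data.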
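Answer 1 (score: 0)
A function that extracts the rows and values for a given player name may help. We filter the rows where the player name is in the 'Winner' or (|) the 'Loser' column, then use transmute to keep 'Date', create 'Name' from the input player name, and create 'Rank' as follows: compare the subset of columns 'Winner' and 'Loser' with the player name to get a logical matrix, feed it to max.col to get the column index of the maximum value in each row (TRUE => 1, FALSE => 0), cbind that with the row index (row_number), and use the result to extract the corresponding elements from the subset of the dataset holding the 'WRank' and 'LRank' columns.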
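library(dplyr)

f1 <- function(dat, nm) {
  dat %>%
    # keep rows where the player is either the winner or the loser
    filter(Winner == nm | Loser == nm) %>%
    transmute(Date, Name = nm,
              # pick WRank or LRank per row via (row, column) matrix indexing
              Rank = .[c('WRank', 'LRank')][cbind(row_number(),
                         max.col(.[c('Winner', 'Loser')] == nm))])
}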
Testing:
f1(df1, 'Federer R.')
# Date Name Rank
#1 2000-01-03 Federer R. 65
#2 2000-01-03 Federer R. 65
#3 2000-01-10 Federer R. 61
#4 2000-01-17 Federer R. 62
#5 2000-01-17 Federer R. 62
#6 2000-01-17 Federer R. 62
Data:
df1 <- structure(list(Date = c("2000-01-03", "2000-01-03", "2000-01-10",
"2000-01-17", "2000-01-17", "2000-01-17"), Winner = c("Federer R.",
"Enqvist T.", "Ferrero J.C.", "Federer R.", "Federer R.", "Clement A."
), Loser = c("Knippschild J.", "Federer R.", "Federer R.", "Chang M.",
"Kroslak J.", "Federer R."), WRank = c(65L, 5L, 45L, 62L, 62L,
54L), LRank = c(87L, 65L, 61L, 38L, 104L, 62L)),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
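The matrix-indexing step inside f1 can be easier to follow when unrolled in base R. The following is a minimal sketch on a three-row subset of the data; the names d, hit, col_idx and idx are made up for illustration, and ties.method = "first" is set explicitly only to make the result deterministic.

```r
# Three illustrative rows, all involving the player of interest
d <- data.frame(
  Winner = c("Federer R.", "Enqvist T.", "Ferrero J.C."),
  Loser  = c("Knippschild J.", "Federer R.", "Federer R."),
  WRank  = c(65L, 5L, 45L),
  LRank  = c(87L, 65L, 61L)
)
nm <- "Federer R."

# Logical matrix: TRUE where the player name matches
hit <- d[c("Winner", "Loser")] == nm

# For each row, the column (1 = Winner, 2 = Loser) that holds TRUE
col_idx <- max.col(hit, ties.method = "first")

# Two-column (row, column) index matrix
idx <- cbind(seq_len(nrow(d)), col_idx)

# Pull the matching rank out of the WRank/LRank columns
ranks <- d[c("WRank", "LRank")][idx]
ranks
```

This relies on every row containing the player exactly once, which the filter() step in f1 guarantees for this data.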
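Answer 2 (score: 0)
You can bring the winner and the loser into a single column, with their ranks in another column, and then select the player name.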
library(dplyr)
player_name <- 'Federer R.'
df %>%
rename_with(~paste0(., '_Name'), c(Winner, Loser)) %>%
rename_with(~paste0(., '_Rank'), ends_with('Rank')) %>%
tidyr::pivot_longer(cols = -Date,
names_pattern = '.*_(\\w+)',
names_to = '.value') %>%
filter(Name == player_name)
# Date Name Rank
# <chr> <chr> <int>
#1 2000-01-03 Federer R. 65
#2 2000-01-03 Federer R. 65
#3 2000-01-10 Federer R. 61
#4 2000-01-17 Federer R. 62
#5 2000-01-17 Federer R. 62
#6 2000-01-17 Federer R. 62
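To see why names_pattern and '.value' line up the names with the ranks, it may help to inspect the intermediate column names. This is only a sketch on a made-up one-row data frame (d and renamed are illustrative names); the sub() call merely imitates what the pattern captures and is not part of the answer's pipeline.

```r
library(dplyr)

# One illustrative row with the question's column layout
d <- data.frame(
  Date = "2000-01-03", Winner = "Federer R.", Loser = "Knippschild J.",
  WRank = 65L, LRank = 87L
)

renamed <- d %>%
  rename_with(~paste0(., "_Name"), c(Winner, Loser)) %>%
  rename_with(~paste0(., "_Rank"), ends_with("Rank"))

# Column names after both renames
names(renamed)

# The part captured by '.*_(\\w+)' for every column except Date
sub(".*_(\\w+)", "\\1", names(renamed)[-1])
```

The captured suffixes (Name or Rank) become the output column names, so the two *_Name columns are stacked into Name and the two *_Rank columns into Rank.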