My data frame looks like this:
df <- tibble::tribble(~home, ~visitor, ~hcountry, ~vcountry,
"Milan", "Manchester", "ITA", "ENG",
"LIVERPOOL", "MILAN", "ENG", "ITA",
"Real Madrid", "Juventus", "SPA", "ITA")
#> # A tibble: 3 x 4
#> home visitor hcountry vcountry
#> <chr> <chr> <chr> <chr>
#> 1 Milan Manchester ITA ENG
#> 2 LIVERPOOL MILAN ENG ITA
#> 3 Real Madrid Juventus SPA ITA
I only want to get the Italian teams, i.e. Milan, Milan, Juventus. How can I do this without using a loop?
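Answer 0 (score: 1)
First, I recommend working through a basic R tutorial to get familiar with basic data manipulation in R, such as subsetting. See "R for Beginners" on CRAN.
In your case, you can do this: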
df[df$hcountry == "ITA" | df$vcountry == "ITA", ]
# home visitor hcountry vcountry
#1 Milan Manchester ITA ENG
#2 LIVERPOOL MILAN ENG ITA
#3 Real Madrid Juventus SPA ITA
Or:
subset(df, hcountry == "ITA" | vcountry == "ITA")
# Data used above, read into a plain data.frame
df <- read.table(text =
"home visitor hcountry vcountry
Milan Manchester ITA ENG
LIVERPOOL MILAN ENG ITA
'Real Madrid' Juventus SPA ITA", header = TRUE)
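Note that both calls above keep whole rows, and every row in this sample contains an Italian side, so all three rows come back. If the goal is just the team names, a minimal base R sketch (not part of the original answer) is to subset each team column by its matching country column and concatenate the results. The name `italian_teams` is only illustrative.

```r
# Same data as in the question, built with base R only
df <- data.frame(
  home     = c("Milan", "LIVERPOOL", "Real Madrid"),
  visitor  = c("Manchester", "MILAN", "Juventus"),
  hcountry = c("ITA", "ENG", "SPA"),
  vcountry = c("ENG", "ITA", "ITA"),
  stringsAsFactors = FALSE
)

# Home teams whose country is ITA, followed by visiting teams whose country is ITA
italian_teams <- c(df$home[df$hcountry == "ITA"],
                   df$visitor[df$vcountry == "ITA"])
italian_teams
# "Milan" "MILAN" "Juventus"
```

The result lists the home teams first and the visitors second rather than following row order, and it keeps the original capitalisation ("MILAN").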
Answer 1 (score: 0)
Alternatively, you can stack the home and visitor teams together with their countries, then keep the unique values:
library(dplyr)
library(tidyr)
df %>%
  gather(key1, country, -c(home, visitor)) %>%
  gather(key2, team, -c(key1, country)) %>%
  mutate_at(vars(key1, key2), substr, start = 1, stop = 1) %>%
  filter(key1 == key2) %>%
  select(-key1, -key2) %>%
  mutate(team = tools::toTitleCase(tolower(team))) %>%
  filter(country == "ITA") %>%
  distinct()
#> # A tibble: 2 x 2
#> country team
#> <chr> <chr>
#> 1 ITA Milan
#> 2 ITA Juventus
If you want Milan to appear twice, remove the final distinct().
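In current tidyr, gather() is marked as superseded. As a possible alternative (a sketch, not the original answerer's code), the same stacking can be done without reshaping by selecting each team/country pair under common names and binding the two pieces. The column names `team` and `country` and the object `teams` are chosen here for illustration.

```r
library(dplyr)

df <- tibble::tribble(
  ~home,         ~visitor,     ~hcountry, ~vcountry,
  "Milan",       "Manchester", "ITA",     "ENG",
  "LIVERPOOL",   "MILAN",      "ENG",     "ITA",
  "Real Madrid", "Juventus",   "SPA",     "ITA"
)

teams <- bind_rows(
  select(df, team = home,    country = hcountry),  # home sides with their country
  select(df, team = visitor, country = vcountry)   # visiting sides with their country
) %>%
  mutate(team = tools::toTitleCase(tolower(team))) %>%  # normalise "MILAN" to "Milan"
  filter(country == "ITA")

teams
```

This keeps Milan twice; append `distinct()` to the pipeline to drop the duplicate.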
Answer 2 (score: 0)
We can use filter from dplyr:
library(dplyr)
df %>%
filter(hcountry == "ITA" | vcountry == "ITA")