R: combining columns of a data frame based on other columns

Time: 2018-05-02 21:58:17

Tags: r dataframe

My data frame looks like this:

df <- tibble::tribble(~home, ~visitor, ~hcountry, ~vcountry,
"Milan", "Manchester", "ITA", "ENG",
"LIVERPOOL", "MILAN", "ENG", "ITA",
"Real Madrid", "Juventus", "SPA", "ITA")

#> # A tibble: 3 x 4
#>   home        visitor    hcountry vcountry
#>   <chr>       <chr>      <chr>    <chr>   
#> 1 Milan       Manchester ITA      ENG     
#> 2 LIVERPOOL   MILAN      ENG      ITA     
#> 3 Real Madrid Juventus   SPA      ITA 

I only want to get the Italian teams, i.e. Milan, Milan, Juventus... How can this be done without using a loop?

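3 answers:

Answer 0: (score: 1)

First, I would recommend a basic R tutorial to get familiar with basic R data manipulation such as subsetting. See R for Beginners on CRAN.

In your case, you can do: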

df[df$hcountry == "ITA" | df$vcountry == "ITA", ]
#         home    visitor hcountry vcountry
#1       Milan Manchester      ITA      ENG
#2   LIVERPOOL      MILAN      ENG      ITA
#3 Real Madrid   Juventus      SPA      ITA

Or

subset(df, hcountry == "ITA" | vcountry == "ITA")
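
Both calls above return whole rows, and in the sample data every match involves an Italian side, so all three rows come back. If only the team names are wanted (Milan, MILAN, Juventus), one loop-free possibility is to index each team column by its own country column and concatenate the results. This is a minimal sketch rather than part of the original answer: the name `italian_teams` is made up for illustration, and the data frame is rebuilt with character columns so the block runs on its own.

```r
# Rebuild the sample data with character (not factor) columns.
df <- data.frame(
  home     = c("Milan", "LIVERPOOL", "Real Madrid"),
  visitor  = c("Manchester", "MILAN", "Juventus"),
  hcountry = c("ITA", "ENG", "SPA"),
  vcountry = c("ENG", "ITA", "ITA"),
  stringsAsFactors = FALSE
)

# Home teams whose country is ITA, followed by visitors whose country is ITA.
italian_teams <- c(df$home[df$hcountry == "ITA"],
                   df$visitor[df$vcountry == "ITA"])
italian_teams
```

The result lists home teams first and visitors second, so it does not follow the original row order.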

Sample data

df <- read.table(text =
    "home  visitor  hcountry vcountry
Milan Manchester ITA ENG
LIVERPOOL MILAN ENG ITA
'Real Madrid' Juventus SPA ITA", header = TRUE)

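Answer 1: (score: 0)

Alternatively, you can stack the home and visitor columns so that each team sits next to its own country, then keep the unique values: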

library(dplyr)
library(tidyr)

df %>% gather(key1, country, -c(home, visitor)) %>% 
  gather(key2, team, -c(key1, country)) %>% 
  mutate_at(vars(key1, key2), substr, start=1, stop=1) %>% 
  filter(key1==key2) %>% select(-key1, -key2) %>% 
  mutate(team=tools::toTitleCase(tolower(team))) %>% 
  filter(country=="ITA") %>% 
  distinct()

#> # A tibble: 2 x 2
#>   country team    
#>   <chr>   <chr>   
#> 1 ITA     Milan   
#> 2 ITA     Juventus

If you want the Milan value to appear twice, remove the final distinct().
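
The same stacking idea can also be written without the helper key columns, by selecting each team/country pair separately and binding the two pieces together. The following is only a sketch, assuming dplyr (and tibble, for the sample data) is installed. `teams` and `italian` are hypothetical names, and team-name case is left untouched here.

```r
library(dplyr)

df <- tibble::tribble(~home, ~visitor, ~hcountry, ~vcountry,
                      "Milan", "Manchester", "ITA", "ENG",
                      "LIVERPOOL", "MILAN", "ENG", "ITA",
                      "Real Madrid", "Juventus", "SPA", "ITA")

# One row per (team, country): home pairs first, then visitor pairs.
teams <- bind_rows(
  select(df, team = home,    country = hcountry),
  select(df, team = visitor, country = vcountry)
)

# Keep only the Italian teams; duplicates are retained.
italian <- filter(teams, country == "ITA")
italian
```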

Answer 2: (score: 0)

We can use filter from dplyr:

library(dplyr)
df %>%
    filter(hcountry == "ITA" | vcountry == "ITA")
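
If there were more than two country columns, the condition could be written once and applied across all of them. This sketch assumes dplyr 1.0.4 or later, where `if_any()` is available. `italy_rows` is a hypothetical name, and with the sample data every row matches, so all three rows are returned.

```r
library(dplyr)

df <- tibble::tribble(~home, ~visitor, ~hcountry, ~vcountry,
                      "Milan", "Manchester", "ITA", "ENG",
                      "LIVERPOOL", "MILAN", "ENG", "ITA",
                      "Real Madrid", "Juventus", "SPA", "ITA")

# Keep rows where at least one of the listed columns equals "ITA".
italy_rows <- df %>%
  filter(if_any(c(hcountry, vcountry), ~ .x == "ITA"))
italy_rows
```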