Prefix all columns resulting from left_join() with the original table name

Time: 2016-10-24 20:01:18

Tags: r join relational-database dplyr

I would like to add a prefix to all columns resulting from a left join.

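When the two tables being joined share column names, left_join() can add a suffix to them. However, it has no option to always add this suffix, even when the names do not clash, and it has no option to add a prefix instead.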

library(dplyr)
library(nycflights13)
flights2 <- flights %>% select(year:day, hour, origin, dest, tailnum, carrier)
airports2 <- airports

result <- flights2 %>% left_join(airports2, c("dest" = "faa")) %>% head()

Result:

Source: local data frame [6 x 14]

year month   day  hour origin  dest tailnum carrier                            name
(int) (int) (int) (dbl)  (chr) (chr)   (chr)   (chr)                           (chr)
1  2013     1     1     5    EWR   IAH  N14228      UA    George Bush Intercontinental
2  2013     1     1     5    LGA   IAH  N24211      UA    George Bush Intercontinental
3  2013     1     1     5    JFK   MIA  N619AA      AA                      Miami Intl
4  2013     1     1     5    JFK   BQN  N804JB      B6                              NA
5  2013     1     1     5    LGA   ATL  N668DN      DL Hartsfield Jackson Atlanta Intl
6  2013     1     1     5    EWR   ORD  N39463      UA              Chicago Ohare Intl
Variables not shown: lat (dbl), lon (dbl), alt (int), tz (dbl), dst (chr)

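Here, the join result alone does not tell you which table each column originally came from.

The purpose of adding this prefix is to compute column names reliably from the table names and column names of data loaded from a relational database. For example, the structure of a database stored in MySQL and loaded into R, together with the naming conventions of the relational database, would be used to identify primary and foreign keys. These would then be used to set up joins and, later, to retrieve data from the join results.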

I found a similar question for SQL, but none for R:

In a join, how to prefix all column names with the table it came from

2 answers:

Answer 0: (score: 4)

A simple way to achieve this is to add the prefixes to the original tables before performing the join:

# add prefix before joining:
names(flights2) <- paste0("flights2.", names(flights2))
names(airports2) <- paste0("airports2.", names(airports2))

# in join, use names with prefixes
result <- flights2 %>% left_join(airports2, c("flights2.dest" = "airports2.faa")) %>% head()

Result:

Source: local data frame [6 x 14]

flights2.year flights2.month flights2.day flights2.hour flights2.origin flights2.dest
(int)          (int)        (int)         (dbl)           (chr)         (chr)
1          2013              1            1             5             EWR           IAH
2          2013              1            1             5             LGA           IAH
3          2013              1            1             5             JFK           MIA
4          2013              1            1             5             JFK           BQN
5          2013              1            1             5             LGA           ATL
6          2013              1            1             5             EWR           ORD
Variables not shown: flights2.tailnum (chr), flights2.carrier (chr), airports2.name (chr),
airports2.lat (dbl), airports2.lon (dbl), airports2.alt (int), airports2.tz (dbl),
airports2.dst (chr)

Now the columns of the joined data frame can easily be referenced as tableName.columnName.
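The renaming step can also be wrapped in a small helper so the prefix is not typed by hand for each table. The following is a minimal sketch, assuming dplyr 1.0.0 or later (for rename_with()). prefix_cols() is a hypothetical helper name, not part of dplyr, and the two small data frames are toy stand-ins for flights2 and airports2.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): prefix every column of `df`
# with `prefix` followed by a dot.
prefix_cols <- function(df, prefix) {
  rename_with(df, ~ paste0(prefix, ".", .x))
}

# Toy stand-ins for flights2 and airports2
flights2 <- data.frame(dest = c("IAH", "BQN"), carrier = c("UA", "B6"),
                       stringsAsFactors = FALSE)
airports2 <- data.frame(faa = "IAH", name = "George Bush Intercontinental",
                        stringsAsFactors = FALSE)

# Prefix both tables, then join on the prefixed key names
result <- prefix_cols(flights2, "flights2") %>%
  left_join(prefix_cols(airports2, "airports2"),
            by = c("flights2.dest" = "airports2.faa"))

# Pick out every column that came from airports2 by its prefix
airports_part <- select(result, starts_with("airports2."))
```

Because every name carries its table prefix, selecting by prefix with starts_with() recovers the columns of one source table without knowing them in advance.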

Answer 1: (score: 0)

A similar answer, but using suffixes:

library(dplyr)

# Variant 1: suffix the left table, join, then suffix every remaining column
# (funs() is deprecated in recent dplyr versions, so a formula lambda is used)
(band_members
%>% rename_all(~ paste0(., ".left"))
%>% left_join(band_instruments, by = c("name.left" = "name"))
%>% rename_at(vars(-ends_with(".left")), ~ paste0(., ".right"))
)

# Variant 2: suffix both tables before joining
(band_members
%>% rename_all(~ paste0(., ".left"))
%>% left_join(
  band_instruments %>% rename_all(~ paste0(., ".right")),
  by = c("name.left" = "name.right"))
)

Both give:

    #  A tibble: 3 x 3
  name.left band.left plays.right
  <chr>     <chr>     <chr>      
1 Mick      Stones    <NA>       
2 John      Beatles   guitar     
3 Paul      Beatles   bass
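In dplyr 1.0.0 and later, rename_all() and rename_at() are superseded by rename_with(). A minimal sketch of the "suffix both tables before joining" variant in that style, assuming dplyr >= 1.0.0 (band_members and band_instruments ship with dplyr; the names left, right and joined are only illustrative):

```r
library(dplyr)

# Suffix every column of each table before joining
left  <- rename_with(band_members, ~ paste0(.x, ".left"))
right <- rename_with(band_instruments, ~ paste0(.x, ".right"))

# Join on the suffixed key names
joined <- left_join(left, right, by = c("name.left" = "name.right"))
```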