Match and merge two data frames

Time: 2018-11-07 03:22:25

Tags: r

I want to match and merge the following two CSV files.

CSV 1

Year    Qrtrs   BD  BS  BY  All
1950    JAS     0   1   0   1
1950    OND     0   2   1   3
1951    JAS     1   0   4   5

CSV 2

Year    JFM AMJ JAS OND
1950    LN  LN  NN  LN
1951    LN  NN  EN  EN

I would like to get the following result:

Year    Qrtrs   CASE    BD  BS  BY  All
1950    JAS       NN    0   1   0   1
1950    OND       LN    0   2   1   3
1951    JAS       EN    1   0   4   5

I am new to R. Thank you for your help.

Also, I would like to add the remaining Qrtrs to the final CSV and fill BD, BS, BY, and All with 0 for those rows. See below.

Year    Qrtrs   CASE    BD  BS  BY  All
1950    JAS       NN    0   1   0   1
1950    OND       LN    0   2   1   3
1950    AMJ       LN    0   0   0   0
1950    JFM       LN    0   0   0   0

Thanks!

1 answer:

Answer 0 (score: 2)

Data

df1 <- read.table(
  text = "Year    Qrtrs   BD  BS  BY  All
          1950    JAS     0   1   0   1
          1950    OND     0   2   1   3
          1951    JAS     1   0   4   5",
  header = TRUE, stringsAsFactors = FALSE
)

df2 <- read.table(
  text = "Year    JFM AMJ JAS OND
          1950    LN  LN  NN  LN
          1951    LN  NN  EN  EN",
  header = TRUE, stringsAsFactors = FALSE
)

Here is one approach, using gather() from tidyr and left_join() from dplyr:

library(tidyr)
library(dplyr)

# Reshape df2 from wide to long: one row per Year/Qrtrs pair, with the cell value in CASE
df2.2 <- gather(df2, key = "Qrtrs", value = "CASE", -Year)

df2.2

#   Year Qrtrs CASE
# 1 1950   JFM   LN
# 2 1951   JFM   LN
# 3 1950   AMJ   LN
# 4 1951   AMJ   NN
# 5 1950   JAS   NN
# 6 1951   JAS   EN
# 7 1950   OND   LN
# 8 1951   OND   EN

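# Join on both keys explicitly; without `by`, dplyr infers the shared columns and prints a message
left_join(df1, df2.2, by = c("Year", "Qrtrs"))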

#   Year Qrtrs BD BS BY All CASE
# 1 1950   JAS  0  1  0   1   NN
# 2 1950   OND  0  2  1   3   LN
# 3 1951   JAS  1  0  4   5   EN
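
As a side note, gather() is marked as superseded in tidyr 1.0.0 and later, where pivot_longer() is the recommended replacement. The following is a minimal sketch of the same reshaping step, assuming tidyr >= 1.0.0 is installed; the name df2_long is only illustrative and is not part of the original answer.

```r
library(tidyr)

df2 <- read.table(
  text = "Year    JFM AMJ JAS OND
          1950    LN  LN  NN  LN
          1951    LN  NN  EN  EN",
  header = TRUE, stringsAsFactors = FALSE
)

# Every column except Year becomes a (Qrtrs, CASE) pair.
# df2_long is a hypothetical name chosen for this sketch.
df2_long <- pivot_longer(
  df2,
  cols = -Year,
  names_to = "Qrtrs",
  values_to = "CASE"
)

# pivot_longer() returns a tibble; convert back if a plain data frame is preferred
df2_long <- as.data.frame(df2_long)
df2_long
```

The rows come out grouped by year rather than by quarter, which does not affect the join. Passing df2_long to left_join() in place of df2.2 should give the same three rows.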

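Alternatively, here is a base R approach:

# stack() returns two columns: `values` (the cell contents) and `ind` (the source column name)
df2.stack <- stack(df2, select = -Year)
# df2$Year is recycled across the stacked rows, which are ordered quarter by quarter
df2.2 <- cbind(df2$Year, df2.stack)
names(df2.2) <- c("Year", "CASE", "Qrtrs")
# merge() defaults to an inner join, so only the Year/Qrtrs pairs present in df1 are kept
merge(df1, df2.2, by = c("Year", "Qrtrs"))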
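
The second part of the question (adding the remaining quarters with 0 in BD, BS, BY, and All) is not covered above. One possible way to do it in base R is to keep every row of the reshaped df2 with all.y = TRUE and then replace the resulting NA counts with 0. This is a sketch under the assumption that every Year/Qrtrs pair in df2 should appear in the output; the names df2_long, count_cols, and full are illustrative only.

```r
df1 <- read.table(
  text = "Year    Qrtrs   BD  BS  BY  All
          1950    JAS     0   1   0   1
          1950    OND     0   2   1   3
          1951    JAS     1   0   4   5",
  header = TRUE, stringsAsFactors = FALSE
)

df2 <- read.table(
  text = "Year    JFM AMJ JAS OND
          1950    LN  LN  NN  LN
          1951    LN  NN  EN  EN",
  header = TRUE, stringsAsFactors = FALSE
)

# Long form of df2: one row per Year/Qrtrs pair (names here are hypothetical)
df2_stack <- stack(df2, select = -Year)
df2_long <- data.frame(
  Year  = rep(df2$Year, times = ncol(df2) - 1),
  Qrtrs = as.character(df2_stack$ind),
  CASE  = df2_stack$values,
  stringsAsFactors = FALSE
)

# all.y = TRUE keeps quarters that are missing from df1; their counts come back as NA
full <- merge(df1, df2_long, by = c("Year", "Qrtrs"), all.y = TRUE)

# Replace the NA counts with 0 (assumption: a missing quarter means zero events)
count_cols <- c("BD", "BS", "BY", "All")
full[count_cols][is.na(full[count_cols])] <- 0L

# Put CASE right after Qrtrs, as in the desired output
full <- full[, c("Year", "Qrtrs", "CASE", count_cols)]
full
```

merge() sorts the result by the key columns, so the quarters appear in alphabetical order within each year rather than in calendar order.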