Question

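I am trying to merge two data frames using a rule that I can't work out how to express in R code.

Suppose we are dealing with customers: each customer can belong to one or more categories, and each category can buy a certain subset of products.

I then want to merge the two data frames, i.e.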

df1
customer   category
Anthony    X
Anthony    Y
Beatrix    Y
Charles    Z

df2
product    category
item1      X
item2      Y
item3      Y
item3      Z

df3 = required merge of (df1, df2)
customer   product
Anthony    item1
Anthony    item2
Anthony    item3
Beatrix    item2
Beatrix    item3
Charles    item3

Thanks for your help!

Answer 1

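From your example, I understand this as joining every product associated with a category onto each customer who belongs to that category. The following works for that case:

Generate the data: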

df1 <- read.table(header = TRUE, text = "customer   category
Anthony    X
Anthony    Y
Beatrix    Y
Charles    Z")

df2 <- read.table(header = TRUE, text = "product    category
item1      X
item2      Y
item3      Y
item3      Z")

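Solution with the dplyr package:

library(dplyr)
# Join on the shared "category" column explicitly, then drop it
left_join(df1, df2, by = "category") %>% select(-category)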

  customer product
1  Anthony   item1
2  Anthony   item2
3  Anthony   item3
4  Beatrix   item2
5  Beatrix   item3
6  Charles   item3
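
One thing to watch for, which the example data does not happen to trigger: if a customer belongs to two categories that both offer the same product, the join returns that product twice for that customer. A minimal sketch of one way to handle it, assuming duplicates are unwanted; `df1_dup` is a hypothetical variation of the question's data in which Anthony also belongs to Z:

```r
library(dplyr)

# Hypothetical data: Anthony is in X, Y and Z, so item3 reaches him via Y and Z
df1_dup <- data.frame(
  customer = c("Anthony", "Anthony", "Anthony"),
  category = c("X", "Y", "Z")
)
df2 <- data.frame(
  product  = c("item1", "item2", "item3", "item3"),
  category = c("X", "Y", "Y", "Z")
)

res <- left_join(df1_dup, df2, by = "category") %>%
  select(-category) %>%
  distinct()  # keep each customer/product pair once

print(res)
```

Without `distinct()` the result would contain the Anthony/item3 pair twice.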

Edit: an alternative solution in base R (suggested by lmo):

merge(df1, df2, by="category")[-1]

  customer product
1  Anthony   item1
2  Anthony   item2
3  Anthony   item3
4  Beatrix   item2
5  Beatrix   item3
6  Charles   item3
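
The two approaches agree on the example data, but they are not identical in general: `merge()` performs an inner join by default, whereas `left_join()` keeps every row of `df1`. The difference only shows up when a customer's category has no products. A small base R sketch of that case, using hypothetical data (the customer "Dora" and category "W" are not in the question):

```r
# Hypothetical data: category "W" has no matching products
df1_extra <- data.frame(
  customer = c("Anthony", "Dora"),
  category = c("X", "W")
)
df2_small <- data.frame(
  product  = c("item1", "item2"),
  category = c("X", "Y")
)

# Default merge(): inner join, Dora is dropped
inner <- merge(df1_extra, df2_small, by = "category")[-1]

# all.x = TRUE: left join, Dora is kept with an NA product
left <- merge(df1_extra, df2_small, by = "category", all.x = TRUE)[-1]

print(inner)
print(left)
```

Which behaviour is appropriate depends on whether customers with nothing to buy should appear in the result.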
