Question

The code below throws an error when I try to merge two tables with an inner join:

table_Left<-matrix(c(1:6,rep("Toaster",3),rep("Radio",3)),ncol = 2)
colnames(table_Left)<-c("Customer_ID","Product")
table_Left<-as.table(table_Left)
table_Left
table_Right<-matrix(c(2,4,6,rep("Alabama",2),"Ohio"),ncol = 2)
colnames(table_Right)<-c("Customer_ID","State")
table_Right<-as.table(table_Right)
table_Right
merge(x=table_Left, y=table_Right, by="Customer_ID")

Error: Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column

Please advise on how to fix this.

Answer 1

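I think your problem comes from confusion over the word "table". In data science with R, data frames are a very common kind of object, and in everyday language they are often called "tables". However, the function you used, as.table(), has nothing to do with data frames: as.table() creates contingency tables, which is not at all what you want here.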
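To see why merge() cannot find the key column, it may help to look at what a "table" object turns into when it is coerced to a data frame, which is what merge() does with arguments that are not data frames. The sketch below rebuilds the left table from the question; the names m_left, t_left and long_left are made up for this illustration.

```r
# Rebuild the left table from the question: a character matrix turned into a "table".
m_left <- matrix(c(1:6, rep("Toaster", 3), rep("Radio", 3)), ncol = 2)
colnames(m_left) <- c("Customer_ID", "Product")
t_left <- as.table(m_left)

# merge() coerces its arguments with as.data.frame(); for a "table" object
# this gives one row per cell instead of one row per customer.
long_left <- as.data.frame(t_left)

names(long_left)  # "Customer_ID" is not among the column names
nrow(long_left)   # one row per cell of the 6 x 2 matrix
```

Since the coerced data frame has no column called Customer_ID, the by = "Customer_ID" argument has nothing to match against, which is consistent with the fix.by() error in the question.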

The most direct way to create your 2 data frames (or "tables") is to build them with the function data.frame():

df_Left <- data.frame(
  Customer_ID = 1:6,
  Product = c(rep("Toaster", 3), rep("Radio", 3))
)

df_Left

      Customer_ID Product
    1           1 Toaster
    2           2 Toaster
    3           3 Toaster
    4           4   Radio
    5           5   Radio
    6           6   Radio

df_Right <- data.frame(
  Customer_ID = c(2, 4, 6),
  State = c(rep("Alabama", 2), "Ohio")
)

df_Right

      Customer_ID   State
    1           2 Alabama
    2           4 Alabama
    3           6    Ohio

Your merge() call will then work:

merge(x = df_Left, y = df_Right, by = "Customer_ID")

  Customer_ID Product   State
1           2 Toaster Alabama
2           4   Radio Alabama
3           6   Radio    Ohio
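The result above has only three rows because merge() performs an inner join by default (all = FALSE). As a small sketch, assuming you might also want to keep customers that have no state recorded, the all.x argument of merge() switches to a left join; the variable names inner and left are only for illustration.

```r
df_Left <- data.frame(
  Customer_ID = 1:6,
  Product = c(rep("Toaster", 3), rep("Radio", 3))
)
df_Right <- data.frame(
  Customer_ID = c(2, 4, 6),
  State = c(rep("Alabama", 2), "Ohio")
)

# Default (all = FALSE): inner join, only IDs present in both data frames.
inner <- merge(x = df_Left, y = df_Right, by = "Customer_ID")

# all.x = TRUE: left join, every row of df_Left is kept and
# State is NA for customers with no match in df_Right.
left <- merge(x = df_Left, y = df_Right, by = "Customer_ID", all.x = TRUE)

nrow(inner)
nrow(left)
```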

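Now, your code started by creating matrices. If you have a good reason to use matrices in your situation, merge() will work with them too.

If you look at the help file for merge() (with ?merge), you will see:

merge(x, y, ...)

x, y: data frames, or objects to be coerced to one.

A matrix can be coerced to a data frame without its row and column layout changing, so merge() can still find the key column. You could therefore also do: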

ma_Left <- matrix(
  c(1:6, rep("Toaster", 3), rep("Radio", 3)), ncol = 2
)

colnames(ma_Left) <- c("Customer_ID", "Product")

ma_Right <- matrix(
  c(2, 4, 6, rep("Alabama", 2), "Ohio"), ncol = 2
)

colnames(ma_Right) <- c("Customer_ID", "State")

merge(x = ma_Left, y = ma_Right, by = "Customer_ID")

  Customer_ID Product   State
1           2 Toaster Alabama
2           4   Radio Alabama
3           6   Radio    Ohio
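One caveat with the matrix route: a matrix holds a single type, so mixing numbers and text in c() stores the customer IDs as text. The sketch below, which reuses the matrices from above and a made-up variable name merged, checks the column type after the merge and converts it back to integers.

```r
ma_Left <- matrix(c(1:6, rep("Toaster", 3), rep("Radio", 3)), ncol = 2)
colnames(ma_Left) <- c("Customer_ID", "Product")
ma_Right <- matrix(c(2, 4, 6, rep("Alabama", 2), "Ohio"), ncol = 2)
colnames(ma_Right) <- c("Customer_ID", "State")

merged <- merge(x = ma_Left, y = ma_Right, by = "Customer_ID")

# The IDs were stored as text in the matrices, so they are not numeric here.
is.numeric(merged$Customer_ID)

# Convert back to integers; going through as.character() first keeps this
# safe even if the column came through as a factor (the default before R 4.0).
merged$Customer_ID <- as.integer(as.character(merged$Customer_ID))
```

If the IDs need to stay numeric throughout, building the data frames directly with data.frame() avoids the conversion step.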
