Question

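I am trying to create a simple pivot table in R using the dplyr or reshape2 package, because my data set is too large and R runs out of memory with sqldf. The two columns of my data set that I want to pivot on are "Product" and "Cust_Id". I want to count the number of customers for each product. Here is what I have so far.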

library(reshape2)
mydata<-read.table("Book1.txt",header=TRUE,fill=TRUE)
mydata.m<-melt(mydata,id=c("Product"),measured=c(Cust_Id))
mydata.d<-dcast(mydata.m,Product~variable,count)

This returns:

Error in UseMethod("group_by_"):
no applicable method for 'group_by_' applied to an object of class "c('integer','numeric')"

I also tried the following code using dplyr (I am doing this on a different laptop, though, and I am not sure about the last step):

library(dplyr)
mydata.df<-tbl_df(mydata)
summarize(mydata.df,Product,Cust_Id=n())

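I don't get any error message, but a lot of values seem to be missing from the output. I would really appreciate your input. Thanks in advance.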

Answer 1

Try this:

library(dplyr)
mydata <- mydata %>%
  group_by(Product) %>%
  summarise(nCustomers = n())

Or, if you only want to count unique customers, you could do this:

library(dplyr)
mydata <- mydata %>%
  group_by(Product) %>%
  summarise(nCustomers = n_distinct(Cust_Id))
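
To see how the two variants differ, here is a minimal sketch on a small made-up data frame. The toy values and the column name nRows are assumptions for illustration only and are not taken from the question's Book1.txt.

```r
library(dplyr)

# Hypothetical toy data standing in for the real file
mydata <- data.frame(
  Product = c("A", "A", "A", "B", "B"),
  Cust_Id = c(1, 1, 2, 3, 3)
)

counts <- mydata %>%
  group_by(Product) %>%
  summarise(
    nRows      = n(),                  # every row counts, repeated customers included
    nCustomers = n_distinct(Cust_Id)   # each customer counted once per product
  )

print(counts)
```

With this toy data, product A has 3 rows but only 2 distinct customers, and product B has 2 rows but 1 distinct customer, so the two counts only agree when no customer appears more than once per product.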

Answer 2

If this really is a large data set, the data.table package is your best option.
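
This answer gives no code, so the following is only a sketch of what the count might look like with data.table, using made-up values rather than the question's file. For the real data, something like fread("Book1.txt") would presumably replace the inline table, though that depends on the file's actual layout.

```r
library(data.table)

# Hypothetical toy data; the real data would be read from Book1.txt instead
mydata <- data.table(
  Product = c("A", "A", "A", "B", "B"),
  Cust_Id = c(1, 1, 2, 3, 3)
)

# .N is the number of rows in each group; uniqueN() counts distinct values
counts <- mydata[, .(nRows = .N, nCustomers = uniqueN(Cust_Id)), by = Product]

print(counts)
```

Grouping with by = Product inside the brackets does the aggregation in a single pass, without the intermediate melted copy that the reshape2 approach needs.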

R: Making a pivot table using the dplyr or reshape2 package
