Question

I want to count the values of a categorical column, grouped by Date.

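I want the result as a matrix in which the column names are the values of the categorical variable, the row names are the unique Date values, and each cell holds the corresponding count.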

The following links solve the counting problem, but I am looking for the transformed (wide) df:

How to add count of unique values by group to R data.frame

R: Extract unique values in one column grouped by values in another column

My df has more than 50,000 rows and looks like this:

dat <- data.frame(Date = c('06/08/2018','06/08/2018','07/08/2018','07/08/2018','08/08/2018','09/08/2018','09/08/2018','11/08/2018','11/08/2018','13/08/2018'),
                  Type= c('A','B','C','A','B','A','A','B','C','C'))

I want my result matrix to have "A", "B", "C" as new columns, the Date values as rows, and the counts as cell values.

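Also, it would be great not to hard-code the categorical values, so that if there are 4 of them in the future instead of 3, the code handles it automatically.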

Answer 1

How about using table ...

mat <- table(dat$Date, dat$Type)

mat

             A B C
  06/08/2018 1 1 0
  07/08/2018 1 0 1
  08/08/2018 0 1 0
  09/08/2018 2 0 0
  11/08/2018 0 1 1
  13/08/2018 0 0 1
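Note that table() returns an object of class "table", and the rows above are sorted as strings. A minimal sketch of one way to get a plain matrix with chronologically ordered rows, assuming the dates are in day/month/year format (that format is my assumption from the sample data):

```r
dat <- data.frame(
  Date = c('06/08/2018','06/08/2018','07/08/2018','07/08/2018','08/08/2018',
           '09/08/2018','09/08/2018','11/08/2018','11/08/2018','13/08/2018'),
  Type = c('A','B','C','A','B','A','A','B','C','C')
)

# Parse the strings as dates (assumed day/month/year) so rows sort by date,
# not alphabetically
dat$Date <- as.Date(dat$Date, format = "%d/%m/%Y")

tab <- table(dat$Date, dat$Type)

# Drop the "table" class to get an ordinary integer matrix with dimnames
mat <- unclass(tab)
```

The row names then come out in ISO form (for example "2018-08-09"), and the columns follow whatever values Type contains, so nothing is hard-coded.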

Answer 2

What you are looking for is dcast():

library(reshape2)  # provides dcast()

dcast(dat, Date ~ Type, fun.aggregate = length, value.var = "Type")

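This function aggregates the data according to the fun.aggregate argument (length() in your case), so each cell is the number of rows for that Date and Type.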
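dcast() gives back a data frame with Date as its first column. If an actual matrix with the dates as row names is needed, something along these lines should work (a sketch, assuming reshape2 is installed; the names wide and mat are just for illustration):

```r
library(reshape2)

dat <- data.frame(
  Date = c('06/08/2018','06/08/2018','07/08/2018','07/08/2018','08/08/2018',
           '09/08/2018','09/08/2018','11/08/2018','11/08/2018','13/08/2018'),
  Type = c('A','B','C','A','B','A','A','B','C','C')
)

# One row per Date, one column per Type, cells hold the row counts
wide <- dcast(dat, Date ~ Type, fun.aggregate = length, value.var = "Type")

# Move the Date column into the row names and keep only the count columns;
# drop = FALSE keeps a data frame even if there is a single Type
mat <- as.matrix(wide[, -1, drop = FALSE])
rownames(mat) <- as.character(wide$Date)
```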

Answer 3

This uses spread:

library(tidyverse)

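# Count the rows per Date/Type pair, then spread Type into columns;
# fill = 0 puts a zero where a pair never occurs
spread_data <- dat %>%
  count(Date, Type) %>%
  spread(key = Type, value = n, fill = 0)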
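In current tidyr releases spread() is superseded by pivot_wider(). A roughly equivalent sketch with the newer function, assuming dplyr and tidyr are installed (wide is just an illustrative name):

```r
library(dplyr)
library(tidyr)

dat <- data.frame(
  Date = c('06/08/2018','06/08/2018','07/08/2018','07/08/2018','08/08/2018',
           '09/08/2018','09/08/2018','11/08/2018','11/08/2018','13/08/2018'),
  Type = c('A','B','C','A','B','A','A','B','C','C')
)

# count() adds a column n with the number of rows per Date/Type pair;
# pivot_wider() turns each Type into its own column, filling gaps with 0
wide <- dat %>%
  count(Date, Type) %>%
  pivot_wider(names_from = Type, values_from = n, values_fill = 0)
```

As with the other answers, the columns are derived from the data, so a fourth Type would simply show up as a fourth column.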

Convert DF column values into a matrix in R
