Daily distinct count

Time: 2016-05-06 19:40:26

Tags: r

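I want to know whether customer Ben receives a distinct Item1 every day. The catch is that Item1 is unique every day, but Item2 and Item3 are not. A customer may receive two Item1s on one day and none on another day, so the formula needs to correctly identify that the customer receives at least one unique Item1 every day. How can I create a variable that flags that Ben receives a unique Item1 each day?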

df <- structure(list(Customer = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Ben", class = "factor"), 
    Date = structure(c(1L, 7L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 
    14L, 2L, 3L, 4L, 5L, 6L), .Label = c("1/1/2016", "1/10/2016", 
    "1/11/2016", "1/12/2016", "1/13/2016", "1/14/2016", "1/2/2016", 
    "1/3/2016", "1/4/2016", "1/5/2016", "1/6/2016", "1/7/2016", 
    "1/8/2016", "1/9/2016"), class = "factor"), Item1 = structure(c(1L, 
    7L, 1L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 2L, 3L, 4L, 5L, 
    6L), .Label = c("A1", "A10", "A11", "A12", "A13", "A14", 
    "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"), class = "factor"), 
    Item2 = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 
    4L, 5L, 6L, 7L, 8L), .Label = c("B1", "B2", "B3", "B4", "B5", 
    "B6", "B7", "B8"), class = "factor"), Item3 = structure(c(1L, 
    7L, 8L, 7L, 9L, 10L, 11L, 12L, 13L, 14L, 2L, 3L, 4L, 5L, 
    6L), .Label = c("A1", "A10", "A11", "A12", "A13", "A14", 
    "A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"), class = "factor")), .Names = c("Customer", 
"Date", "Item1", "Item2", "Item3"), class = "data.frame", row.names = c(NA, 
-15L))

# Attempt: compare the number of distinct dates with the number of distinct items
length(unique(df$Date)) == length(unique(df$Item1))
length(unique(df$Date)) == length(unique(df$Item2))
length(unique(df$Date)) == length(unique(df$Item3))

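I edited the data to show the dilemma. I want to flag that Ben receives at least one unique Item1 every day. Item3 is slightly different: it is not distinct on every day.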
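
To make the dilemma concrete, here is a minimal sketch on a hypothetical four-row cut of the data (the vectors `dates` and `item3` are made up for illustration, not the full data set). Comparing counts of distinct values can return TRUE even when one day brings only a repeated item:

```r
# Hypothetical miniature: two deliveries on 1/2/2016, and a repeat of A2 on 1/3/2016
dates <- c("1/1/2016", "1/2/2016", "1/2/2016", "1/3/2016")
item3 <- c("A1", "A2", "A3", "A2")

# Three distinct dates and three distinct items, so the counts match
counts_match <- length(unique(dates)) == length(unique(item3))

# Yet the only item delivered on 1/3/2016 had already been seen earlier
repeat_on_last_day <- item3[dates == "1/3/2016"] %in% item3[dates != "1/3/2016"]

counts_match        # TRUE
repeat_on_last_day  # TRUE
```

The count comparison says nothing about which day each new item arrived on, so a check of this kind has to look at the dates of the first occurrences instead.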

Working solution (credit to @alistaire):

# Flag rows whose Item1 does not appear in any earlier row (first occurrence)
df$distinct1 <- sapply(seq_along(df$Item1), function(i) !df[i, 'Item1'] %in% df[seq_len(i - 1), 'Item1'])
# TRUE if every date has at least one first-time Item1
length(unique(df$Date)) == length(unique(df$Date[df$distinct1]))

# Same check for Item3
df$distinct3 <- sapply(seq_along(df$Item3), function(i) !df[i, 'Item3'] %in% df[seq_len(i - 1), 'Item3'])
length(unique(df$Date)) == length(unique(df$Date[df$distinct3]))
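
A possibly shorter way to get the same flags is base R's `duplicated()`, which marks every repeat of a value after its first occurrence. The sketch below is not part of the credited solution; it uses a hypothetical five-row miniature of the data and ignores the Customer column, on the assumption that there is only one customer. The names `new_item1_by_day` and `new_item3_by_day` are made up for illustration.

```r
# Hypothetical miniature of the question's data: two rows on 1/2/2016
df <- data.frame(
  Date  = c("1/1/2016", "1/2/2016", "1/2/2016", "1/3/2016", "1/4/2016"),
  Item1 = c("A1", "A2", "A1", "A3", "A4"),
  Item3 = c("A1", "A2", "A3", "A2", "A4"),
  stringsAsFactors = FALSE
)

# TRUE for the first occurrence of each value, FALSE for later repeats
df$distinct1 <- !duplicated(df$Item1)
df$distinct3 <- !duplicated(df$Item3)

# For each date: did at least one first-time item arrive that day?
new_item1_by_day <- tapply(df$distinct1, df$Date, any)
new_item3_by_day <- tapply(df$distinct3, df$Date, any)

all(new_item1_by_day)                        # TRUE
all(new_item3_by_day)                        # FALSE
names(new_item3_by_day)[!new_item3_by_day]   # "1/3/2016"
```

Grouping by date with `tapply()` also reports which days fail the check, which the single TRUE/FALSE comparison of lengths does not. With several customers, the flags would presumably need to be computed within each customer rather than over the whole column.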

0 answers:

No answers