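I want to know whether customer Ben receives a distinct Item1 every day. The catch is that Item1 is unique every day, but Item2 and Item3 are not. A customer may receive two Item1s on one day and none on another, so the formula needs to correctly identify that the customer received at least one distinct Item1 on each day. How can I create a variable that flags whether Ben received a unique Item1 every day?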
df <- structure(list(Customer = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "Ben", class = "factor"),
Date = structure(c(1L, 7L, 7L, 8L, 9L, 10L, 11L, 12L, 13L,
14L, 2L, 3L, 4L, 5L, 6L), .Label = c("1/1/2016", "1/10/2016",
"1/11/2016", "1/12/2016", "1/13/2016", "1/14/2016", "1/2/2016",
"1/3/2016", "1/4/2016", "1/5/2016", "1/6/2016", "1/7/2016",
"1/8/2016", "1/9/2016"), class = "factor"), Item1 = structure(c(1L,
7L, 1L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 2L, 3L, 4L, 5L,
6L), .Label = c("A1", "A10", "A11", "A12", "A13", "A14",
"A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"), class = "factor"),
Item2 = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L), .Label = c("B1", "B2", "B3", "B4", "B5",
"B6", "B7", "B8"), class = "factor"), Item3 = structure(c(1L,
7L, 8L, 7L, 9L, 10L, 11L, 12L, 13L, 14L, 2L, 3L, 4L, 5L,
6L), .Label = c("A1", "A10", "A11", "A12", "A13", "A14",
"A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"), class = "factor")), .Names = c("Customer",
"Date", "Item1", "Item2", "Item3"), class = "data.frame", row.names = c(NA,
-15L))
# Attempt: compare the number of distinct dates with the number of distinct items
length(unique(df$Date)) == length(unique(df$Item1))
length(unique(df$Date)) == length(unique(df$Item2))
length(unique(df$Date)) == length(unique(df$Item3))
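I edited the data to show the dilemma. I want to flag that Ben receives at least one distinct Item1 every day. Item3 is slightly different: it does not bring something new every day.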
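To see why comparing counts is not enough, here is a minimal sketch that uses only the first four rows of the data above (Date and Item3). The names `small`, `counts_match` and `repeat_rows` are illustrative and not part of the original post.

```r
# First four rows of the example data, Date and Item3 only (illustrative subset)
small <- data.frame(
  Date  = c("1/1/2016", "1/2/2016", "1/2/2016", "1/3/2016"),
  Item3 = c("A1", "A2", "A3", "A2"),
  stringsAsFactors = FALSE
)

# The count comparison passes: 3 distinct dates and 3 distinct items
counts_match <- length(unique(small$Date)) == length(unique(small$Item3))

# Yet the only item on 1/3/2016 is A2, which already arrived on 1/2/2016
repeat_rows <- small[duplicated(small$Item3), ]
```

The counts agree only because 1/2/2016 contributes two new items, which hides the day that contributes none.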
Working solution (credit to @alistaire):
# TRUE where the row's Item1 has not appeared in any earlier row
df$distinct1 <- sapply(seq_along(df$Item1), function(x){!df[x, 'Item1'] %in% df[0:(x-1), 'Item1']})
# TRUE if every date has at least one first-time Item1
length(unique(df$Date)) == length(unique(df$Date[df$distinct1]))
# Same check for Item3
df$distinct3 <- sapply(seq_along(df$Item3), function(x){!df[x, 'Item3'] %in% df[0:(x-1), 'Item3']})
length(unique(df$Date)) == length(unique(df$Date[df$distinct3]))
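The same idea can probably be written more compactly in base R. The sketch below assumes that `!duplicated()` is an acceptable stand-in for the `sapply` call (it also marks the first occurrence of each value as `TRUE`) and uses `tapply(..., any)` to get one flag per date. The data frame is a compact rebuild of the Date, Item1 and Item3 columns from the `dput` output above, stored as character instead of factor. Names such as `first_seen1` and `new_per_day1` are illustrative.

```r
# Compact rebuild of the example data (Item2 omitted; columns kept as character)
df <- data.frame(
  Customer = "Ben",
  Date  = c("1/1/2016", "1/2/2016", "1/2/2016", paste0("1/", 3:14, "/2016")),
  Item1 = c("A1", "A2", "A1", paste0("A", 3:14)),
  Item3 = c("A1", "A2", "A3", "A2", paste0("A", 4:14)),
  stringsAsFactors = FALSE
)

# TRUE for the first time each item value appears, FALSE for repeats
first_seen1 <- !duplicated(df$Item1)
first_seen3 <- !duplicated(df$Item3)

# One flag per date: did at least one first-time item arrive that day?
new_per_day1 <- tapply(first_seen1, df$Date, any)
new_per_day3 <- tapply(first_seen3, df$Date, any)

# Overall flags: does every day bring at least one new item?
every_day_new1 <- all(new_per_day1)
every_day_new3 <- all(new_per_day3)

# Dates on which no new Item3 arrived
days_without_new3 <- names(new_per_day3)[!new_per_day3]

# Row-level variable: TRUE if the row's date has at least one new Item1
df$day_has_new1 <- as.vector(new_per_day1[df$Date])
```

On this data `every_day_new1` is `TRUE` and `every_day_new3` is `FALSE`, with 1/3/2016 as the only date lacking a new Item3. With more than one customer, the grouping would presumably need to include `Customer` as well, which this sketch does not attempt.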