I have a data set of sales over several months, and I need to find the clients who have stopped buying.
Clients Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
Client 1 123 768 678 452 213 123 55 10 0 0 0 0
Client 2 549 542 21 321 31 59 998 0 546 980 0 987
Client 3 500 0 500 0 500 0 500 0 500 0 500 0
Client 4 126 545 2315 268 126 56 0 0 0 0 0
Client 5 546 546 0 0 0 328 486 326 0 0 66 0
Client 6 0 0 0 25 78 563 698 631 230 53 0 0
So I assume Client 1 and Client 4 have stopped working with us. How can I find them? Or how can I find rows with more than 3 consecutive zeros?
Answer 0 (score: 1)
#Had to fix Client 4, one number was missing
DF <- read.table(text = 'Clients Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
"Client 1" 123 768 678 452 213 123 55 10 0 0 0 0
"Client 2" 549 542 21 321 31 59 998 0 546 980 0 987
"Client 3" 500 0 500 0 500 0 500 0 500 0 500 0
"Client 4" 126 545 2315 27 268 126 56 0 0 0 0 0
"Client 5" 546 546 0 0 0 328 486 326 0 0 66 0
"Client 6" 0 0 0 25 78 563 698 631 230 53 0 0', header = TRUE)
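Loop over the rows, reverse each one, and find the position of the first non-zero entry; subtracting 1 gives the number of trailing zero months. If the client never had any transactions, return length(x):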
n <- apply(DF[, -1], 1, function(x) if (any(x != 0)) which.max(rev(x) != 0) - 1 else length(x))
#[1] 4 0 1 5 1 2
DF$Clients[n >= 3]
#[1] Client 1 Client 4
#Levels: Client 1 Client 2 Client 3 Client 4 Client 5 Client 6
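The question also asks about runs of consecutive zeros anywhere in a row, not only at the end. A minimal sketch of one way to do that with rle() follows. The helper name longest_zero_run and the threshold min_run are made up for illustration, and the threshold of 3 simply mirrors the n >= 3 test above; the three rows are copied from the data in this answer.

```r
# Hypothetical helper (not part of the original answer): length of the
# longest run of consecutive zeros anywhere in a numeric vector.
longest_zero_run <- function(x) {
  r <- rle(x == 0)                     # runs of TRUE (zero) / FALSE (non-zero)
  zero_runs <- r$lengths[r$values]     # keep only the lengths of the zero runs
  if (length(zero_runs) == 0) 0L else max(zero_runs)
}

# Three rows taken from the data above, as a named matrix.
sales <- rbind(
  "Client 1" = c(123, 768, 678, 452, 213, 123, 55, 10, 0, 0, 0, 0),
  "Client 3" = c(500, 0, 500, 0, 500, 0, 500, 0, 500, 0, 500, 0),
  "Client 5" = c(546, 546, 0, 0, 0, 328, 486, 326, 0, 0, 66, 0)
)

min_run <- 3                           # assumed threshold, same as n >= 3 above
runs <- apply(sales, 1, longest_zero_run)
names(runs)[runs >= min_run]
#[1] "Client 1" "Client 5"
```

Note that this flags Client 5, who paused for three months and then came back, so it answers a different question from "has stopped buying".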
Answer 1 (score: 1)
Another idea in base R could be:
k <- 3
df$Clients[rowSums(df[-c(1:(ncol(df) - k))] == 0) == k]
#[1] Client1 Client4
#Levels: Client1 Client2 Client3 Client4 Client5 Client6
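To make the index expression easier to follow, here is a small sketch that does the same thing in named steps. The data frame toy and its values are made up for illustration; the first column is assumed to hold the client name and the remaining columns the months in chronological order.

```r
# Made-up toy data: column 1 is the client, the rest are months in order.
toy <- data.frame(
  Clients = c("A", "B"),
  Jan = c(5, 0), Feb = c(0, 0), Mar = c(0, 0), Apr = c(0, 7)
)

k <- 3
last_k  <- tail(names(toy)[-1], k)          # names of the last k month columns
stopped <- rowSums(toy[last_k] == 0) == k   # TRUE when all k months are zero
toy$Clients[stopped]
```

Client A has zeros in all of the last three months and is selected; client B bought in Apr and is not.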
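We can also convert to long format, take the last 3 values for each client, keep the groups where all of those values are 0, and then pull Clients. With dplyr (gather itself comes from tidyr):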
library(dplyr)
library(tidyr)  # gather() comes from tidyr, not dplyr
k <- 3
v1 <- df %>%
  gather(var, val, -Clients) %>%
  group_by(Clients) %>%
  slice((n() - k + 1):n()) %>%  # exactly the last k rows per client
  filter(all(val == 0)) %>%
  pull(Clients)
unique(v1)
#[1] Client1 Client4
#Levels: Client1 Client2 Client3 Client4 Client5 Client6
Answer 2 (score: 0)
data <- data.frame(Clients = c("Client 1", "Client 2", "Client 3", "Client 4", "Client 5", "Client 6"),
                   Jan = c(123, 549, 500, 126, 546, 0),
                   Feb = c(768, 542, 0, 545, 546, 0),
                   Mar = c(678, 21, 500, 2315, 0, 0),
                   Apr = c(452, 321, 0, 0, 0, 25),
                   May = c(213, 31, 500, 268, 0, 78),
                   Jun = c(123, 59, 0, 126, 328, 563),
                   Jul = c(55, 998, 500, 56, 486, 698),
                   Aug = c(10, 0, 0, 0, 326, 631),
                   Sep = c(0, 546, 500, 0, 0, 230),
                   Oct = c(0, 980, 0, 0, 0, 53),
                   Nov = c(0, 0, 500, 0, 66, 0),
                   Dec = c(0, 987, 0, 0, 0, 0))
library(dplyr)  # provides %>% and mutate()
data_Clean <- data %>%
  mutate(Client_Stat = rowSums(data[, (ncol(data) - 2):ncol(data)])) %>%
  mutate(Client_Status = ifelse(Client_Stat < 1, "Left", "with us"))
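In this case, you will only get the clients who had no transactions in the last 3 months.
Explanation: we sum the last 3 columns and check the total. If the sum is greater than 0, the client is still with us; otherwise the client has left.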
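One caveat with summing: it only works as an "all zero" test if the values are never negative. The sketch below uses made-up figures, with -50 standing for a hypothetical refund, to show where the sum test and a count of non-zero entries disagree.

```r
# Made-up figures for the last three months; -50 is a hypothetical refund.
last3 <- rbind(
  quiet    = c(0, 0, 0),
  refunded = c(50, -50, 0)
)

sum_test   <- rowSums(last3) < 1        # same test as Client_Stat < 1
count_test <- rowSums(last3 != 0) == 0  # TRUE only if every entry is zero

sum_test    # both rows are flagged as "Left"
count_test  # only "quiet" is flagged
```

If the data can contain negative amounts, counting non-zero entries is the safer check.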
Hope this helps.