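I need to know how many visitors never came back after 2 days. This is an analysis of first-time visitors. I have a 6-month period, from July to December, and anyone with Visit No = 1 within that period is considered a first-time visitor.
Suppose I have the following simple data frame: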
a <- data.frame("Date" = c("July 1, 2016", "July 1, 2016", "July 1, 2016", "July 2, 2016",
                           "July 2, 2016", "July 3, 2016", "July 3, 2016", "July 3, 2016",
                           "July 4, 2016", "July 5, 2016", "July 6, 2016"),
                "UserID" = c(1, 1, 2, 3, 1, 3, 2, 2, 2, 3, 3),
                "Visit No" = c(1, 2, 1, 1, 1, 4, 1, 1, 6, 7, 20))
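How can I find out how many first-time visitors did not come back after 2 days?
In my simple example, the first-time visitor who never came back after 2 days appears to be UserID 1, because he has not visited again since July 2, 2016.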
Answer 0 (score: 1)
library(lubridate)

a <- data.frame("Date" = c("July 1, 2016", "July 1, 2016", "July 1, 2016", "July 2, 2016",
                           "July 2, 2016", "July 3, 2016", "July 3, 2016", "July 3, 2016",
                           "July 4, 2016", "July 5, 2016", "July 6, 2016"),
                "UserID" = c(1, 1, 2, 3, 1, 3, 2, 2, 2, 3, 3),
                "Visit No" = c(1, 2, 1, 1, 1, 4, 1, 1, 6, 7, 20))
# Convert to POSIXct; the POSIXlt that strptime returns is awkward as a data frame column
a$ParsedDate <- as.POSIXct(strptime(a$Date, "%B %d, %Y", tz = "UTC"))

# Unique UserIDs to loop over
d <- unique(a$UserID)
for (i in seq_along(d)) {
  # Rows for this UserID
  adfPerUser <- a[a$UserID == d[i], ]
  # Cutoff: two days after this user's first visit
  cutoff <- min(adfPerUser$ParsedDate) + days(2)
  # Visits on or after the cutoff. A plain comparison is used instead of an
  # interval because the cutoff can fall after the user's last visit.
  adfPerUser2days <- adfPerUser[adfPerUser$ParsedDate >= cutoff, ]
  if (nrow(adfPerUser2days) >= 1) {
    # Print the UserID itself (d[i]), not the loop index
    cat(sprintf("User ID = %d and has visited at least once after two days from the first time visit\n", d[i]))
  }
}
Run the loop and check the output.
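The loop reports the users who did come back, while the question asks for those who did not. A loop-free sketch in base R follows. It assumes "did not return" means the user has no visit at least two days after their first visit, and the names `day_num`, `first_visit`, `last_visit`, `returned` and `not_returned_ids` are only illustrative. Parsing `%B` month names also assumes an English locale.

```r
# Same sample data as in the question (only the columns needed here)
a <- data.frame(
  Date = c("July 1, 2016", "July 1, 2016", "July 1, 2016", "July 2, 2016",
           "July 2, 2016", "July 3, 2016", "July 3, 2016", "July 3, 2016",
           "July 4, 2016", "July 5, 2016", "July 6, 2016"),
  UserID = c(1, 1, 2, 3, 1, 3, 2, 2, 2, 3, 3)
)

# Work with dates as day counts so tapply() returns plain numbers
day_num <- as.numeric(as.Date(a$Date, format = "%B %d, %Y"))

# First and last visit per user
first_visit <- tapply(day_num, a$UserID, min)
last_visit  <- tapply(day_num, a$UserID, max)

# A user "returned" if the last visit is at least 2 days after the first one
returned <- (last_visit - first_visit) >= 2

# Users who never came back after 2 days, and how many there are
not_returned_ids <- as.numeric(names(returned)[!returned])
print(not_returned_ids)          # 1
print(length(not_returned_ids))  # 1
```

For the real six-month data, the same idea should apply after restricting the rows to the users whose Visit No = 1 falls inside the period.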
Answer 1 (score: 0)
library(dplyr)
library(lubridate)
dt <- data.frame("Date" = c("July 1, 2016", "July 1, 2016", "July 1, 2016", "July 2, 2016",
                            "July 2, 2016", "July 3, 2016", "July 3, 2016", "July 3, 2016",
                            "July 4, 2016", "July 5, 2016", "July 6, 2016"),
                 "UserID" = c(1, 1, 2, 3, 1, 3, 2, 2, 2, 3, 3),
                 "Visit No" = c(1, 2, 1, 1, 1, 4, 1, 1, 6, 7, 20))

dt %>%
  mutate(Date = mdy(Date)) %>%   # update to date format
  group_by(UserID) %>%           # for each user id
  # get date of next visit; if there's no next visit, use the latest date in the dataset
  mutate(Date_Next = lead(Date, default = max(mdy(dt$Date))),
         # calculate difference between dates
         Date_Diff = as.numeric(difftime(Date_Next, Date, units = "days"))) %>%
  ungroup() %>%                  # forget the grouping
  filter(Date_Diff > 2)          # return cases where difference is more than 2 days

# # A tibble: 1 × 5
#   Date       UserID Visit.No Date_Next  Date_Diff
#   <date>      <dbl>    <dbl> <date>         <dbl>
# 1 2016-07-02      1        1 2016-07-06         4
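This process returns the CASES where a user did not come back within 2 days, not the USERS. If a user repeatedly comes back after gaps of 3 or more days, they will appear more than once, so you may need to take the unique user IDs from this output.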
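One way to get from cases to users is sketched below, reusing the same pipeline and reducing it with `distinct()` and `pull()`. The names `last_date`, `gaps` and `users_not_back` are illustrative, and the `arrange()` step is an added assumption so that `lead()` sees each user's visits in date order even if the raw data is unsorted.

```r
library(dplyr)
library(lubridate)

dt <- data.frame(
  Date = c("July 1, 2016", "July 1, 2016", "July 1, 2016", "July 2, 2016",
           "July 2, 2016", "July 3, 2016", "July 3, 2016", "July 3, 2016",
           "July 4, 2016", "July 5, 2016", "July 6, 2016"),
  UserID = c(1, 1, 2, 3, 1, 3, 2, 2, 2, 3, 3)
)

# Latest date in the dataset, used when a visit has no following visit
last_date <- max(mdy(dt$Date))

# Cases: visits followed by a gap of more than 2 days
gaps <- dt %>%
  mutate(Date = mdy(Date)) %>%
  arrange(UserID, Date) %>%
  group_by(UserID) %>%
  mutate(Date_Next = lead(Date, default = last_date),
         Date_Diff = as.numeric(difftime(Date_Next, Date, units = "days"))) %>%
  ungroup() %>%
  filter(Date_Diff > 2)

# Users: each UserID counted once, however many such gaps it has
users_not_back <- gaps %>%
  distinct(UserID) %>%
  pull(UserID)

print(users_not_back)          # 1
print(length(users_not_back))  # 1
```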