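Unfortunately, I have to completely re-edit this post, because the data snippet I originally suggested did not fully reflect my dataset. Basically, my dataset looks like this: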
date id person member_since consistent
May 2015 2732 3041 2008-09-17 Yes
Jul 2015 2732 3041 2008-09-17 Yes
Sep 2015 2732 3041 2008-09-17 Yes
Nov 2015 2732 3041 2008-09-17 Yes
Jan 2016 2732 3041 2008-09-17 Yes
May 2016 2732 3041 2008-09-17 Yes
Jul 2016 2732 3041 2008-09-17 Yes
Aug 2016 2732 3041 2008-09-17 Yes
Mar 2017 2732 3041 2008-09-17 Yes
Apr 2017 2732 3041 2008-09-17 Yes
May 2017 2732 3041 2008-09-17 Yes
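I now want to create monthly entries going back to the month of "member_since", keeping everything else unchanged (see below). In addition, there will be some rows flagged consistent = No (not shown here), and these should be left as they are.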
date id person member_since consistent
Sep 2008 2732 3041 2008-09-17 Yes
Oct 2008 2732 3041 2008-09-17 Yes
Nov 2008 2732 3041 2008-09-17 Yes
.
.
.
May 2015 2732 3041 2008-09-17 Yes
Jul 2015 2732 3041 2008-09-17 Yes
Sep 2015 2732 3041 2008-09-17 Yes
Nov 2015 2732 3041 2008-09-17 Yes
Jan 2016 2732 3041 2008-09-17 Yes
May 2016 2732 3041 2008-09-17 Yes
Jul 2016 2732 3041 2008-09-17 Yes
Aug 2016 2732 3041 2008-09-17 Yes
Mar 2017 2732 3041 2008-09-17 Yes
Apr 2017 2732 3041 2008-09-17 Yes
May 2017 2732 3041 2008-09-17 Yes
Could you help me with the code?
I think the code proposed below is on the right track, but it does not yet produce the desired result.
library(dplyr)
library(zoo)
monthly_series <- (seq.Date(min(as.Date(df$member_since)),
as.Date("2017-05-01"),by="months"))
df2 <-
df%>%
group_by(Person)%>%
merge(monthly_series)%>%
dplyr::filter((consistent=="Yes"&as.yearmon(as.Date(member_since))<=as.yearmon(y))|
(consistent=="No"& as.yearmon(as.Date(member_since))==as.yearmon(y))) %>%
mutate(Month=as.yearmon(y)) %>%
select(-y) %>%
arrange(Person)
Many thanks in advance!
Answer 0 (score: 1)
Here is an idea via base R:
#convert variables to dates and characters,
df$Member_since <- as.Date(df$Member_since, format = '%Y-%m-%d')
df[c(1, 3)] <- lapply(df[c(1,3)], as.character)
str(df)
# 'data.frame': 3 obs. of 3 variables:
# $ Person : chr "A" "B" "C"
# $ Member_since: Date, format: "2013-08-04" "2011-12-23" "2014-09-03"
# $ Applicable : chr "Yes" "No" "Yes"
#create a list with monthly sequence of dates
#(lapply always returns a list, even if all sequences happen to have the same length)
l1 <- lapply(df$Member_since[df$Applicable == 'Yes'], function(i)
seq.Date(i, as.Date('2017-05-01'), by = 'month'))
#Create an extra variable based on the lengths of l1 to act as counter for expanding the data frame
df$cnt[df$Applicable == 'Yes'] <- lengths(l1)
#Change NA to 1 so it does not expand
df[is.na(df)] <- 1
#expand data frame
df1 <- df[rep(rownames(df), df$cnt),]
#create your Month variable by substituting the dates from l1
df1$Month[df1$Applicable == 'Yes'] <- format(do.call(c, l1), format = '%b %Y')
#take care of the NAs emerged from applicable being NO
df1$Month <- mapply(function(x, y)
replace(x, is.na(x), format(y, '%b %Y')), df1$Month, df1$Member_since)
#drop unwanted columns
df1$cnt <- NULL
head(df1)
# Person Member_since Applicable Month
#1 A 2013-08-04 Yes Aug 2013
#1.1 A 2013-08-04 Yes Sep 2013
#1.2 A 2013-08-04 Yes Oct 2013
#1.3 A 2013-08-04 Yes Nov 2013
#1.4 A 2013-08-04 Yes Dec 2013
#1.5 A 2013-08-04 Yes Jan 2014
tail(df1)
# Person Member_since Applicable Month
#3.26 C 2014-09-03 Yes Nov 2016
#3.27 C 2014-09-03 Yes Dec 2016
#3.28 C 2014-09-03 Yes Jan 2017
#3.29 C 2014-09-03 Yes Feb 2017
#3.30 C 2014-09-03 Yes Mar 2017
#3.31 C 2014-09-03 Yes Apr 2017
Data
dput(df)
structure(list(Person = structure(1:3, .Label = c("A", "B", "C"
), class = "factor"), Member_since = structure(c(2L, 1L, 3L), .Label = c("2011-12-23",
"2013-08-04", "2014-09-03"), class = "factor"), Applicable = structure(c(2L,
1L, 2L), .Label = c("No", "Yes"), class = "factor")), .Names = c("Person",
"Member_since", "Applicable"), class = "data.frame", row.names = c(NA,
-3L))
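The row-expansion step in this answer (repeating each row according to a counter column) can be seen in isolation with a minimal sketch. The data frame `toy` and its columns `id` and `n` are made up for illustration and are not part of the original answer.

```r
# Hypothetical toy data: each row should be repeated n times
toy <- data.frame(id = c("a", "b"), n = c(3, 1), stringsAsFactors = FALSE)

# Repeat each row index n times, then subset the data frame by that index
expanded <- toy[rep(seq_len(nrow(toy)), toy$n), ]

print(expanded)
```

Rows whose counter is 1 are kept once, which is how the "No" rows in the answer stay unexpanded.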
Answer 1 (score: 1)
Here is how to do this with dplyr, with the help of yearmon from the zoo package.
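Basically, the idea is to create a monthly date vector with seq.Date and merge it with the original data.frame. You can then filter out the unwanted rows. I am using dplyr::filter because filter is masked by another package on my machine. On yours, filter alone should be enough.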
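As a minimal base R sketch of the merge step described here: when the two inputs to merge() share no column names, the result is a Cartesian product, that is, every row paired with every month. The names `people`, `months`, and `crossed` are illustrative assumptions, not taken from the answer.

```r
# Hypothetical inputs: two people and three monthly dates
people <- data.frame(Person = c("A", "B"), stringsAsFactors = FALSE)
months <- seq.Date(as.Date("2017-01-01"), as.Date("2017-03-01"), by = "month")

# merge() coerces the bare date vector to a one-column data frame;
# with no shared column names, every person is paired with every month
crossed <- merge(people, months)

print(nrow(crossed))  # 2 people x 3 months = 6 rows
```

The filter step then only has to drop the pairs whose month falls before the person's Member_since month.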
library(dplyr);library(zoo)
df <- read.table(text="Person Member_since Applicable
A 2013-08-04 Yes
B 2011-12-23 No
C 2014-09-03 Yes",header=TRUE,stringsAsFactors=FALSE)
monthly_series <- (seq.Date(min(as.Date(df$Member_since)),
as.Date("2017-05-01"),by="months"))
df%>%
group_by(Person)%>%
merge(monthly_series)%>%
dplyr::filter((Applicable=="Yes"&as.yearmon(as.Date(Member_since))<=as.yearmon(y))|
(Applicable=="No"& as.yearmon(as.Date(Member_since))==as.yearmon(y))) %>%
mutate(Month=as.yearmon(y)) %>%
select(-y) %>%
arrange(Person)
Person Member_since Applicable Month
1 A 2013-08-04 Yes Aug 2013
2 A 2013-08-04 Yes Sep 2013
3 A 2013-08-04 Yes Oct 2013
4 A 2013-08-04 Yes Nov 2013
5 A 2013-08-04 Yes Dec 2013
6 A 2013-08-04 Yes Jan 2014
7 A 2013-08-04 Yes Feb 2014
8 A 2013-08-04 Yes Mar 2014
9 A 2013-08-04 Yes Apr 2014
10 A 2013-08-04 Yes May 2014
11 A 2013-08-04 Yes Jun 2014
12 A 2013-08-04 Yes Jul 2014
13 A 2013-08-04 Yes Aug 2014
14 A 2013-08-04 Yes Sep 2014
15 A 2013-08-04 Yes Oct 2014
16 A 2013-08-04 Yes Nov 2014
17 A 2013-08-04 Yes Dec 2014
18 A 2013-08-04 Yes Jan 2015
19 A 2013-08-04 Yes Feb 2015
20 A 2013-08-04 Yes Mar 2015
21 A 2013-08-04 Yes Apr 2015
22 A 2013-08-04 Yes May 2015
23 A 2013-08-04 Yes Jun 2015
24 A 2013-08-04 Yes Jul 2015
25 A 2013-08-04 Yes Aug 2015
26 A 2013-08-04 Yes Sep 2015
27 A 2013-08-04 Yes Oct 2015
28 A 2013-08-04 Yes Nov 2015
29 A 2013-08-04 Yes Dec 2015
30 A 2013-08-04 Yes Jan 2016
31 A 2013-08-04 Yes Feb 2016
32 A 2013-08-04 Yes Mar 2016
33 A 2013-08-04 Yes Apr 2016
34 A 2013-08-04 Yes May 2016
35 A 2013-08-04 Yes Jun 2016
36 A 2013-08-04 Yes Jul 2016
37 A 2013-08-04 Yes Aug 2016
38 A 2013-08-04 Yes Sep 2016
39 A 2013-08-04 Yes Oct 2016
40 A 2013-08-04 Yes Nov 2016
41 A 2013-08-04 Yes Dec 2016
42 A 2013-08-04 Yes Jan 2017
43 A 2013-08-04 Yes Feb 2017
44 A 2013-08-04 Yes Mar 2017
45 A 2013-08-04 Yes Apr 2017
46 B 2011-12-23 No Dec 2011
47 C 2014-09-03 Yes Sep 2014
48 C 2014-09-03 Yes Oct 2014
49 C 2014-09-03 Yes Nov 2014
50 C 2014-09-03 Yes Dec 2014
51 C 2014-09-03 Yes Jan 2015
52 C 2014-09-03 Yes Feb 2015
53 C 2014-09-03 Yes Mar 2015
54 C 2014-09-03 Yes Apr 2015
55 C 2014-09-03 Yes May 2015
56 C 2014-09-03 Yes Jun 2015
57 C 2014-09-03 Yes Jul 2015
58 C 2014-09-03 Yes Aug 2015
59 C 2014-09-03 Yes Sep 2015
60 C 2014-09-03 Yes Oct 2015
61 C 2014-09-03 Yes Nov 2015
62 C 2014-09-03 Yes Dec 2015
63 C 2014-09-03 Yes Jan 2016
64 C 2014-09-03 Yes Feb 2016
65 C 2014-09-03 Yes Mar 2016
66 C 2014-09-03 Yes Apr 2016
67 C 2014-09-03 Yes May 2016
68 C 2014-09-03 Yes Jun 2016
69 C 2014-09-03 Yes Jul 2016
70 C 2014-09-03 Yes Aug 2016
71 C 2014-09-03 Yes Sep 2016
72 C 2014-09-03 Yes Oct 2016
73 C 2014-09-03 Yes Nov 2016
74 C 2014-09-03 Yes Dec 2016
75 C 2014-09-03 Yes Jan 2017
76 C 2014-09-03 Yes Feb 2017
77 C 2014-09-03 Yes Mar 2017
78 C 2014-09-03 Yes Apr 2017
Answer 2 (score: 1)
You should check out this package: https://cran.r-project.org/web/packages/padr/vignettes/padr.html
Your answer would look something like this:
df2 <- df %>% filter(Applicable == "Yes") %>% pad(group = "Person", start_val = Member_since, end_val = "2017-05-04", interval = "month")
Then I would do
append_df <- df %>% filter(Applicable == "No")
so that you have all the "No" rows in a separate dataset.
Finally, you can add them back by appending append_df to df2. This should get you on the right path.
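The split, expand, and re-append pattern described in this answer can be sketched in base R without padr. This is a stand-in under stated assumptions, not the padr implementation: the two-row data frame, the `Month` column, and the "%Y-%m" label format are made up for illustration, and only the end date "2017-05-04" comes from the answer.

```r
# Hypothetical two-row example: one "Yes" row to expand, one "No" row to keep
df <- data.frame(
  Person = c("A", "B"),
  Member_since = as.Date(c("2017-02-04", "2011-12-23")),
  Applicable = c("Yes", "No"),
  stringsAsFactors = FALSE
)
end_date <- as.Date("2017-05-04")

# Split the data by the Applicable flag
yes_rows <- df[df$Applicable == "Yes", ]
no_rows  <- df[df$Applicable == "No", ]

# Expand each "Yes" row to one row per month (base R stand-in for the padding step)
expanded <- do.call(rbind, lapply(seq_len(nrow(yes_rows)), function(i) {
  row <- yes_rows[i, ]
  months <- seq.Date(row$Member_since, end_date, by = "month")
  data.frame(row[rep(1, length(months)), ],
             Month = format(months, "%Y-%m"),
             row.names = NULL, stringsAsFactors = FALSE)
}))

# "No" rows keep a single entry for their own starting month
no_rows$Month <- format(no_rows$Member_since, "%Y-%m")

# Re-append the untouched rows
result <- rbind(expanded, no_rows)
print(result)
```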
Answer 3 (score: 0)
Here is a plyr alternative to the many options you already have. Your example data:
library(plyr)
df <- read.table( text = "Person Member_since Applicable
A 2013-08-04 Yes
B 2011-12-23 No
C 2014-09-03 Yes", header = TRUE)
Make sure Member_since is in Date format:
df$Member_since <- as.Date(df$Member_since)
Use adply to iterate over the rows: if Applicable is "Yes", build new rows with consecutive months; if Applicable is "No", keep the original row.
df2<-adply(df, 1, function(x) {
if(x$Applicable == "Yes"){
data.frame(x, month = format(
seq.Date(as.Date(x$Member_since), as.Date("2017-06-01"), by = "month"),
"%b %Y"))
}
else{
data.frame(x, month = format(x$Member_since, "%b %Y"))
}
}
)
> head(df2);tail(df2)
Person Member_since Applicable month
1 A 2013-08-04 Yes Aug 2013
2 A 2013-08-04 Yes Sep 2013
3 A 2013-08-04 Yes Oct 2013
4 A 2013-08-04 Yes Nov 2013
5 A 2013-08-04 Yes Dec 2013
6 A 2013-08-04 Yes Jan 2014
Person Member_since Applicable month
75 C 2014-09-03 Yes Dec 2016
76 C 2014-09-03 Yes Jan 2017
77 C 2014-09-03 Yes Feb 2017
78 C 2014-09-03 Yes Mar 2017
79 C 2014-09-03 Yes Apr 2017
80 C 2014-09-03 Yes May 2017
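The row counts in the outputs above can be cross-checked by counting the months directly. The helper `n_months` below is a hypothetical name introduced for this sketch. Note that this answer uses "2017-06-01" as the end date, whereas the earlier answers use "2017-05-01", which is why its output runs one month longer per person.

```r
# Hypothetical helper: number of monthly steps from start up to (and including) end
n_months <- function(start, end) {
  length(seq.Date(as.Date(start), as.Date(end), by = "month"))
}

# End date used in the base R and dplyr answers
a_may <- n_months("2013-08-04", "2017-05-01")  # 45 (Aug 2013 to Apr 2017)
c_may <- n_months("2014-09-03", "2017-05-01")  # 32 (Sep 2014 to Apr 2017)

# End date used in the plyr answer
a_jun <- n_months("2013-08-04", "2017-06-01")  # 46 (Aug 2013 to May 2017)
c_jun <- n_months("2014-09-03", "2017-06-01")  # 33 (Sep 2014 to May 2017)

# Person B (Applicable == "No") always contributes exactly one row
total_may <- a_may + 1 + c_may  # 78 rows, as in the dplyr output
total_jun <- a_jun + 1 + c_jun  # 80 rows, as in the plyr output
print(c(total_may, total_jun))
```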