I am working with a dataset that contains several questionnaires which were supposed to be filled in at different time points, e.g.:
173 9/13/2013 10/29/2013 9/26/2014
174 10/21/2013 11/25/2013 11/3/2014
175 7/1/2014 7/3/2015 4/27/2016
176 1/15/2014 2/24/2014 6/10/2015
177 3/15/2014 4/1/2015
178 7/18/2014 9/18/2014 8/17/2015
179 6/30/2013 8/15/2013 7/15/2014
180 4/22/2013 6/24/2013 5/11/2014
181 12/7/2014 12/26/2015
182 4/2/2015 5/17/2015 4/20/2016
183 1/12/2015 2/26/2015 1/28/2016
184 7/18/2014 8/26/2014 8/14/2015
185 8/27/2013 10/19/2013 9/21/2014
186 10/29/2013 11/30/2013 11/6/2014
187 9/17/2014 11/18/2014 10/20/2015
188 5/10/2014 6/27/2014 6/1/2015
189 10/4/2013 10/5/2014
190 1/22/2013 4/11/2013
191 10/21/2014 10/21/2014
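I would like to know how to find out how many participants filled in all questionnaires on the same day, how many filled in at least 2 questionnaires on the same day, how many at least 3 on the same day, and so on. Any help would be highly appreciated.
Reproducible data: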
Label = c(
"1/25/2015", "1/25/2016", "1/26/2014", "1/26/2015", "1/27/2014",
"1/27/2015", "1/28/2014", "1/28/2015", "1/29/2015", "1/3/2014",
"1/3/2015", "1/3/2016", "1/30/2015", "1/31/2014", "1/4/2014",
"1/4/2015", "1/4/2016", "1/5/2014", "1/5/2015", "1/6/2014",
"1/6/2015", "1/7/2014", "1/7/2015", "1/8/2014", "1/8/2015",
"1/9/2014", "1/9/2015", "1/9/2016", "10/1/2012", "10/1/2013",
"10/1/2014", "10/1/2015", "10/10/2013", "10/10/2014", "10/11/2013",
"10/11/2014", "10/11/2015", "10/12/2013", "10/12/2014", "10/12/2015",
"10/13/2013", "10/13/2014", "10/13/2015", "10/14/2013", "10/14/2014",
"10/14/2015", "10/15/2014", "10/15/2015", "10/16/2013", "10/16/2014",
"10/16/2015", "10/17/2013", "10/17/2014", "10/17/2015", "10/18/2013",
"10/18/2014", "10/18/2015", "10/19/2013", "10/19/2014", "10/19/2015",
"10/2/2013", "10/2/2014", "10/20/2013", "10/20/2014", "10/20/2015",
"10/21/2013", "10/21/2014", "10/22/2013", "10/22/2014", "10/22/2015",
"10/23/2012", "10/23/2013", "10/23/2014", "10/23/2015", "10/24/2013",
"10/24/2014", "10/24/2015", "10/25/2013", "10/25/2014", "10/26/2013",
"10/26/2014", "10/26/2015", "10/27/2013", "10/27/2014", "10/27/2015",
"10/28/2013", "10/28/2014", "10/29/2013", "10/29/2014", "10/3/2014",
"10/3/2015", "10/30/2014", "10/31/2012", "10/31/2013", "10/31/2014",
"10/31/2015", "10/4/2013", "10/4/2014", "10/4/2015", "10/5/2014",
"10/5/2015", "10/6/2013", "10/6/2014", "10/6/2015", "10/7/2013",
"10/7/2014", "10/8/2012", "10/8/2014", "10/8/2015", "10/9/2013",
"10/9/2014", "10/9/2015", "11/1/2013", "11/1/2014", "11/1/2015",
class = "factor")
Label = c(
"4/6/2015", "4/7/2015", "4/9/2012", "5/12/2015", "5/13/2014",
"5/14/2015", "5/15/2014", "5/15/2015", "5/17/2014", "5/19/2014",
"5/20/2014", "5/25/2014", "5/27/2014", "5/29/2014", "5/30/2014",
"5/30/2015", "5/31/2015", "5/4/2014", "5/9/2015", "6/1/2015",
"6/10/2014", "6/11/2014", "6/11/2015", "6/12/2015", "6/16/2014",
"6/16/2015", "6/18/2014", "6/21/2014", "6/24/2015", "6/25/2014",
"6/25/2015", "6/26/2015", "6/27/2015", "6/29/2015", "6/5/2014",
"6/6/2015", "6/8/2014", "7/1/2014", "7/13/2014", "7/14/2015",
"7/16/2014", "7/2/2014", "7/21/2014", "7/25/2014", "7/27/2014",
"7/27/2015", "7/28/2014", "7/29/2014", "7/30/2014", "7/31/2014",
"7/31/2015", "7/4/2014", "7/4/2015", "8/1/2014", "8/11/2014",
"8/11/2015", "8/25/2014", "8/27/2015", "8/5/2014", "8/8/2014",
"8/9/2015", "9/1/2014", "9/10/2015", "9/15/2015", "9/22/2013",
"9/3/2012", "9/30/2014", "9/8/2014", "9/8/2015"), class = "factor")
Label = c(" ",
"1/16/2016", "1/26/2015", "10/11/2015", "10/14/2015", "10/16/2015",
"10/6/2014", "10/7/2013", "11/11/2015", "11/15/2015", "11/17/2013",
"11/18/2013", "11/2/2015", "11/20/2013", "11/29/2013", "2/17/2014",
"2/17/2015", "2/21/2015", "2/23/2014", "2/25/2014", "2/25/2015",
"3/11/2016", "3/2/2014", "3/22/2015", "3/4/2014", "3/4/2016",
"4/11/2014", "4/12/2013", "4/18/2016", "4/21/2015", "4/23/2015",
"4/29/2015", "4/3/2015", "4/5/2016", "5/23/2015", "5/26/2015",
"5/27/2015", "5/28/2015", "5/29/2014", "5/29/2015", "5/8/2015",
"6/16/2015", "6/22/2015", "6/28/2015", "7/24/2015", "7/27/2015",
"7/4/2014", "7/8/2015", "9/14/2015", "9/15/2015", "9/16/2014",
"9/17/2014", "9/22/2014", "9/23/2014", "9/24/2014", "9/24/2015",
"9/26/2014", "9/28/2015", "9/30/2015", "9/9/2015"), class = "factor")), .Names = c("1A_RespDate",
"1B_RespDate", "1C_1_RespDate", "1C_2_RespDate",
"1C_RespDate", "2A_1_RespDate", "2A_RespDate", "2B_RespDate",
"2C_RespDate"), row.names = c(NA, -4831L), class = "data.frame")
Answer 0 (score: 0)
I'll call your data frame df:
sapply(apply(df,1,unique),length)
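gives you, as a vector, the number of unique dates for each individual. The highest value is 7 and the lowest is 1 (all questionnaires answered on the same day).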
which(sapply(apply(df,1,unique),length)<7)
gives you the indices of the individuals who filled in at least 2 questionnaires on the same day.
length(which(sapply(apply(df,1,unique),length)<7))
tells you how many individuals filled in at least 2 questionnaires on the same day.
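As a small illustration of the unique-count idea, here is a sketch on a made-up data frame (`toy` and its column names are hypothetical, not the asker's data). It counts the unique values inside a single `apply` call and compares against `ncol(toy)` instead of a hard-coded 7, on the assumption that the threshold is meant to be the number of questionnaire columns. Note that a blank entry would count as one more distinct value with this approach.

```r
# Hypothetical toy data: 3 questionnaires, 4 participants
toy <- data.frame(
  q1 = c("1/1/2014", "1/1/2014", "1/1/2014", "2/1/2014"),
  q2 = c("1/1/2014", "1/1/2014", "3/5/2014", "4/1/2014"),
  q3 = c("1/1/2014", "6/9/2014", "7/7/2014", "4/1/2014"),
  stringsAsFactors = FALSE
)

# Number of distinct dates per participant (one value per row)
n_unique <- apply(toy, 1, function(x) length(unique(x)))

# Fewer distinct dates than columns means at least one date is repeated
repeated <- which(n_unique < ncol(toy))

# How many participants have at least 2 questionnaires on the same day
n_repeated <- length(repeated)
```

In this toy example only the third participant has three different dates, so the other three rows are the ones flagged.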
Edit: this is inelegant (there must be a cleaner way), but it seems to work:
which(sapply(sapply(sapply(apply(df,1,table),function(x) x==Z),which),function(x) any(x>0)))
Z should be set to the number of questionnaires filled in on the same day.
Explanation:
apply(df,1,table)
lists, for each individual, the unique dates and how many times each one occurs.
sapply(apply(df,1,table),function(x) x==Z)
gives the same list, but with TRUE/FALSE indicating whether each date occurs exactly Z times.
sapply(sapply(apply(df,1,table),function(x) x==Z),which)
将给出&#34; interger(0)&#34;或一个正整数,它是个人日期的索引(它不是我们感兴趣的东西)。
sapply(sapply(sapply(apply(df,1,table),function(x) x==Z),which),function(x) any(x>0))
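gives a TRUE/FALSE vector with one entry per individual. The final step, the outer "which", returns the indices of the TRUE entries. We thus obtain the individuals for whom some date occurs exactly Z times.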
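A possibly cleaner way to answer the "at least 2, at least 3, ..." question is to compute, for each individual, the largest number of questionnaires that share a single date, and then count how many individuals reach each threshold. The sketch below is an alternative under stated assumptions, not the method above: `max_same_day` is a hypothetical helper, the `toy` data is made up, and blank or `NA` entries are assumed to mean "questionnaire not filled in".

```r
# Hypothetical toy data; the blank in q1 stands for a missing questionnaire
toy <- data.frame(
  q1 = c("1/1/2014", "1/1/2014", "1/1/2014", " "),
  q2 = c("1/1/2014", "1/1/2014", "3/5/2014", "4/1/2014"),
  q3 = c("1/1/2014", "6/9/2014", "7/7/2014", "4/1/2014"),
  stringsAsFactors = FALSE
)

# Largest number of questionnaires sharing one date within a row
max_same_day <- function(x) {
  x <- trimws(as.character(x))
  x <- x[!is.na(x) & x != ""]      # drop missing / blank entries
  if (length(x) == 0) return(0L)   # nothing filled in at all
  max(table(x))                    # count of the most frequent date
}

m <- apply(toy, 1, max_same_day)

# Number of participants with at least k questionnaires on the same day
ks <- 2:ncol(toy)
at_least <- sapply(ks, function(k) sum(m >= k))
names(at_least) <- paste0(">=", ks)

# Participants who filled in every questionnaire on the same day
all_same <- sum(m == ncol(toy))
```

Here `m` plays the role of Z for each row, so `which(m >= k)` would give the row indices if those are needed rather than the counts.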