I have a data frame like this:
date apps long
10/22/2013 23:51 A 2
10/22/2013 23:52 B 3
10/22/2013 23:52 C 1
10/23/2013 7:03 C 5
10/23/2013 7:13 A 1
10/23/2013 7:31 B 4
10/23/2013 7:31 A 5
10/23/2013 7:31 B 2
10/24/2013 0:54 B 3
10/24/2013 1:16 C 2
10/24/2013 1:16 C 1
10/24/2013 3:27 A 2
10/24/2013 7:30 A 3
10/24/2013 7:30 A 1
My problem: I want to sum up how much time is spent in apps A, B, and C on each day, so the output would look like:
A 10/22/2013 2
A 10/23/2013 6
A 10/24/2013 6
etc...
I tried a few things, but none of them worked. Thanks.
Answer 0: (score: 2)
First, I'll assume your data.frame is called dd. Here it is in a copy/paste-able form:
dd <- structure(list(date = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L,
5L, 6L, 7L, 7L, 8L, 9L, 9L), .Label = c("10/22/2013 23:51", "10/22/2013 23:52",
"10/23/2013 7:03", "10/23/2013 7:13", "10/23/2013 7:31", "10/24/2013 0:54",
"10/24/2013 1:16", "10/24/2013 3:27", "10/24/2013 7:30"), class = "factor"),
apps = structure(c(1L, 2L, 3L, 3L, 1L, 2L, 1L, 2L, 2L, 3L,
3L, 1L, 1L, 1L), .Label = c("A", "B", "C"), class = "factor"),
long = c(2L, 3L, 1L, 5L, 1L, 4L, 5L, 2L, 3L, 2L, 1L, 2L,
3L, 1L)), .Names = c("date", "apps", "long"), class = "data.frame", row.names = c(NA,
-14L))
You should convert the date column to proper date-time values:
dd$date <- as.POSIXct(as.character(dd$date), format="%m/%d/%Y %H:%M", tz="GMT")
Then you can use aggregate to build a tidy data.frame, using as.Date here to drop the time of day:
aggregate(long ~ as.Date(date) + apps, dd, FUN=sum)
which returns
as.Date(date) apps long
1 2013-10-22 A 2
2 2013-10-23 A 6
3 2013-10-24 A 6
4 2013-10-22 B 3
5 2013-10-23 B 6
6 2013-10-24 B 3
7 2013-10-22 C 1
8 2013-10-23 C 5
9 2013-10-24 C 3
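If the `as.Date(date)` column header is awkward to work with downstream, one option is to compute the day in its own column before aggregating. This is only a sketch: the column name `day` and the helper variable `stamp` are introduced here for illustration, and the sample data is retyped from the question so the block runs on its own.

```r
# Self-contained sample mirroring the question's data (dates kept as strings).
dd <- data.frame(
  date = c("10/22/2013 23:51", "10/22/2013 23:52", "10/22/2013 23:52",
           "10/23/2013 7:03", "10/23/2013 7:13", "10/23/2013 7:31",
           "10/23/2013 7:31", "10/23/2013 7:31", "10/24/2013 0:54",
           "10/24/2013 1:16", "10/24/2013 1:16", "10/24/2013 3:27",
           "10/24/2013 7:30", "10/24/2013 7:30"),
  apps = c("A", "B", "C", "C", "A", "B", "A", "B", "B", "C", "C", "A", "A", "A"),
  long = c(2, 3, 1, 5, 1, 4, 5, 2, 3, 2, 1, 2, 3, 1),
  stringsAsFactors = FALSE
)

# Parse the timestamps, then keep only the calendar day in a new column.
# `stamp` and `day` are illustrative names, not part of the original answer.
stamp <- as.POSIXct(dd$date, format = "%m/%d/%Y %H:%M", tz = "GMT")
dd$day <- as.Date(stamp)

# Sum `long` per app per day; the result columns are named day, apps, long.
res <- aggregate(long ~ day + apps, data = dd, FUN = sum)

# Reorder rows and columns to match the layout asked for (app first, then day).
res <- res[order(res$apps, res$day), c("apps", "day", "long")]
print(res)
```

The sums are the same as above; only the column name and ordering differ.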
Answer 1: (score: 0)
I'm pretty sure this is a duplicate of something, but my first three searches came up empty, so here goes:
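# dat is the data frame from the question (called dd in Answer 0)
tapply(dat$long,
       list(dt  = format(as.POSIXct(as.character(dat$date),
                                    format = "%m/%d/%Y %H:%M", tz = "GMT"),
                         "%m/%d/%Y"),
            grp = dat$apps),
       sum)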
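Note that tapply with two grouping vectors returns a day-by-app matrix rather than the long, one-row-per-pair layout asked for in the question. A minimal sketch of reshaping it, assuming a shortened copy of the question's data (the first eight rows) and the illustrative names `day`, `m`, and `long_form`:

```r
# First eight rows of the question's data, retyped so the block is runnable.
dat <- data.frame(
  date = c("10/22/2013 23:51", "10/22/2013 23:52", "10/22/2013 23:52",
           "10/23/2013 7:03", "10/23/2013 7:13", "10/23/2013 7:31",
           "10/23/2013 7:31", "10/23/2013 7:31"),
  apps = c("A", "B", "C", "C", "A", "B", "A", "B"),
  long = c(2, 3, 1, 5, 1, 4, 5, 2),
  stringsAsFactors = FALSE
)

# Strip the time of day by parsing and re-formatting the timestamp.
day <- format(as.POSIXct(dat$date, format = "%m/%d/%Y %H:%M", tz = "GMT"),
              "%m/%d/%Y")

# tapply gives a matrix: one row per day (dt), one column per app (grp).
m <- tapply(dat$long, list(dt = day, grp = dat$apps), sum)
print(m)

# as.table + as.data.frame turns the matrix into one row per (day, app) pair.
long_form <- as.data.frame(as.table(m), responseName = "long")
print(long_form)
```

Day and app combinations with no rows would show up as NA in the matrix, so they may need filtering after the reshape.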
Answer 2: (score: 0)
Using dplyr on Mr.Flick's dd:
library(dplyr)
dd%>%
group_by(apps, date=gsub("\\s+.*","",date))%>%
summarize(long=sum(long))
# apps date long
# 1 A 10/22/2013 2
# 2 A 10/23/2013 6
# 3 A 10/24/2013 6
# 4 B 10/22/2013 3
# 5 B 10/23/2013 6
# 6 B 10/24/2013 3
# 7 C 10/22/2013 1
# 8 C 10/23/2013 5
# 9 C 10/24/2013 3