Sum by condition across two columns of a data frame in R

Time: 2014-07-02 05:32:39

Tags: r dataframe aggregate

I have a data frame like this:

date            apps    long
10/22/2013 23:51    A   2
10/22/2013 23:52    B   3
10/22/2013 23:52    C   1
10/23/2013 7:03     C   5
10/23/2013 7:13     A   1
10/23/2013 7:31     B   4
10/23/2013 7:31     A   5
10/23/2013 7:31     B   2
10/24/2013 0:54     B   3
10/24/2013 1:16     C   2
10/24/2013 1:16     C   1
10/24/2013 3:27     A   2
10/24/2013 7:30     A   3
10/24/2013 7:30     A   1

My problem: I want to total how much time is spent in apps A, B, and C each day. So the output would look like:

A 10/22/2013 2
A 10/23/2013 6
A 10/24/2013 6
etc...

I tried a few approaches, but none of them worked. Thanks.

3 answers:

Answer 0: (score: 2)

First, I'll assume your data.frame is called dd. Here it is in a copy/paste-able form:

dd <- structure(list(date = structure(c(1L, 2L, 2L, 3L, 4L, 5L, 5L, 
5L, 6L, 7L, 7L, 8L, 9L, 9L), .Label = c("10/22/2013 23:51", "10/22/2013 23:52", 
"10/23/2013 7:03", "10/23/2013 7:13", "10/23/2013 7:31", "10/24/2013 0:54", 
"10/24/2013 1:16", "10/24/2013 3:27", "10/24/2013 7:30"), class = "factor"), 
    apps = structure(c(1L, 2L, 3L, 3L, 1L, 2L, 1L, 2L, 2L, 3L, 
    3L, 1L, 1L, 1L), .Label = c("A", "B", "C"), class = "factor"), 
    long = c(2L, 3L, 1L, 5L, 1L, 4L, 5L, 2L, 3L, 2L, 1L, 2L, 
    3L, 1L)), .Names = c("date", "apps", "long"), class = "data.frame", row.names = c(NA, 
-14L))

You should convert your dates to proper date-time values:

dd$date <- as.POSIXct(as.character(dd$date), format="%m/%d/%Y %H:%M", tz="GMT")
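As a quick sanity check on that conversion, here is a minimal sketch of how a format mismatch shows up. The vector `raw` is made up for illustration (its third entry is deliberately invalid and is not from the question); a string that does not match the format comes back as NA rather than raising an error, so checking for NA after parsing is worthwhile.

```r
# Hypothetical input: two timestamps in the question's layout plus one bad value.
raw <- c("10/22/2013 23:51", "10/23/2013 7:03", "not a date")

# Parse as month/day/year hour:minute; tz is fixed so results do not depend on locale.
parsed <- as.POSIXct(raw, format = "%m/%d/%Y %H:%M", tz = "GMT")

# TRUE marks the entries that failed to parse.
is.na(parsed)

# The successfully parsed values, printed in an unambiguous layout.
format(parsed[1:2], "%Y-%m-%d %H:%M")
```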

Then you can use aggregate to create a nice data.frame, here using as.Date to drop the time of day:

aggregate(long ~ as.Date(date) + apps, dd, FUN=sum)

which returns

  as.Date(date) apps long
1    2013-10-22    A    2
2    2013-10-23    A    6
3    2013-10-24    A    6
4    2013-10-22    B    3
5    2013-10-23    B    6
6    2013-10-24    B    3
7    2013-10-22    C    1
8    2013-10-23    C    5
9    2013-10-24    C    3
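If the as.Date(date) column name is awkward to work with, one option is to compute the day in its own column first and aggregate on that. The following is only a sketch under that assumption: the column name `day` and the result name `res` are my own choices, and the data frame is rebuilt from the question's table so the block runs on its own.

```r
# The question's data, rebuilt as plain character/integer columns.
dd <- data.frame(
  date = c("10/22/2013 23:51", "10/22/2013 23:52", "10/22/2013 23:52",
           "10/23/2013 7:03", "10/23/2013 7:13", "10/23/2013 7:31",
           "10/23/2013 7:31", "10/23/2013 7:31", "10/24/2013 0:54",
           "10/24/2013 1:16", "10/24/2013 1:16", "10/24/2013 3:27",
           "10/24/2013 7:30", "10/24/2013 7:30"),
  apps = c("A", "B", "C", "C", "A", "B", "A", "B", "B", "C", "C", "A", "A", "A"),
  long = c(2L, 3L, 1L, 5L, 1L, 4L, 5L, 2L, 3L, 2L, 1L, 2L, 3L, 1L),
  stringsAsFactors = FALSE
)

# Parse the timestamps, then keep only the calendar day in a new column.
dd$date <- as.POSIXct(dd$date, format = "%m/%d/%Y %H:%M", tz = "GMT")
dd$day  <- as.Date(dd$date)

# One row per (day, apps) pair, with the summed time in 'long'.
res <- aggregate(long ~ day + apps, data = dd, FUN = sum)
res
```

The result has the columns day, apps, and long, so later code can refer to res$day directly.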

Answer 1: (score: 0)

I'm pretty sure this is a duplicate of something, but my first three searches failed to find it, so here goes:

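# dat is the question's data frame; the input format must match "10/22/2013 23:51"
tapply( dat$long, list(dt = format( as.POSIXct(as.character(dat$date),
                                               format="%m/%d/%Y %H:%M", tz="GMT"),
                                    "%d-%m-%Y"),
                        grp=dat$apps ),
                  sum)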
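Note that tapply with two grouping factors returns a matrix (days as rows, apps as columns) rather than the long layout asked for in the question. A minimal sketch of reshaping it, assuming the timestamps are in month/day/year order as in the question's table; the names `m` and `long_form` are my own:

```r
# The question's data under the name 'dat'.
dat <- data.frame(
  date = c("10/22/2013 23:51", "10/22/2013 23:52", "10/22/2013 23:52",
           "10/23/2013 7:03", "10/23/2013 7:13", "10/23/2013 7:31",
           "10/23/2013 7:31", "10/23/2013 7:31", "10/24/2013 0:54",
           "10/24/2013 1:16", "10/24/2013 1:16", "10/24/2013 3:27",
           "10/24/2013 7:30", "10/24/2013 7:30"),
  apps = c("A", "B", "C", "C", "A", "B", "A", "B", "B", "C", "C", "A", "A", "A"),
  long = c(2L, 3L, 1L, 5L, 1L, 4L, 5L, 2L, 3L, 2L, 1L, 2L, 3L, 1L),
  stringsAsFactors = FALSE
)

# Day label as day-month-year text, derived from the parsed timestamp.
day <- format(as.POSIXct(dat$date, format = "%m/%d/%Y %H:%M", tz = "GMT"),
              "%d-%m-%Y")

# Matrix of sums: one row per day, one column per app.
m <- tapply(dat$long, list(dt = day, grp = dat$apps), sum)

# Long layout: one row per (dt, grp) pair with the sum in 'long'.
long_form <- as.data.frame(as.table(m), responseName = "long")
long_form
```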

Answer 2: (score: 0)

Using dplyr on Mr.Flick's dd (as originally defined, before the date conversion):

library(dplyr)
dd %>%
  group_by(apps, date = gsub("\\s+.*", "", date)) %>%
  summarize(long = sum(long))
#      apps       date long
# 1    A 10/22/2013    2
# 2    A 10/23/2013    6
# 3    A 10/24/2013    6
# 4    B 10/22/2013    3
# 5    B 10/23/2013    6
# 6    B 10/24/2013    3
# 7    C 10/22/2013    1
# 8    C 10/23/2013    5
# 9    C 10/24/2013    3
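The grouping key above comes from the gsub call, which deletes everything from the first whitespace onward and so leaves only the date part as text. A small base-R sketch of that step on its own, using two timestamps from the question; the names `stamp`, `day_chr`, and `day` are my own, and the as.Date conversion is an optional extra in case real Date values (which sort chronologically across months and years) are preferred over strings:

```r
# Two timestamps in the question's layout.
stamp <- c("10/22/2013 23:51", "10/23/2013 7:03")

# Drop the first run of whitespace and everything after it.
day_chr <- gsub("\\s+.*", "", stamp)
day_chr

# Optional: turn the text into Date values instead of keeping strings.
day <- as.Date(day_chr, format = "%m/%d/%Y")
day
```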