Get the mean of a column only for rows that meet a specific condition

Time: 2013-02-15 15:15:13

Tags: r dataframe

I have a data frame that looks like this:

id                              weekdays              halflife
241732222300860000  Friday, Aug 31, 2012, 22    0.4166666667
241689170123309000  Friday, Aug 31, 2012, 19    0.3833333333
241686878137512000  Friday, Aug 31, 2012, 19    0.4
241651117396738000  Friday, Aug 31, 2012, 16    1.5666666667
241635163505820000  Friday, Aug 31, 2012, 15    0.95
241633401382265000  Friday, Aug 31, 2012, 15    2.3666666667

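I want to get the mean half-life of the items created on Monday, then on Tuesday, and so on (my dates span more than 6 months). Please tell me how to provide reproducible code, as I could not find a way to attach a file.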

To get the date values I used strptime and difftime. Also, I found the maximum half-life with max(df$halflife); how can I find the id it corresponds to?

Reproducible code:

structure(list(id = c(241732222300860416, 241689170123309056, 
241686878137511936, 241651117396738048, 241635163505819648, 241633401382264832
), weekdays = c("Friday, Aug 31, 2012, 22", "Friday, Aug 31, 2012, 19", 
"Friday, Aug 31, 2012, 19", "Friday, Aug 31, 2012, 16", "Friday, Aug 31, 2012, 15", 
"Friday, Aug 31, 2012, 15"), halflife = structure(c(0.416666666666667, 
0.383333333333333, 0.4, 1.56666666666667, 0.95, 2.36666666666667
), class = "difftime", units = "mins")), .Names = c("id", 
"weekdays", "halflife"), row.names = c(NA, 6L), class = "data.frame")

1 answer:

Answer 0: (score: 4)

There may be a better way to extract the weekday, but you can use tapply like this (here df is the name of the data frame):

days <- sub(",.*$", "", df$weekdays)
tapply(df$halflife, days, mean)
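
As a self-contained sketch, the same grouping can be run on the sample data from the question. Converting the difftime column with as.numeric(..., units = "mins") is an assumption added here so the result is a plain numeric vector; the name avg is only illustrative. The sample contains Fridays only, so the result has a single entry; with the full six months of data there would be one mean per weekday.

```r
# Rebuild the two relevant columns of the question's sample data.
df <- data.frame(
  weekdays = c("Friday, Aug 31, 2012, 22", "Friday, Aug 31, 2012, 19",
               "Friday, Aug 31, 2012, 19", "Friday, Aug 31, 2012, 16",
               "Friday, Aug 31, 2012, 15", "Friday, Aug 31, 2012, 15"),
  halflife = as.difftime(c(0.416666666666667, 0.383333333333333, 0.4,
                           1.56666666666667, 0.95, 2.36666666666667),
                         units = "mins"),
  stringsAsFactors = FALSE
)

# Everything before the first comma is the weekday name.
days <- sub(",.*$", "", df$weekdays)

# Convert the difftime column to plain minutes, then average per weekday.
avg <- tapply(as.numeric(df$halflife, units = "mins"), days, mean)
avg
```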

To get the id of the maximum value, use which:

df$id[which(df$halflife==max(df$halflife))]
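
If only a single row is needed, which.max may be a convenient alternative. It returns the index of the first maximum, so unlike the which(... == max(...)) form it does not report ties. The sketch below uses the ids from the question's dput output and stores halflife as plain minutes for brevity; the names i and top_id are illustrative.

```r
# Sample ids and half-lives (in minutes) from the question.
df <- data.frame(
  id = c(241732222300860416, 241689170123309056, 241686878137511936,
         241651117396738048, 241635163505819648, 241633401382264832),
  halflife = c(0.416666666666667, 0.383333333333333, 0.4,
               1.56666666666667, 0.95, 2.36666666666667)
)

# Index of the first row holding the largest half-life.
i <- which.max(df$halflife)

# The id on that row; format() avoids scientific notation when printing.
top_id <- df$id[i]
format(top_id, scientific = FALSE)
```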