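How to use an apply function once for each unique factor value

Time: 2015-07-18 09:37:41

Tags: r

I was trying out some commands on ChickWeight, a dataset built into RStudio. The data looks like this.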

   weight Time Chick Diet
1      42    0     1    1
2      51    2     1    1
3      59    4     1    1
4      64    6     1    1
5      76    8     1    1
6      93   10     1    1
7     106   12     1    1
8     125   14     1    1
9     149   16     1    1
10    171   18     1    1
11    199   20     1    1
12    205   21     1    1
13     40    0     2    1
14     49    2     2    1
15     58    4     2    1

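What I want to do now is simply output, for each chick in the "Chick" column, the difference in weight between Time 0 and Time 21 (the last time value), i.e. the chick's weight gain.

I have been trying tapply(ChickWeight$weight, ChickWeight$Chick, function(x) x[length(x)] - x[1]), but this of course applies the function across all rows.

How can I make it apply only once for each unique Chick value?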

2 Answers:

Answer 0: (score: 3)

If we need one value for each combination of the factor columns (assuming 'Chick' and 'Diet' are the factor columns)

library(data.table)
setDT(df1)[, list(Diff= abs(weight[Time==21]-weight[Time==0])) ,.(Chick, Diet)]

If we need to create a column instead

 setDT(df1)[,  Diff:= abs(weight[Time==21]-weight[Time==0]) ,.(Chick, Diet)]

I noticed that chick no. 2 in the example has no row with Time = 21, so in that case we may need one of the following

setDT(df1)[, {tmp <- Time %in% c(0,21)
  list(Diff= if(sum(tmp)>1) abs(diff(weight[tmp])) else weight[tmp]) } ,
                by =  .(Chick, Diet)]
#    Chick Diet Diff
#1:     1    1  163
#2:     2    1   40
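To see which chicks lack a Time = 21 record before choosing between these variants, something like the following base R sketch may help. It runs on the full built-in ChickWeight data rather than on df1 (an assumption, since df1 is not defined in this answer), and the names has_21 and missing_21 are made up for illustration.

```r
# TRUE for each chick that has a measurement at Time 21
has_21 <- tapply(ChickWeight$Time, ChickWeight$Chick,
                 function(t) any(t == 21))

# Labels of the chicks without a Time 21 record
missing_21 <- names(has_21)[!has_21]
missing_21
```

The chicks listed in missing_21 are the groups for which weight[Time==21] would come back empty.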

If we are taking the difference of 'weight' based on the max and min 'Time' for each group

 setDT(df1)[, list(Diff=weight[which.max(Time)]- 
                weight[which.min(Time)]), .(Chick, Diet)]
 #   Chick Diet Diff
 #1:     1    1  163
 #2:     2    1   18

Also, if 'Time' is ordered within each group

setDT(df1)[, list(Diff= abs(diff(weight[c(1L,.N)]))), by =.(Chick, Diet)]
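The first-row/last-row shortcut only gives the right answer when 'Time' really is ascending within each chick. A quick base R way to check that, sketched here on the full ChickWeight data (the name sorted_within is hypothetical):

```r
# For each chick, is its Time column already in non-decreasing order?
sorted_within <- tapply(ChickWeight$Time, ChickWeight$Chick,
                        function(t) !is.unsorted(t))

# TRUE only if every chick's rows are ordered by Time
all(sorted_within)
```

If this returns FALSE for your own data, sort by the grouping columns and 'Time' first, or use the which.max/which.min version instead.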

Using by from base R

  by(df1[1:2], df1[3:4], FUN= function(x) with(x, 
      abs(weight[which.max(Time)]-weight[which.min(Time)])))
  #Chick: 1
  #Diet: 1
  #[1] 163
  #------------------------------------------------------------ 
  #Chick: 2
  #Diet: 1
  #[1] 18
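The snippets in this answer refer to df1 without defining it. Below is a minimal base R sketch, assuming df1 is the 15-row excerpt of ChickWeight printed in the question (my assumption, not stated in the answer), that computes the same max/min 'Time' difference per chick. The name diff_by_chick is made up for illustration.

```r
# Assumption: df1 is the first 15 rows of ChickWeight, as shown in the question
df1 <- droplevels(head(as.data.frame(ChickWeight), 15))

# One value per chick: weight at the latest Time minus weight at the earliest Time
diff_by_chick <- sapply(split(df1, df1$Chick), function(d) {
  d$weight[which.max(d$Time)] - d$weight[which.min(d$Time)]
})
diff_by_chick
```

droplevels() is there so that split() does not create empty groups for the chicks that are absent from the excerpt.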

答案 1 :(得分:2)

Here is a solution using dplyr:

library(dplyr)
ChickWeight %>%
  group_by(Chick = as.numeric(as.character(Chick))) %>%
  summarise(weight_gain = last(weight) - first(weight), final_time = last(Time))

(first and last were suggested by @ulfelder.)

Note that ChickWeight$Chick is an ordered factor, so unless you coerce it to numeric, the order of the result looks odd.

Using base R:
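A small sketch of what that coercion does, using only base R. The variable names chick_codes and chick_num are hypothetical.

```r
# Chick is an ordered factor; its level order is not 1, 2, 3, ...
is.ordered(ChickWeight$Chick)
head(levels(ChickWeight$Chick))

# as.numeric() directly on a factor returns the internal level codes,
# not the chick labels
chick_codes <- as.numeric(ChickWeight$Chick)

# Going through as.character() first recovers the labels as numbers
chick_num <- as.numeric(as.character(ChickWeight$Chick))
range(chick_num)
```

This is why the answer wraps the column in as.character() before as.numeric() instead of converting the factor directly.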

ChickWeight$Chick <- as.numeric(as.character(ChickWeight$Chick))
tapply(ChickWeight$weight, ChickWeight$Chick, function(x) x[length(x)] - x[1])
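To confirm that tapply calls the function once per chick rather than once per row, you can compare the length of the result with the number of distinct chicks. A minimal sketch (the name gain is made up for illustration):

```r
# Last recorded weight minus first recorded weight, one value per chick
gain <- tapply(ChickWeight$weight, ChickWeight$Chick,
               function(x) x[length(x)] - x[1])

# One result per distinct chick, not one per row
length(gain) == length(unique(ChickWeight$Chick))
length(gain) < nrow(ChickWeight)

# Look up a single chick by its label
gain[["1"]]
```

Note that this takes the last recorded weight for each chick, which is not necessarily the Time 21 weight for chicks with fewer measurements.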