R: data frame - add rows with averages

Time: 2016-05-26 13:41:36

Tags: r dataframe

I have a data frame like this:

subject_id area side value confound1 confound2 confound3
s01 A left 5 154 952 no
s01 A right 7 154 952 no
s01 B left 15 154 952 no
s01 B right 17 154 952 no
s02 A left 3 130 870 yes
s02 A right 5 130 870 yes
s02 B left 12 130 870 yes
s02 B right 11 130 870 yes

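For each area of each subject, I want to add a row holding the average of the left and right values, while keeping the values of the other variables: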

subject_id area side value confound1 confound2 confound3
s01 A left 5 154 952 no
s01 A right 7 154 952 no
s01 A avg 6 154 952 no
s01 B left 15 154 952 no
s01 B right 17 154 952 no
s01 B avg 16 154 952 no
s02 A left 3 130 870 yes
s02 A right 5 130 870 yes
s02 A avg 4 130 870 yes
s02 B left 12 130 870 yes
s02 B right 11 130 870 yes
s02 B avg 11.5 130 870 yes

Any suggestions on how to do this?

4 answers:

Answer 0: (score: 3)

Here is a method using the base R functions aggregate and rbind.

# get the data (confound3 must be a factor for the aggregate() step below;
# since R 4.0 read.table() no longer creates factors by default)
df <- read.table(header=TRUE, stringsAsFactors=TRUE, text="subject_id area side value confound1 confound2 confound3
s01 A left 5 154 952 no
s01 A right 7 154 952 no
s01 B left 15 154 952 no
s01 B right 17 154 952 no
s02 A left 3 130 870 yes
s02 A right 5 130 870 yes
s02 B left 12 130 870 yes
s02 B right 11 130 870 yes")

# get the average values; cbind() turns the factor confound3 into its integer codes
dfAgg <- aggregate(cbind(value=value, confound1=confound1,
                         confound2=confound2, confound3=confound3) ~
                     subject_id + area, data=df, FUN=mean)
# add variables ("side.avg" labels the averaged rows)
dfAgg$side <- "side.avg"
# convert the averaged codes 1 and 2 back to the original labels
dfAgg$confound3 <- factor(dfAgg$confound3, labels=c("no", "yes"))

# rbind the averages
dfFinal <- rbind(df, dfAgg)

# order the data
dfFinal <- dfFinal[order(dfFinal$subject_id, dfFinal$area, dfFinal$side),]
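The aggregate call above averages confound3 by way of its factor codes, so it depends on that column being a factor. As a minimal sketch of a variant that avoids this, the confounds can be used as grouping variables instead, assuming they are constant within each subject and area (as they are in the example data). The names avg_rows, side_rank and result are only illustrative, and the label "avg" is taken from the desired output in the question.

```r
# Same example data as in the question; strings stay character vectors.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "subject_id area side value confound1 confound2 confound3
s01 A left 5 154 952 no
s01 A right 7 154 952 no
s01 B left 15 154 952 no
s01 B right 17 154 952 no
s02 A left 3 130 870 yes
s02 A right 5 130 870 yes
s02 B left 12 130 870 yes
s02 B right 11 130 870 yes")

# Assumption: the confounds are constant within each subject/area group,
# so they can act as grouping variables and pass through unchanged.
avg_rows <- aggregate(value ~ subject_id + area + confound1 + confound2 + confound3,
                      data = df, FUN = mean)
avg_rows$side <- "avg"

# rbind() matches data frame columns by name, so the column order may differ.
result <- rbind(df, avg_rows)

# Sort with an explicit side order so "avg" follows "left" and "right".
side_rank <- match(result$side, c("left", "right", "avg"))
result <- result[order(result$subject_id, result$area, side_rank), ]
rownames(result) <- NULL
print(result)
```

If a confound did vary within a group, this grouping would split that group into several average rows, so the assumption is worth checking on real data.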

Answer 1: (score: 2)

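Using the dplyr library, you can do the following (this adds the group mean as a new column rather than as extra rows):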

library(dplyr)
df %>% group_by(subject_id, area) %>% mutate(mean_left_right = mean(value))

The output is:

Source: local data frame [8 x 8]
Groups: subject_id, area [4]

  subject_id  area  side value confound1 confound2 confound3 mean_left_right
       <chr> <chr> <chr> <int>     <int>     <int>     <chr>           <dbl>
1        s01     A  left     5       154       952        no             6.0
2        s01     A right     7       154       952        no             6.0
3        s01     B  left    15       154       952        no            16.0
4        s01     B right    17       154       952        no            16.0
5        s02     A  left     3       130       870       yes             4.0
6        s02     A right     5       130       870       yes             4.0
7        s02     B  left    12       130       870       yes            11.5
8        s02     B right    11       130       870       yes            11.5
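Since the question asks for extra rows rather than an extra column, the same grouping idea can be extended to append the averages as rows. The following is a sketch only, assuming dplyr 1.0.0 or later (for the .groups argument of summarise) and assuming the confounds are constant within each subject and area; avg_rows and result are illustrative names.

```r
library(dplyr)

# Same example data as in the question.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "subject_id area side value confound1 confound2 confound3
s01 A left 5 154 952 no
s01 A right 7 154 952 no
s01 B left 15 154 952 no
s01 B right 17 154 952 no
s02 A left 3 130 870 yes
s02 A right 5 130 870 yes
s02 B left 12 130 870 yes
s02 B right 11 130 870 yes")

# One average row per subject/area; grouping by the confounds as well
# keeps their values in the summary (assumes they are constant per group).
avg_rows <- df %>%
  group_by(subject_id, area, confound1, confound2, confound3) %>%
  summarise(value = mean(value), .groups = "drop") %>%
  mutate(side = "avg")

# Append the averages and sort so "avg" follows "left" and "right".
result <- bind_rows(df, avg_rows) %>%
  arrange(subject_id, area, match(side, c("left", "right", "avg")))

print(result)
```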

Answer 2: (score: 1)

I would use tidyr to spread and gather your data.

library(dplyr)
library(tidyr)

df %>%
  spread(side, value) %>%
  mutate(avg = (left + right)/2) %>%
  gather(side, value, left:avg)

   subject_id area confound1 confound2 confound3  side value
1         s01    A       154       952        no  left   5.0
2         s01    B       154       952        no  left  15.0
3         s02    A       130       870       yes  left   3.0
4         s02    B       130       870       yes  left  12.0
5         s01    A       154       952        no right   7.0
6         s01    B       154       952        no right  17.0
7         s02    A       130       870       yes right   5.0
8         s02    B       130       870       yes right  11.0
9         s01    A       154       952        no   avg   6.0
10        s01    B       154       952        no   avg  16.0
11        s02    A       130       870       yes   avg   4.0
12        s02    B       130       870       yes   avg  11.5
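In current tidyr releases, spread and gather are superseded by pivot_wider and pivot_longer. A rough equivalent of the pipeline above, assuming tidyr 1.0.0 or later, might look like the sketch below; result is an illustrative name.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "subject_id area side value confound1 confound2 confound3
s01 A left 5 154 952 no
s01 A right 7 154 952 no
s01 B left 15 154 952 no
s01 B right 17 154 952 no
s02 A left 3 130 870 yes
s02 A right 5 130 870 yes
s02 B left 12 130 870 yes
s02 B right 11 130 870 yes")

result <- df %>%
  # one row per subject/area, with separate left and right columns
  pivot_wider(names_from = side, values_from = value) %>%
  # average the two sides
  mutate(avg = (left + right) / 2) %>%
  # back to long format: three rows (left, right, avg) per subject/area
  pivot_longer(cols = c(left, right, avg), names_to = "side", values_to = "value")

print(result)
```

Unlike gather, pivot_longer emits the rows of each subject and area together, so the result already follows the left, right, avg order shown in the question without a separate sorting step.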

Answer 3: (score: 1)

An option using middle:
<paper-toolbar class="toolbar" middle-justify="center">
     <span class="title">Toolbar</span>
     <paper-input class="middle" label="text input"></paper-input>
</paper-toolbar>