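Reshaping multiple variables in a data set

Time: 2016-12-23 04:03:35

Tags: r reshape

I would like to reshape the data set below into long format. I tried the melt function in reshape, but I could not figure out how to collapse age1, age2, and age3 into a single column. Can anyone help? Thanks.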

id <- c(1,2,3,4,5)
age1 <- c(11,11,11,11,11)
age2 <- age1+2 
age3 <- age2+2 
ht1 <- c(120,130,125,121,130)
ht2 <- ht1 + 20
ht3 <- ht2 + 20
bmi1 <- c(18,19,17,18,18)
bmi2 <- c(20,18,19,21,24)
bmi3 <- c(21,21,21,24,27)

df <- data.frame(id=id,age1=age1,age2=age2,age3=age3,ht1=ht1,ht2=ht2,ht3=ht3,bmi1=bmi1,bmi2=bmi2,bmi3=bmi3)

From

  id sex age1 age2 age3 ht1 ht2 ht3 bmi1 bmi2 bmi3
1  1   M   11   13   15 120 140 160   18   20   21
2  2   F   11   13   15 130 150 170   19   18   21
3  3   M   11   13   15 125 145 165   17   19   21
4  4   F   11   13   15 121 141 161   18   21   24
5  5   M   11   13   15 130 150 170   18   24   27

to something like this

id sex age ht   bmi
1  M   11  120  18
1  M   13  140  20
1  M   15  160  21
2  F   11  130  19
2  F   13  150  18
2  F   15  170  21
3  M   11  125  17
...

3 answers:

Answer 0 (score: 3)

Using dplyr and tidyr:

library(dplyr)
library(tidyr)
df %>%
  gather(key, value, -id) %>%
  extract(key, c("var", "num"), "(age|ht|bmi)([0-9]+)") %>%
  spread(var, value)
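
In tidyr 1.0.0 and later, gather() and spread() are superseded by pivot_longer(), which can perform the same reshape in a single step. The following is a minimal sketch rather than part of the original answer. It assumes the data frame from the question without a sex column; the column name `time` and the regular expression are illustrative choices.

```r
library(tidyr)

# Rebuild the example data from the question (no sex column).
df <- data.frame(
  id   = 1:5,
  age1 = 11, age2 = 13, age3 = 15,
  ht1  = c(120, 130, 125, 121, 130),
  ht2  = c(140, 150, 145, 141, 150),
  ht3  = c(160, 170, 165, 161, 170),
  bmi1 = c(18, 19, 17, 18, 18),
  bmi2 = c(20, 18, 19, 21, 24),
  bmi3 = c(21, 21, 21, 24, 27)
)

# ".value" means the first regex group (age, ht, bmi) names the output column;
# the second group (1, 2, 3) goes into the illustrative "time" column.
long <- pivot_longer(
  df,
  cols          = -id,
  names_to      = c(".value", "time"),
  names_pattern = "([a-z]+)([0-9]+)"
)
```

If the data also contains a sex column, as in the printed table in the question, it would need to be excluded from the reshape as well, for example with `cols = -c(id, sex)`.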

Answer 1 (score: 2)

We can do this with melt from data.table, whose measure argument can take multiple patterns:

library(data.table)
melt(setDT(df), measure = patterns("^age", "^ht", "^bmi"), 
      value.name = c("age", "ht", "bmi"))[, variable := NULL][]
#     id sex age  ht bmi
# 1:  1   M  11 120  18
# 2:  2   F  11 130  19
# 3:  3   M  11 125  17
# 4:  4   F  11 121  18
# 5:  5   M  11 130  18
# 6:  1   M  13 140  20
# 7:  2   F  13 150  18
# 8:  3   M  13 145  19
# 9:  4   F  13 141  21
#10:  5   M  13 150  24
#11:  1   M  15 160  21
#12:  2   F  15 170  21
#13:  3   M  15 165  21
#14:  4   F  15 161  24
#15:  5   M  15 170  27
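
The output above is grouped by measurement occasion, whereas the layout requested in the question is grouped by id. A possible follow-up, sketched here under the assumption that the sex values are those shown in the question's printed table, is to sort the melted result with setorder():

```r
library(data.table)

# Example data, including the sex column from the question's printed table.
dt <- data.table(
  id   = 1:5,
  sex  = c("M", "F", "M", "F", "M"),
  age1 = 11, age2 = 13, age3 = 15,
  ht1  = c(120, 130, 125, 121, 130),
  ht2  = c(140, 150, 145, 141, 150),
  ht3  = c(160, 170, 165, 161, 170),
  bmi1 = c(18, 19, 17, 18, 18),
  bmi2 = c(20, 18, 19, 21, 24),
  bmi3 = c(21, 21, 21, 24, 27)
)

# Melt the three groups of columns at once, one value column per pattern.
long <- melt(
  dt,
  measure.vars = patterns("^age", "^ht", "^bmi"),
  value.name   = c("age", "ht", "bmi")
)

# Drop the occasion index and sort by id, then age (both modify by reference).
long[, variable := NULL]
setorder(long, id, age)
```

Sorting by reference avoids copying the table, which matters mainly for large data sets; for five rows it is purely cosmetic.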

Answer 2 (score: 2)

Base R's reshape is enough:

v <- c('age', 'ht', 'bmi')
# varying is a list of column-index vectors, one per output variable in v
reshape(df, direction = 'long', varying = lapply(v, grep, names(df)), v.names = v)

#     id time age  ht bmi
# 1.1  1    1  11 120  18
# 2.1  2    1  11 130  19
# 3.1  3    1  11 125  17
# 4.1  4    1  11 121  18
# 5.1  5    1  11 130  18
# 1.2  1    2  13 140  20
# 2.2  2    2  13 150  18
# 3.2  3    2  13 145  19
# 4.2  4    2  13 141  21
# 5.2  5    2  13 150  24
# 1.3  1    3  15 160  21
# 2.3  2    3  15 170  21
# 3.3  3    3  15 165  21
# 4.3  4    3  15 161  24
# 5.3  5    3  15 170  27
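
To get the rows grouped by id, as in the layout the question asks for, the reshaped result can be reordered afterwards. This is a small base R sketch that assumes the data frame from the question without a sex column; resetting the row names is optional.

```r
# Example data from the question (no sex column).
df <- data.frame(
  id   = 1:5,
  age1 = 11, age2 = 13, age3 = 15,
  ht1  = c(120, 130, 125, 121, 130),
  ht2  = c(140, 150, 145, 141, 150),
  ht3  = c(160, 170, 165, 161, 170),
  bmi1 = c(18, 19, 17, 18, 18),
  bmi2 = c(20, 18, 19, 21, 24),
  bmi3 = c(21, 21, 21, 24, 27)
)

v <- c("age", "ht", "bmi")

# For each variable stem, grep() returns the indices of its wide columns.
long <- reshape(df, direction = "long",
                varying = lapply(v, grep, names(df)),
                v.names = v)

# Sort by id, then by measurement occasion, and drop the "1.1"-style row names.
long <- long[order(long$id, long$time), ]
rownames(long) <- NULL
```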