How to transform this data in R instead of Excel

Time: 2017-01-23 10:27:01

Tags: r

library(dplyr)  # provides %>%, arrange() and mutate() used below

# data.frame() renames the "2001" column to X2001 (check.names = TRUE by default)
data <- data.frame(Age = c("1-5", "1-5", "6-10", "6-10", "11-15", "11-15"),
                   Gender = rep(c("Male", "Female"), 3),
                   "2001" = c(10000, 9000, 15000, 14000, 17000, 15000))

data$x2002 <- data$X2001  + 1000

data

# Desired result, built by hand: one row per Age/Gender/year with N and its rank
data2 <- data.frame(Age = rep(data$Age, 2), Gender = rep(data$Gender, 2)) %>%
  arrange(Gender) %>%
  mutate(year = rep(c("2001", "2002"), each = 3, times = 2),
         rank = rep(seq(1, 3), times = 4))

data2 <- data2 %>%
  mutate(N = c(9000, 14000, 15000, 10000, 15000, 16000,
               10000, 15000, 17000, 11000, 16000, 18000))

data2

At the moment I am doing a lot of this by hand in Excel, and I am looking for a simpler solution if one exists.

1 answer:

Answer 0 (score: 2)

We can use gather to reshape the data into "long" format and then compute the rank within each Gender/year group:

library(dplyr)
library(tidyr)
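gather(data, year, N, X2001:x2002) %>%                # wide -> long: one row per Age/Gender/year
  mutate(year = as.numeric(substring(year, 2))) %>%   # drop the leading "X"/"x" from the column name
  group_by(Gender, year) %>%
  mutate(rank = dense_rank(N)) %>%                    # rank N within each Gender/year group
  arrange(Gender, year, rank)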
#     Age Gender  year     N  rank
#   <fctr> <fctr> <dbl> <dbl> <int>
#1     1-5 Female  2001  9000     1
#2    6-10 Female  2001 14000     2
#3   11-15 Female  2001 15000     3
#4     1-5 Female  2002 10000     1
#5    6-10 Female  2002 15000     2
#6   11-15 Female  2002 16000     3
#7     1-5   Male  2001 10000     1
#8    6-10   Male  2001 15000     2
#9   11-15   Male  2001 17000     3
#10    1-5   Male  2002 11000     1
#11   6-10   Male  2002 16000     2
#12  11-15   Male  2002 18000     3
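
In tidyr 1.0.0 and later, gather is superseded by pivot_longer. The following is a minimal sketch of the same pipeline in that style, assuming a recent tidyr and the data frame from the question; the name `result` and the trailing ungroup() are additions for illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Same input as in the question; check.names = TRUE turns "2001" into X2001
data <- data.frame(
  Age    = c("1-5", "1-5", "6-10", "6-10", "11-15", "11-15"),
  Gender = rep(c("Male", "Female"), 3),
  "2001" = c(10000, 9000, 15000, 14000, 17000, 15000)
)
data$x2002 <- data$X2001 + 1000

result <- data %>%
  # wide -> long: the two year columns become a year/N pair per row
  pivot_longer(c(X2001, x2002), names_to = "year", values_to = "N") %>%
  # strip the leading "X"/"x" and convert the year to a number
  mutate(year = as.numeric(substring(year, 2))) %>%
  # rank N within each Gender/year group
  group_by(Gender, year) %>%
  mutate(rank = dense_rank(N)) %>%
  ungroup() %>%
  arrange(Gender, year, rank)

print(result)
```

On R 4.0.0 and later, Age and Gender come out as character columns rather than factors, because stringsAsFactors now defaults to FALSE; the row order and ranks should be the same as in the output above.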