I have a table that looks like this:
Year Tax1 Tax2 Tax3 Tax4
2004 12 123 145 104
2004 145 99 90 56
2005 212 300 240 123
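etc...
The Tax# columns give the tax paid in consecutive years, starting from the year in the Year column: Tax1 is the tax paid in Year itself, Tax2 in the following year, and so on. I would like to rearrange the table and rename the columns so that it looks like this: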
Year Tax2004 Tax2005 Tax2006 Tax2007 Tax2008
2004 12 123 145 104 NA
2004 145 99 90 56 NA
2005 NA 212 300 240 123
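I was thinking of splitting the table into separate tables by the Year column, renaming the Tax# columns in each, and then merging them back together. That seems rather convoluted, though, so I was wondering whether there is a simpler way to do this?
Any help is much appreciated.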
Answer 0 (score: 4)
library(dplyr)
library(tidyr)
df <- read.table(text = "
Year Tax1 Tax2 Tax3 Tax4
2004 12 123 145 104
2004 145 99 90 56
2005 212 300 240 123
", header = TRUE)
df %>%
  mutate(id = row_number()) %>%                    # keep rows distinct (Year repeats)
  gather(rel_year, amount, contains("Tax")) %>%    # wide -> long
  mutate(rel_year = as.integer(gsub("Tax", "", rel_year)),
         pay_year = Year + rel_year - 1,           # Tax1 is paid in Year itself
         pay_year = paste0("Tax", pay_year)) %>%
  select(-rel_year) %>%
  spread(pay_year, amount)                         # long -> wide, NA where no payment
Result:
Year id Tax2004 Tax2005 Tax2006 Tax2007 Tax2008
1 2004 1 12 123 145 104 NA
2 2004 2 145 99 90 56 NA
3 2005 3 NA 212 300 240 123
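As a side note, in tidyr 1.0.0 and later `gather()` and `spread()` are superseded by `pivot_longer()` and `pivot_wider()`. Below is a rough sketch of the same idea with those functions, assuming the same sample data as above. The names `id`, `rel_year`, `pay_year`, `amount` and `res` are only illustrative.

```r
library(dplyr)
library(tidyr)

df <- read.table(text = "
Year Tax1 Tax2 Tax3 Tax4
2004 12 123 145 104
2004 145 99 90 56
2005 212 300 240 123
", header = TRUE)

res <- df %>%
  mutate(id = row_number()) %>%                 # Year repeats, so add a row id
  pivot_longer(starts_with("Tax"),
               names_to = "rel_year",
               names_prefix = "Tax",            # "Tax3" -> "3"
               values_to = "amount") %>%
  mutate(rel_year = as.integer(rel_year),
         pay_year = paste0("Tax", Year + rel_year - 1L)) %>%
  select(-rel_year) %>%
  pivot_wider(names_from = pay_year, values_from = amount) %>%
  select(-id)                                   # drop the helper id again

res
```

Combinations that never occur (for example Tax2008 for a 2004 row) are filled with NA by `pivot_wider()`.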
Answer 1 (score: 1)
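library(dplyr)
library(tidyr)
# dat1 is the data frame from the question
dat1 %>%
  gather(key, value, -Year) %>%
  group_by(key) %>%
  mutate(col = 1:n()) %>%     # row id within each Tax# column
  ungroup() %>%
  # index into Tax2004..Tax2008; the offset of 1 for Year == 2005 is
  # hard-coded for this sample data
  mutate(key = paste0("Tax", 2004:2008)[(Year == 2005) +
                                        as.numeric(sub("\\D+", "", key))]) %>%
  spread(key, value)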
# A tibble: 3 x 7
Year col Tax2004 Tax2005 Tax2006 Tax2007 Tax2008
<int> <int> <int> <int> <int> <int> <int>
1 2004 1 12 123 145 104 NA
2 2004 2 145 99 90 56 NA
3 2005 3 NA 212 300 240 123
Answer 2 (score: 1)
Here is an option using data.table:
library(data.table)
library(readr)
# as.character() because melt() returns `variable` as a factor
dcast(
  melt(setDT(df, keep.rownames = TRUE), id.vars = c("rn", "Year"))[,
    newYear := paste0("Tax", Year + parse_number(as.character(variable)) - 1)],
  rn + Year ~ newYear, value.var = "value")[, rn := NULL][]
# Year Tax2004 Tax2005 Tax2006 Tax2007 Tax2008
#1: 2004 12 123 145 104 NA
#2: 2004 145 99 90 56 NA
#3: 2005 NA 212 300 240 123
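For comparison, the same shift can probably also be done in base R without any packages. This is not taken from the answers above, just a minimal sketch: it assumes the Tax# columns appear in order (Tax1, Tax2, ...) and writes each value into an NA-filled matrix at the column offset given by its Year. The names `tax`, `first`, `years`, `target` and `res_base` are made up for the example.

```r
df <- read.table(text = "
Year Tax1 Tax2 Tax3 Tax4
2004 12 123 145 104
2004 145 99 90 56
2005 212 300 240 123
", header = TRUE)

tax   <- as.matrix(df[grep("^Tax", names(df))])   # rows x Tax1..Tax4
first <- min(df$Year)
years <- first:(max(df$Year) + ncol(tax) - 1)     # all calendar years needed

# empty result, one column per calendar year
out <- matrix(NA_integer_, nrow(tax), length(years),
              dimnames = list(NULL, paste0("Tax", years)))

# target column of each cell: offset of the row's Year plus the Tax# index
target <- df$Year - first + col(tax)
out[cbind(as.vector(row(tax)), as.vector(target))] <- as.vector(tax)

res_base <- cbind(df["Year"], out)
res_base
```

Indexing a matrix with a two-column (row, column) matrix assigns all cells in one step, so no split/merge by Year is needed.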