I have this data frame:
df <- data.frame(group=c("A", "A", "B", "B"), year=c(1980, 1986, 1990, 1992))
group year
1 A 1980
2 A 1986
3 B 1990
4 B 1992
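I want to modify it as follows: replace each row with two rows for the two preceding years (year - 2 and year - 1), and add a column pre that labels the original year.
This would be the result: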
group year pre
1 A 1978 pre1980
2 A 1979 pre1980
3 A 1984 pre1986
4 A 1985 pre1986
5 B 1988 pre1990
6 B 1989 pre1990
7 B 1990 pre1992
8 B 1991 pre1992
Adding the new column is easy:
df$pre <- paste("pre", df$year, sep="")
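But I'm stuck on how to add the new rows with the corresponding years (creating an entirely new data frame would of course be just as good). Any hints?
Answer 0: (score: 6)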
base R ftw:
data.frame(group = rep(df$group, each = 2),
           year  = df[rep(1:nrow(df), each = 2), ]$year - 2:1,
           pre   = paste0("pre", rep(df$year, each = 2)))
# group year pre
# 1 A 1978 pre1980
# 2 A 1979 pre1980
# 3 A 1984 pre1986
# 4 A 1985 pre1986
# 5 B 1988 pre1990
# 6 B 1989 pre1990
# 7 B 1990 pre1992
# 8 B 1991 pre1992
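The expression above works because 2:1 is recycled across the repeated rows: each row is duplicated, then 2 is subtracted from the first copy and 1 from the second. A minimal sketch of the same idea, generalized to n preceding years, might look like the following. The helper name expand_pre and its n argument are hypothetical and not part of the original answer.

```r
# Hypothetical helper (not from the original answer): expand every row of df
# into the n years that precede its year.
expand_pre <- function(df, n = 2) {
  idx <- rep(seq_len(nrow(df)), each = n)  # repeat each row index n times
  data.frame(
    group = df$group[idx],
    year  = df$year[idx] - n:1,            # n:1 is recycled over each block of n rows
    pre   = paste0("pre", df$year[idx]),   # label built from the original year
    stringsAsFactors = FALSE
  )
}

df <- data.frame(group = c("A", "A", "B", "B"),
                 year  = c(1980, 1986, 1990, 1992))
res <- expand_pre(df, n = 2)
print(res)
```

Calling expand_pre(df, n = 3) would give three rows per original row (year - 3, year - 2, year - 1). The sketch assumes df has at least one row.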
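Answer 1: (score: 5)
Here is one way using the data.table package. Given the data, I chose year as the grouping variable. For each year, I compute the two preceding years and build the pre label. This leaves two year columns (the grouping column and the new one), so I drop the first one at the end.
library(data.table)
setDT(df)[, list(group = group,
                 year = c((year - 2), (year - 1)),
                 pre = paste0("pre", year, collapse = "")), by = "year"][, -1, with = FALSE][]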
# group year pre
#1: A 1978 pre1980
#2: A 1979 pre1980
#3: A 1984 pre1986
#4: A 1985 pre1986
#5: B 1988 pre1990
#6: B 1989 pre1990
#7: B 1990 pre1992
#8: B 1991 pre1992
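If the same year appears more than once, you could do the following, grouping by row instead of by year. In this new data frame, 1992 appears twice.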
df <- data.frame(group=c("A", "A", "B", "B"), year=c(1980, 1986, 1992, 1992))
setDT(df)[, list(group = group,
year = c((year - 2), (year - 1)),
pre = paste0("pre", year, collapse = "")), by = rownames(df)][, -1, with = FALSE]
# group year pre
#1: A 1978 pre1980
#2: A 1979 pre1980
#3: A 1984 pre1986
#4: A 1985 pre1986
#5: B 1990 pre1992
#6: B 1991 pre1992
#7: B 1990 pre1992
#8: B 1991 pre1992
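Answer 2: (score: 4)
Here is an option with Map:
# unname() drops the names that Map() takes from a character first argument,
# so rbind() yields plain 1..n row names
do.call(rbind, unname(Map(function(x, y, z)
  data.frame(group = x, year = y:z, pre = paste0('pre', z + 1)),
  df$group, df$year - 2, df$year - 1)))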
# group year pre
#1 A 1978 pre1980
#2 A 1979 pre1980
#3 A 1984 pre1986
#4 A 1985 pre1986
#5 B 1988 pre1990
#6 B 1989 pre1990
#7 B 1990 pre1992
#8 B 1991 pre1992
Or using rep:
`row.names<-`(transform(df[rep(1:nrow(df),each=2),],
year = year-2:1, pre = paste0('pre', year) ), NULL)
# group year pre
#1 A 1978 pre1980
#2 A 1979 pre1980
#3 A 1984 pre1986
#4 A 1985 pre1986
#5 B 1988 pre1990
#6 B 1989 pre1990
#7 B 1990 pre1992
#8 B 1991 pre1992
Answer 3: (score: 1)
If you don't care about the final row order and want no extra libraries, you can use:
gap = function(df, y) transform(df, year=year-y, pre = sprintf("pre%d", year))
rbind(gap(df,2), gap(df,1))
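If the row order does matter, one way to recover it is to remember which original row each output row came from and sort on that. This is only a sketch: the idx and out names are illustrative and not part of the answer, and it relies on order() leaving ties in their original order.

```r
df <- data.frame(group = c("A", "A", "B", "B"),
                 year  = c(1980, 1986, 1990, 1992))

# gap() as in the answer: shift year back by y and label with the original year
gap <- function(df, y) transform(df, year = year - y, pre = sprintf("pre%d", year))

out <- rbind(gap(df, 2), gap(df, 1))

# Original row index of each output row: rows 1..n from gap(df, 2),
# then rows 1..n again from gap(df, 1)
idx <- rep(seq_len(nrow(df)), times = 2)

# Ties are kept in their original order, so for each source row the
# year - 2 row comes before the year - 1 row
out <- out[order(idx), ]
rownames(out) <- NULL
print(out)
```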
Answer 4: (score: 1)
Here is a simple solution without packages:
Your data frame:
df <- data.frame(group=c("A", "A", "B", "B"), year=c(1980, 1986, 1990, 1992))
group year
1 A 1980
2 A 1986
3 B 1990
4 B 1992
Subtract two years and add the pre column:
df1<-cbind(group=as.character(df$group),year=df$year-2, pre=paste("pre",df$year,sep=""))
group year pre
[1,] "A" "1978" "pre1980"
[2,] "A" "1984" "pre1986"
[3,] "B" "1988" "pre1990"
[4,] "B" "1990" "pre1992"
Next, subtract one year and add the pre column:
df2<-cbind(group=as.character(df$group),year=df$year-1,pre=paste("pre",df$year,sep=""))
group year pre
[1,] "A" "1979" "pre1980"
[2,] "A" "1985" "pre1986"
[3,] "B" "1989" "pre1990"
[4,] "B" "1991" "pre1992"
Now rbind the two together:
ndf<-data.frame(rbind(df1,df2))
group year pre
1 A 1978 pre1980
2 A 1984 pre1986
3 B 1988 pre1990
4 B 1990 pre1992
5 A 1979 pre1980
6 A 1985 pre1986
7 B 1989 pre1990
8 B 1991 pre1992
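Sort it by year (converted back to numeric, because cbind produced a character matrix). This is your output.
Lastdf <- ndf[order(as.numeric(as.character(ndf$year))),]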
group year pre
1 A 1978 pre1980
5 A 1979 pre1980
2 A 1984 pre1986
6 A 1985 pre1986
3 B 1988 pre1990
7 B 1989 pre1990
4 B 1990 pre1992
8 B 1991 pre1992
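One caveat with the steps above: cbind on mixed character and numeric vectors returns a character matrix, so year ends up stored as text (visible in the quoted matrix output). A possible variant, sketched here as an assumption and not as the answer's own code, builds the two pieces with data.frame so that year stays numeric and sorts numerically.

```r
df <- data.frame(group = c("A", "A", "B", "B"),
                 year  = c(1980, 1986, 1990, 1992))

# cbind() on character and numeric inputs gives a character matrix
m <- cbind(group = "A", year = 1978)
is.character(m)

# Build each piece as a data frame so the column types are preserved
df1 <- data.frame(group = df$group, year = df$year - 2, pre = paste0("pre", df$year))
df2 <- data.frame(group = df$group, year = df$year - 1, pre = paste0("pre", df$year))

ndf <- rbind(df1, df2)
Lastdf <- ndf[order(ndf$year), ]  # numeric sort, no conversion needed
is.numeric(Lastdf$year)
print(Lastdf)
```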