I have a data frame like the one below.
Name Amount Subscriptionperiod(Months) Subscriptionstart(Month)
Tom 300 3 0
Tom 100 3 2
Jim 500 5 0
Jim 600 3 1
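I want to arrange the data as shown below. For example, Tom paid $300 for 3 months in one transaction. In a second transaction, 2 months later, he paid an additional $100 for another 3 months.
The same applies to Jim.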
Name M0 M1 M2 M3 M4 M5 M6
Tom 300 300 300 0 0 0 0
Tom 0 0 100 100 100 0 0
Jim 500 500 500 500 500 0 0
Jim 0 600 600 600 0 0 0
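I can't work out the transformation. I used the code below to do the first part. But how do I create a row like Tom's second one, where the values start at M2? The $100 starts at M2 and runs through M4.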
for(i in 0:6) df <- within(df,assign(paste0("M",i),ifelse((Subscriptionperiod>i),amount,0)))
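The code above gives the following output, which is not what I want, because every row starts at M0. Any help would be appreciated.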
Name M0 M1 M2 M3 M4 M5 M6
Tom 300 300 300 0 0 0 0
Tom 100 100 100 0 0 0 0
Jim 500 500 500 500 500 0 0
Jim 600 600 600 0 0 0 0
Answer 0 (score: 2)
text1="Name Amount Subscriptionperiod(Months) Subscriptionstart(Month)
Tom 300 3 0
Tom 100 3 2
Jim 500 5 0
Jim 600 3 1"
df1 <- read.table(text = text1, header = TRUE, as.is = TRUE)
# Repeat each row once per subscribed month and label every copy with its month
df2.lst <- apply(df1, 1, function(x){
  times <- as.numeric(x[3])
  lst <- lapply(seq_len(times), function(i){return(x[1:2])})
  df.lst <- as.data.frame(do.call(rbind, lst))
  df.lst$mon <- paste0("M", seq(from = as.numeric(x[4]),
                                length.out = times))
  return(df.lst)
})
df2 <- do.call(rbind, df2.lst)
library(reshape2)
# Cast the long data to wide format, one column per month
df2$mon <- factor(df2$mon, levels = paste0("M", 0:6))
df3 <- dcast(df2, Name+Amount~mon, value.var = "Amount")
df3$Amount <- NULL
df3
# Name M0 M1 M2 M3 M4
# 1 Tom 300 300 300 <NA> <NA>
# 2 Tom <NA> <NA> 100 100 100
# 3 Jim 500 500 500 500 500
# 4 Jim <NA> 600 600 600 <NA>
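Note that the output above leaves months without a payment as NA and omits M5 and M6, whereas the question asks for zeros across M0 to M6. Below is a minimal base R sketch of the same long-then-wide idea that yields zeros. It assumes shortened column names Period and Start (not in the original data), and the helper column row is introduced here only to keep each person's two transactions apart.

```r
# Same data as above; Period and Start are shortened, assumed column names
df1 <- data.frame(Name   = c("Tom", "Tom", "Jim", "Jim"),
                  Amount = c(300, 100, 500, 600),
                  Period = c(3, 3, 5, 3),
                  Start  = c(0, 2, 0, 1))

# Long format: one line per transaction and subscribed month
idx  <- rep(seq_len(nrow(df1)), df1$Period)
long <- data.frame(row    = idx,
                   Amount = df1$Amount[idx],
                   mon    = df1$Start[idx] + sequence(df1$Period) - 1)
long$mon <- factor(paste0("M", long$mon), levels = paste0("M", 0:6))

# xtabs() sums Amount per (row, month) cell; cells without data become 0
tab  <- xtabs(Amount ~ row + mon, data = long)
wide <- cbind(Name = df1$Name, as.data.frame.matrix(tab))
wide
```

Because mon is a factor with all seven levels, the unused months M5 and M6 are kept as zero columns instead of being dropped.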
Answer 1 (score: 2)
First, let's start with your minimal data frame:
df1 <- data.frame(name=c("Tom", "Tom", "Jim", "Jim"), amount=c(300, 100, 500, 600),
Subperiod=c(3, 3, 5, 3), SubStart = c(0, 2, 0, 1))
> df1
name amount Subperiod SubStart
1 Tom 300 3 0
2 Tom 100 3 2
3 Jim 500 5 0
4 Jim 600 3 1
Next, instantiate an empty matrix whose ncol equals the number of months you want to display:
m <- matrix(0, nrow=4, ncol=7)
Now the clever part: write a function that builds one long vector and uses it to fill the matrix according to your rules.
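special_spread <- function(df1){
  bigrow <- c()
  for(i in 1:nrow(df1)){
    pt1 <- rep(0, df1$SubStart[i])                       # zeros before the subscription starts
    pt2 <- rep(df1$amount[i], df1$Subperiod[i])          # amount for each subscribed month
    pt3 <- rep(0, ncol(m) - (length(pt2)+length(pt1)) )  # zeros for the remaining months
    bigrow <- c(bigrow, pt1, pt2, pt3)
  }
  # Reshape the long vector row by row, using the dimensions of m
  m1 <- as.data.frame(matrix(bigrow, nrow=nrow(m), ncol=ncol(m), byrow = TRUE))
  m1 <- cbind(df1$name, m1)
  colnames(m1) <- c("name", paste0("M", 0:(ncol(m) - 1)))
  return(m1)
}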
> special_spread(df1)
name M0 M1 M2 M3 M4 M5 M6
1 Tom 300 300 300 0 0 0 0
2 Tom 0 0 100 100 100 0 0
3 Jim 500 500 500 500 500 0 0
4 Jim 0 600 600 600 0 0 0
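As a variation on the function above, the matrix can also be filled in place instead of growing one long vector. This is only a sketch: spread_by_index and its n_months argument are hypothetical names, and it assumes that a subscription running past the last displayed month should simply be cut off there.

```r
df1 <- data.frame(name = c("Tom", "Tom", "Jim", "Jim"), amount = c(300, 100, 500, 600),
                  Subperiod = c(3, 3, 5, 3), SubStart = c(0, 2, 0, 1))

# Hypothetical helper: write each amount straight into its month columns
spread_by_index <- function(df, n_months = 7) {
  m <- matrix(0, nrow = nrow(df), ncol = n_months,
              dimnames = list(NULL, paste0("M", seq_len(n_months) - 1)))
  for (i in seq_len(nrow(df))) {
    first <- df$SubStart[i] + 1                               # month 0 is column 1
    last  <- min(df$SubStart[i] + df$Subperiod[i], n_months)  # cut off at the last column
    if (first <= last) m[i, first:last] <- df$amount[i]
  }
  data.frame(name = df$name, m)
}

res <- spread_by_index(df1)
res
```

Passing the number of months as an argument avoids relying on a matrix defined outside the function.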
Let me know if this needs more explanation, or whether it more or less answers your question.
Answer 2 (score: 0)
I used the data.table and plyr packages to do this.
library(data.table)
library(plyr)
df <- data.table(read.table(text='Name Amount Period Start
Tom 300 3 0
Tom 100 3 2
Jim 500 5 0
Jim 600 3 1', header = TRUE, row.names = NULL))
#Build one output row: zeros before Start, Amount repeated Period times,
#then zeros up to a total of y months
create_rows <- function(x, y){
  c(rep(0, x$Start), rep(x$Amount, x$Period), rep(0, y - x$Period - x$Start))
}
#Create a new data.table and add Name column
df2 <- data.table(Name = df$Name)
#Create an array of month names
months <- paste('M', 0:6, sep = '')
#Use adply (from plyr) to apply create_rows() across all rows of df
#.expand = FALSE ensures the size of the returned data.frame is the right size
#.id = NULL stops adply() from creating an index column
#Wrapping months in parentheses makes := use its values as the column names
#(older data.table versions used with = FALSE for this)
df2[, (months) := adply(df,
                        1, create_rows,
                        length(months),
                        .expand = FALSE,
                        .id = NULL)]
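The same padding rule (zeros before Start, Amount for Period months, zeros afterwards) can also be expressed without a per-row function or extra packages. Below is a minimal base R sketch, assuming the Amount, Period and Start columns used in this answer; month_ids, active, wide and result are illustrative names only.

```r
df <- data.frame(Name   = c("Tom", "Tom", "Jim", "Jim"),
                 Amount = c(300, 100, 500, 600),
                 Period = c(3, 3, 5, 3),
                 Start  = c(0, 2, 0, 1))
month_ids <- 0:6

# active[i, j] is TRUE when month j falls inside subscription i
active <- outer(df$Start, month_ids, "<=") &
          outer(df$Start + df$Period, month_ids, ">")

# Multiplying recycles Amount down the rows, so row i is scaled by Amount[i]
wide <- active * df$Amount
colnames(wide) <- paste0("M", month_ids)
result <- data.frame(Name = df$Name, wide)
result
```

This is the check missing from the loop in the question: a month counts only if it is at or after the start month and before start plus period.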