Assign values to specific columns and rows in R

Time: 2015-12-30 03:19:51

Tags: r

I have a data frame as follows.

Name Amount Subscriptionperiod(Months)  Subscriptionstart (Month)
Tom  300    3                             0
Tom  100    3                             2
Jim  500    5                             0
Jim  600    3                             1

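I want to arrange the data as follows. For example, Tom pays $300 for 3 months in one transaction. In a second transaction, 2 months later, he pays an additional $100 for another 3 months.

The same applies to Jim.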

Name   M0   M1  M2   M3  M4  M5  M6
Tom    300 300  300  0   0   0   0
Tom    0    0   100  100 100 0   0
Jim    500 500  500  500 500 0   0
Jim    0   600  600  600 0   0   0

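I am unable to do this reshaping. I used the code below to get the first part. But how do I create the second row, as in Tom's case, where the values start from M2? The $100 starts at M2 and continues through M4.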

for(i in 0:6) df <- within(df,assign(paste0("M",i),ifelse((Subscriptionperiod>i),Amount,0)))

The code above gives the following output, which is not what I want. Any help would be appreciated.

     Name   M0   M1  M2   M3  M4  M5  M6
    Tom    300 300  300  0   0   0   0
    Tom    100  100 100  0   0   0   0
    Jim    500 500  500 500 500  0   0 
    Jim    600  600  600 0   0   0   0

3 Answers:

Answer 0: (score: 2)

text1="Name Amount Subscriptionperiod(Months)  Subscriptionstart(Month)
Tom  300    3                             0
Tom  100    3                             2
Jim  500    5                             0
Jim  600    3                             1"

df1 <- read.table(text=text1, header=TRUE, as.is=TRUE)

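# apply() passes each row as a character vector, hence the as.numeric() calls
df2.lst <- apply(df1, 1, function(x){
  times <- as.numeric(x[3])
  # Repeat Name and Amount once per subscribed month
  lst <- lapply(seq_len(times), function(i){return(x[1:2])})
  df.lst <- as.data.frame(do.call(rbind, lst))
  # Label each repeat with its month, starting at the subscription start
  df.lst$mon <- paste0("M", seq(from=as.numeric(x[4]), 
                            length.out = times))
  return(df.lst)
})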

df2 <- do.call(rbind, df2.lst)

library(reshape2)
df2$mon <- factor(df2$mon, levels = paste0("M", 0:6))
df3 <- dcast(df2, Name+Amount~mon, value.var = "Amount")
df3$Amount <- NULL
df3
#   Name   M0   M1  M2   M3   M4
# 1  Tom  300  300 300 <NA> <NA>
# 2  Tom <NA> <NA> 100  100  100
# 3  Jim  500  500 500  500  500
# 4  Jim <NA>  600 600  600 <NA>
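
The result above holds text values, with NA where the question asked for 0. As a minimal base-R sketch of one way to clean that up, the block below rebuilds a table of the same shape by hand (the name `wide` and the helper variable `month_cols` are illustrative, not from the answer), converts the month columns to numeric, and replaces the missing entries with 0.

```r
# Hypothetical stand-in for df3: character month columns with NA gaps
wide <- data.frame(
  Name = c("Tom", "Tom", "Jim", "Jim"),
  M0 = c("300", NA, "500", NA),
  M1 = c("300", NA, "500", "600"),
  M2 = c("300", "100", "500", "600"),
  M3 = c(NA, "100", "500", "600"),
  M4 = c(NA, "100", "500", NA),
  stringsAsFactors = FALSE
)

# Every column except Name is a month column
month_cols <- setdiff(names(wide), "Name")

# Convert each month column to numeric, then replace missing values with 0
wide[month_cols] <- lapply(wide[month_cols], function(col) {
  col <- as.numeric(col)
  col[is.na(col)] <- 0
  col
})
wide
```

The same two steps should apply to df3 itself, assuming its month columns are character or factor (a factor would need as.character() before as.numeric()).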

Answer 1: (score: 2)

First, let's start from your minimal data frame:

df1 <- data.frame(name=c("Tom", "Tom", "Jim", "Jim"), amount=c(300, 100, 500, 600), 
                  Subperiod=c(3, 3, 5, 3), SubStart = c(0, 2, 0, 1))

> df1
  name amount Subperiod SubStart
1  Tom    300         3        0
2  Tom    100         3        2
3  Jim    500         5        0
4  Jim    600         3        1

Next, instantiate an empty matrix with ncol equal to the number of months you want to display:

m <- matrix(0, nrow=4, ncol=7)

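Now the clever part: create a function that builds one big vector, which then fills the matrix according to your rules.

special_spread <- function(df1){
    bigrow <- c()
    for(i in 1:nrow(df1)){
      # Leading zeros before the subscription starts
      pt1 <- rep(0, df1$SubStart[i])
      # The amount, repeated for each subscribed month
      pt2 <- rep(df1$amount[i], df1$Subperiod[i])
      # Trailing zeros to pad the row out to ncol(m) months
      pt3 <- rep(0, ncol(m) - (length(pt2)+length(pt1)) )
      bigrow <- c(bigrow, pt1, pt2, pt3)
    }
    # Sizes come from the inputs rather than being hard-coded as 4 and 7
    m1 <- as.data.frame(matrix(bigrow, nrow=nrow(df1), ncol=ncol(m), byrow = TRUE))
    m1 <- cbind(df1$name, m1)
    colnames(m1) <- c("name", paste0("M", seq_len(ncol(m)) - 1))
    return(m1)
}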

> special_spread(df1)
  name  M0  M1  M2  M3  M4 M5 M6
1  Tom 300 300 300   0   0  0  0
2  Tom   0   0 100 100 100  0  0
3  Jim 500 500 500 500 500  0  0
4  Jim   0 600 600 600   0  0  0

Please let me know if this needs more explanation, or whether it more or less answers your question.
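
The same rule (zero outside the window from SubStart to SubStart + Subperiod - 1, the amount inside it) can also be written without a loop. The following is only a sketch of that idea using base R's outer(); the names `months`, `active`, `spread` and `result` are illustrative and not part of the answer above.

```r
df1 <- data.frame(name = c("Tom", "Tom", "Jim", "Jim"),
                  amount = c(300, 100, 500, 600),
                  Subperiod = c(3, 3, 5, 3),
                  SubStart = c(0, 2, 0, 1))

months <- 0:6  # month indices M0..M6

# active[i, j] is TRUE when month j falls inside row i's subscription window
active <- outer(df1$SubStart, months, "<=") &
          outer(df1$SubStart + df1$Subperiod, months, ">")

# The amount vector is recycled down the columns, so row i is scaled by amount[i]
spread <- active * df1$amount
colnames(spread) <- paste0("M", months)

result <- cbind(df1["name"], spread)
result
```

Because the window is tested per month, a subscription that runs past the last displayed month is simply cut off at M6 rather than raising an error.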

Answer 2: (score: 0)

I used the data.table and plyr packages to achieve this.

library(data.table)
library(plyr)

df <- data.table(read.table(text='Name Amount Period  Start
                                   Tom  300    3      0
                                   Tom  100    3      2
                                   Jim  500    5      0
                                   Jim  600    3      1', header=T, row.names = NULL))
#Create a row by repeating df$Amount, df$Period times and padding with 0
create_rows <- function(x, y){
  c(rep(0, x$Start), rep(x$Amount, x$Period), rep(0, y - x$Period - x$Start))
}

#Create a new data.table and add Name column
df2 <- data.table(Name = df$Name)

#Create an array of month names
months <- paste('M', 0:6, sep = '')

#Use adply (from plyr) to apply create_rows() across all rows of df
#.expand = FALSE ensures the size of the returned data.frame is the right size
#.id = NULL stops adply() from creating an index column
#Wrapping months in parentheses makes := treat it as a vector of column names;
#this replaces the deprecated with = FALSE form
df2[, (months) := adply(df, 
                        1, create_rows,
                        length(months), 
                        .expand = FALSE, 
                        .id =NULL)]
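
Note that create_rows() assumes every subscription fits inside the displayed window: if Start + Period exceeds the number of months, rep() is asked for a negative count and fails. Below is a rough base-R sketch of the same padding idea with that assumption checked explicitly. It uses mapply() instead of plyr, and the names `subs`, `n_months`, `pad_row`, `wide` and `out` are illustrative only.

```r
subs <- data.frame(Name = c("Tom", "Tom", "Jim", "Jim"),
                   Amount = c(300, 100, 500, 600),
                   Period = c(3, 3, 5, 3),
                   Start = c(0, 2, 0, 1))
n_months <- 7  # M0..M6

# Build one padded row; stop early if the subscription runs past the last month
pad_row <- function(amount, period, start, n) {
  stopifnot(start + period <= n)
  c(rep(0, start), rep(amount, period), rep(0, n - period - start))
}

# mapply returns one column per input row, so transpose to get one row each
wide <- t(mapply(pad_row, subs$Amount, subs$Period, subs$Start,
                 MoreArgs = list(n = n_months)))
colnames(wide) <- paste0("M", seq_len(n_months) - 1)

out <- data.frame(Name = subs$Name, wide)
out
```

Whether to stop, as here, or to truncate an overlong subscription at the last month is a design choice that depends on how the table will be used.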