Question

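The dataset I'm working with is billing data by customer and month. Ultimately, I want to produce a data frame with customer IDs as row names and months as column names, just like the original dataset. However, I want this new dataset to contain dummy variables indicating whether a customer was "gained" that month, i.e. they had never been billed before and that month was the first time they were charged.

Here is a reproducible example, along with the loop I've written so far: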

set.seed(24)
example.data <- data.frame(
   ID = sample(11:20),
   Jan = sample(0:5, 10, replace = TRUE),
   Feb = sample(0:5, 10, replace = TRUE),
   Mar = sample(0:5, 10, replace = TRUE),
   Apr = sample(0:5, 10, replace = TRUE)
)
gained.df.ex <- data.frame(example.data$ID)

## customers can't be gained in the first month
## there's no previous data to verify that this is the first time they've been billed, so all values are 0

gained.df.ex$Jan <- rep(0, length(example.data$ID))

## here's the loop that isn't working

for(i in 3:5){
   new.month.dummy <- for (x in 1:length(gained.df.ex$example.data.ID)){
      ifelse(example.data[x,i] == 0, new.month.dummy[x] <- 0, ifelse(sum(example.data[x,2:(i-1)]) == 0, new.month.dummy[x] <- 1, new.month.dummy <- 0))
   }
}

I'm sure this can be done with apply, but I'm not sure how.

The expected output is as follows:

> example.data
   Jan Feb Mar Apr
15   0   3   4   3
19   1   3   0   5
20   4   2   5   1
12   2   1   3   0
14   0   0   2   1
17   5   5   4   4
11   3   4   1   5
18   1   0   0   2
13   3   2   5   3
16   2   5   1   2

> gained.df.ex
   Jan Feb Mar Apr
15   0   1   0   0
19   0   0   0   0
20   0   0   0   0
12   0   0   0   0
14   0   0   1   0
17   0   0   0   0
11   0   0   0   0
18   0   0   0   0
13   0   0   0   0
16   0   0   0   0

Answer 1

We can try
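
One possible way to do this is sketched below. It is an illustration under assumptions, not necessarily the code originally posted in this answer. It assumes billing amounts are never negative, so a zero running total means the customer has not been billed yet. The data is typed in from the expected output in the question rather than generated with set.seed, because sample() can return different values across R versions. The names months, cum.billed, prev.billed and gained are introduced here for illustration only.

```r
# Example data, copied from the table printed in the question
example.data <- data.frame(
  ID  = c(15, 19, 20, 12, 14, 17, 11, 18, 13, 16),
  Jan = c(0, 1, 4, 2, 0, 5, 3, 1, 3, 2),
  Feb = c(3, 3, 2, 1, 0, 5, 4, 0, 2, 5),
  Mar = c(4, 0, 5, 3, 2, 4, 1, 0, 5, 1),
  Apr = c(3, 5, 1, 0, 1, 4, 5, 2, 3, 2)
)

# Month columns as a numeric matrix, with customer IDs as row names
months <- as.matrix(example.data[, -1])
rownames(months) <- example.data$ID

# Running total per customer; apply() over rows returns the result
# with months in rows, so transpose back to customers x months
cum.billed <- t(apply(months, 1, cumsum))

# Total billed strictly before each month
prev.billed <- cum.billed - months

# Gained = billed this month and never billed before (assumes amounts >= 0)
gained <- (months != 0 & prev.billed == 0) * 1

# The first month has no history, so it is never counted as a gain
gained[, 1] <- 0

gained.df.ex <- as.data.frame(gained)
gained.df.ex
```

Working on the whole matrix avoids the nested loop over rows and columns. The only per-row step is the cumsum inside apply.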
