Question

Consider the following code:

vectorize.me = function(history, row.idx=1, row.val=0, max=100){
  while (row.idx < max & row.val < max) {
    row.idx <- row.idx + 1
    entry <- paste('row.idx: ', row.idx, ' row.val: ', row.val)
    history[row.idx] <- entry
    print(entry)
  }
  return(history)
}

max <- 100
history <- vectorize.me(vector('list', max), max=max)

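I would like to do the following:

Instead of passing the row.idx and row.val arguments, pass a data frame to the vectorize.me function and have the function operate on the index and value of each row of the data frame.
Remove the while loop and simply stop iterating once the condition is no longer met.
Return the history list when iteration is finished.

How can I do this?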

df <- data.frame(sample(0:100,1000,rep=TRUE))
history <- vectorize.me(df, vector('list', max), max=max)

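Edit: this is a completely contrived example. I made it up because I wanted some sample code that passes values on to the next "iteration" in vectorized code (i.e. apply, lapply, mapply, etc.).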

Answer 1

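You can apply cumprod to a series of zeros and ones to get a series that drops to 0 at the first zero in the original series and stays at 0 from then on. This can be used to limit both the length of history and the items to print.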
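To see the effect on a small case, here is a minimal illustration. The vector `x` and the threshold 5 are made up for this example and are not part of the question's data.

```r
# Hypothetical values; the threshold 5 is chosen only for illustration
x <- c(1, 3, 7, 2, 4)

keep <- x < 5          # TRUE TRUE FALSE TRUE TRUE
mask <- cumprod(keep)  # 1 1 0 0 0: stays at 0 after the first FALSE

# Only the elements before the first failure survive, even though
# 2 and 4 would pass the test on their own
head_run <- x[mask > 0]  # 1 3
```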

Not as a function, just as plain code:

df <- data.frame(ids=seq(1,1000),val=sample(0:100,1000,rep=TRUE))
valmax<-80
pyn<-cumprod(df$val<valmax)
history<-paste("row.idx",df$ids[pyn>0],"row.val",df$val[pyn>0])
print(history)

You will probably need to add some checks and conditions to turn this into good code, but in principle this solves the problem.
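One way the same idea might be wrapped into a function is sketched below. The signature is an assumption modeled on the call the question asks for, `vectorize.me(df, history, max=max)`; the `col` parameter, the NA handling, and the cap at the length of `history` are additions for illustration, not part of the answer.

```r
# Sketch only: signature and the 'col' parameter are assumptions
vectorize.me <- function(df, history, max = 100, col = "val") {
  stopifnot(is.data.frame(df), col %in% names(df), is.list(history))
  vals <- df[[col]]

  # 1 while every value so far is below max, 0 from the first failure onward
  # (an NA is treated as a failure here, which is a choice made for this sketch)
  pyn <- cumprod(!is.na(vals) & vals < max)

  # Row positions before the first failure, capped at the length of history
  idx <- head(which(pyn > 0), length(history))

  if (length(idx) > 0) {
    history[idx] <- paste("row.idx", idx, "row.val", vals[idx])
  }
  history
}

# Small made-up data frame: the third value (90) stops the run
df <- data.frame(val = c(10, 20, 90, 5))
history <- vectorize.me(df, vector("list", 4), max = 80)
```

Here only the first two slots of `history` are filled; the remaining slots stay NULL because the run ends at the first value that is not below `max`.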

Answer 2

How about the following:

vectorize.me <- function(df, var, history, max=100) {
  #-- Compute the max index in df to process (this is the "stopping condition" of the "loop")
  # Find the index of the first value in df[,var] that is larger than 'max'
  # (note the fictitious FALSE and TRUE values added around the condition on df[,var]
  # in order to handle the boundary cases in one go)
  indmax <- min( which( c(FALSE, !(df[,var] <= max), TRUE) ) ) - 2

  if (indmax > 0) { # There is at least one index to process
    # Limit indmax to the length of 'history'
    indmax <- min(indmax, length(history))
    ind <- 1:indmax
    entries <- paste('idx:', ind, 'val:', df[ind,var])
    history[ind] <- entries
    print(entries)
  }

  return(history)
}

#-- Test
# Test data
df <- data.frame(x=c(5, 8, 9, 8, 10, 4, 1, 3))

# Run tests
history <- vector('list', 8)
history <- vectorize.me(df, "x", history, max=8)   # first value exceeding 'max' is found in a middle row
history <- vectorize.me(df, "x", history, max=4)   # first value in data frame is larger than 'max'
history <- vectorize.me(df, "x", history, max=max(df[,"x"]))      # all values in data frame are <= 'max'
history <- vectorize.me(df, "x", history, max=max(df[,"x"]) + 1)  # 'max' is larger than the maximum value in df[,var]
history <- vector('list', 6)
history <- vectorize.me(df, "x", history, max=max(df[,"x"]))      # 'history' is shorter than the maximum index of df to process
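The index computation can be traced step by step on the test data with max=8. This is only a walk-through of the expression used inside the function; the intermediate names `exceeds`, `padded`, and `first` are introduced here for illustration.

```r
# Same values as the test data frame, with max = 8
x <- c(5, 8, 9, 8, 10, 4, 1, 3)

exceeds <- !(x <= 8)                # TRUE where a value is larger than 8
padded  <- c(FALSE, exceeds, TRUE)  # the trailing TRUE guarantees at least one hit

first  <- min(which(padded))        # 4: position in 'padded' of the first hit
indmax <- first - 2                 # 2: last row of x to process

# When nothing exceeds the limit, the trailing TRUE is the first hit,
# so indmax equals the number of rows
indmax.all <- min(which(c(FALSE, !(x <= 10), TRUE))) - 2  # 8
```

Subtracting 2 accounts for the leading FALSE (one position) and for stepping back from the first offending row to the last acceptable one.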

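Notes:

The var parameter specifies the name of the data frame column to which the max condition is applied.
The validity of the input arguments is not checked.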
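If validation is wanted, a minimal sketch of the kind of checks that could run before the computation is shown below. The helper name `check.args` is hypothetical and the set of checks is an assumption about what is worth guarding against. The NA check is included because `which()` ignores NA, so a missing value would otherwise be treated as if it satisfied the condition.

```r
# Hypothetical helper, not part of the answer above
check.args <- function(df, var, history, max) {
  stopifnot(
    is.data.frame(df),
    is.character(var), length(var) == 1, var %in% names(df),
    is.numeric(df[[var]]), !anyNA(df[[var]]),
    is.list(history),
    is.numeric(max), length(max) == 1, !is.na(max)
  )
  invisible(TRUE)
}

df <- data.frame(x = c(5, 8, 9, 8, 10, 4, 1, 3))

# Valid arguments pass silently
check.args(df, "x", vector("list", 8), max = 8)

# "y" is not a column of df, so the check raises an error, caught here
ok <- tryCatch(check.args(df, "y", vector("list", 8), max = 8),
               error = function(e) FALSE)
```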

How do I vectorize a while loop?
