Question

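I have a character array that holds the column names and values for one row of a data frame. Unfortunately, if the value of a particular entry is zero, its column name and value are not listed in the array. I use this information to build the data frame I want, but I rely on a for loop to do it.

I would like to use plyr to avoid the for loop in the working code below.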

types <- c("one", "two", "three")      # My data
entry <- c("one(1)", "three(2)")       # My data


values <- function(entry, types)
{
  # One row of zeros, one column per type
  frame <- setNames(as.data.frame(matrix(0, ncol = length(types), nrow = 1)), types)

  for (s1 in seq_along(entry))
  {
    name <- gsub("\\(\\w*\\)", "", entry[s1])                      # get name
    quantity <- as.numeric(unlist(strsplit(entry[s1], "[()]"))[2]) # get value

    frame[1, which(colnames(frame) == name)] <- quantity           # store
  }
  return(frame)
}

values(entry, types)                # This is how I want the output to look

I have tried the following to split the array, but I can't figure out how to get adply to return a single row.

library(plyr)                            # provides adply

types <- c("one", "two", "three")        # data
entry <- c("one(1)", "three(2)")         # data

frame <- setNames(as.data.frame(matrix(0, ncol = length(types), nrow = 1)), types)

array_split <- function(entry, frame){

  name <- gsub("\\(\\w*\\)", "", entry)                         # get name
  quantity <- as.numeric(unlist(strsplit(entry, "[()]"))[2])    # get value
  frame[1, which(colnames(frame)==name)] <- quantity            # store
  return(frame)
}

adply(entry, 1, array_split, frame)

Should I be looking at something like cumsum? I would like the operation to run quickly.

Answer 1

I'm not sure why you wouldn't just do something like this:

frame <- setNames(rep(0,length(types)),types)
a <- as.numeric(sapply(strsplit(entry,"[()]"),`[[`,2))
names(a) <- gsub("\\(\\w*\\)", "", entry)
frame[names(a)] <- a

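Both gsub and strsplit are already vectorized, so there is no need for an explicit loop anywhere. You only need sapply to extract the second element of each strsplit result. The rest is just regular indexing. Note that frame here is a named numeric vector rather than a data frame.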
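
If a one-row data frame is needed, as in the question, the same vectorized steps can be wrapped in a function and the result converted at the end. The following is only a sketch: the function name values_vec and the final as.data.frame(as.list(...)) conversion are my own additions, not part of the answer above, and it assumes every name in entry also appears in types.

```r
# Hypothetical helper (not from the original answer): wraps the vectorized
# steps in a function and returns a one-row data frame.
values_vec <- function(entry, types) {
  # Start with a named vector of zeros, one element per type
  out <- setNames(rep(0, length(types)), types)

  # gsub() and strsplit() both operate on the whole character vector at once
  nm  <- gsub("\\(\\w*\\)", "", entry)
  qty <- as.numeric(sapply(strsplit(entry, "[()]"), `[[`, 2))

  # Assign by name; types that do not occur in entry keep their zero.
  # Assumption: every name in nm is present in types.
  out[nm] <- qty

  # Convert the named vector to a one-row data frame
  as.data.frame(as.list(out))
}

types <- c("one", "two", "three")
entry <- c("one(1)", "three(2)")
result <- values_vec(entry, types)
```

Because the work is done on whole vectors, neither a for loop nor a plyr call is required; the conversion to a data frame happens once, after all values are in place.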

For loop to plyr function
