R - Storing cbind() results cumulatively in a loop, and a possible lapply solution to replace the double loop

Time: 2017-09-07 09:31:44

Tags: r for-loop cbind

Following @Ryan's suggestion, I found a solution to my earlier question with this code:

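library(rvest) # provides read_html(), html_nodes(), html_text() and %>%

# `url` and `headers` must already exist (they are not defined in this snippet)
for (i in seq_along(url)){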

  webpage <- read_html(url[i]) #loop through URL list to access html data

  fac_data <- html_nodes(webpage,'.tableunder')  %>% html_text()
  fac_data1 <- html_nodes(webpage,'.tableunder1')  %>% html_text()
  fac_data <- c(fac_data, fac_data1) #Store table data on each URL in a variable 

  x <- fac_data %>% matrix(ncol = length(headers[[i]]), byrow=TRUE) #make matrix to extract column data

  for (j in seq_along(headers[[i]])){
    y <- cbind(x[,j]) #extract column data and store in temporary variable
    colnames(y) <- as.character(headers[[i]][j]) #add column name
    # print each column in sequence; storing it with 'z <- cbind(y)' instead
    # overwrites z on every iteration
    print(cbind(y))
  }
}

I can now print all the values, with the matching header filled in for each column.

Some follow-up questions:

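  1. How can I save the output of cbind(y) cumulatively in a data.frame or a list? Assigning cbind(y) inside the loop overwrites the value on every pass, which leaves me with only the last column of the last table, like this: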

         退休年月
    [1,] " 82年8月"

  2. None of these variations work:

    z[[x]][j] <- cbind(y)
    
    > source('~/Google 云端硬盘/R/scrapeFaculty.R')
    Error in `*tmp*`[[x]] : attempt to select more than one element
    
    z[j] <- cbind(y)
    
    > source('~/Google 云端硬盘/R/scrapeFaculty.R')
    There were 13 warnings (use warnings() to see them)
    
    z[[j]] <- cbind(y)
    
    > source('~/Google 云端硬盘/R/scrapeFaculty.R')
    Error in z[[j]] <- cbind(y) : more elements supplied than there are to replace
    
  3. Can the double for loop be replaced with a simple lapply() call that solves the problems above?

Edit:

Here is the final code I used to solve this problem:

      ntu.hist <- vector("list", length(url)) # pre-allocate the result list
      
      for (i in seq_along(url)){
      
        webpage <- read_html(url[i])
      
        fac_data <- html_nodes(webpage,'.tableunder')  %>% html_text()
        fac_data1 <- html_nodes(webpage,'.tableunder1')  %>% html_text()
        fac_data <- c(fac_data, fac_data1)
      
        x <- fac_data %>% matrix(ncol = length(headers[[i]]), byrow=TRUE) #make matrix to extract column data
        y <- cbind(x[,1:length(headers[[i]])]) #extract column data
        colnames(y) <- as.character(headers[[i]]) #add column names
        ntu.hist[[i]] <- y #accumulate the results in a list
      
      }
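
A possible explanation for the failed attempts in item 2, sketched under one assumption: that `z` was created as an atomic vector rather than a list (the question does not show how `z` was initialised). An atomic vector holds one value per position, so `z[[j]] <- cbind(y)` errors and `z[j] <- cbind(y)` keeps only the first value and warns. In the first attempt, `x` is the whole matrix rather than a position, so `z[[x]]` cannot select a single element. A list, like `ntu.hist` in the final code, can hold a whole matrix per position. The names `z_list` and the toy column below are illustrative only.

```r
# Assumption: z starts as an atomic vector; the question does not show this.
y <- cbind(c("a", "b", "c"))   # a 3 x 1 character matrix, like one scraped column
colnames(y) <- "col1"

z <- character(0)

# `[[<-` on an atomic vector accepts exactly one value, so this raises an error.
double_bracket <- tryCatch({ z[[1]] <- y; NULL }, error = function(e) e)

# `[<-` with one index keeps only the first value and emits a warning.
suppressWarnings(z[1] <- y)

# A list can store the whole matrix in a single position.
z_list <- vector("list", 2)
z_list[[1]] <- y

print(inherits(double_bracket, "error"))
print(z)
print(z_list[[1]])
```

This is why assigning with `[[i]] <-` into a list accumulates the tables, while the same assignment into a plain vector does not.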
      

2 answers:

Answer 0: (score: 0)

I wonder whether you could select several columns for cbind at once instead of looping. Would either of these syntax options help?

y <- data.frame(col1=c(1:3),col2=c(4:6),col3=c(7:9))

cbind(y[,c(1:3)])

  col1 col2 col3
1    1    4    7
2    2    5    8
3    3    6    9

#In R, you can use ":" to specify a range. So 1,2,3,4 is equal to 1:4.
#If you don't want number 3 in that range, you can use c(1,2,4).

#For example:

cbind(y[,c(1,3)])

  col1 col3
1    1    7
2    2    8
3    3    9
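
The same idea carries over to a matrix like the one built in the question. The sketch below uses a made-up 3 x 3 character matrix and a hypothetical header vector `hdr` in place of the scraped data; `drop = FALSE` is added so the result stays a matrix even when a single column is selected.

```r
# Hypothetical stand-ins for the scraped matrix `x` and one element of `headers`.
x <- matrix(as.character(1:9), ncol = 3, byrow = TRUE)
hdr <- c("col1", "col2", "col3")
colnames(x) <- hdr   # name every column in one step, no inner loop needed

# Select several columns at once by name.
sub <- x[, c("col1", "col3"), drop = FALSE]

# Selecting one column would normally return a plain vector;
# drop = FALSE keeps it as a one-column matrix with its name.
one <- x[, "col2", drop = FALSE]

print(sub)
print(one)
```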

Answer 1: (score: 0)

Here is the final code:

ntu.hist <- vector("list", length(url)) # pre-allocate the result list

for (i in seq_along(url)){

  webpage <- read_html(url[i])

  fac_data <- html_nodes(webpage,'.tableunder')  %>% html_text()
  fac_data1 <- html_nodes(webpage,'.tableunder1')  %>% html_text()
  fac_data <- c(fac_data, fac_data1)

  x <- fac_data %>% matrix(ncol = length(headers[[i]]), byrow=TRUE) #make matrix to extract column data
  y <- cbind(x[,1:length(headers[[i]])]) #extract column data
  colnames(y) <- as.character(headers[[i]]) #add column names
  ntu.hist[[i]] <- y #accumulate the results in a list

}
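
On the lapply() part of the question: the loop above could also be written without an explicit index. The sketch below is one way to do it, not the asker's code. It uses `Map()` (a wrapper around `mapply()`) because two lists are walked in parallel. To keep it runnable offline, `cells` is a made-up stand-in for the `fac_data` vector scraped from each URL, the `headers` values are invented, and `to_table` is a hypothetical helper name.

```r
# Hypothetical stand-ins: each element of `cells` plays the role of the
# character vector `fac_data` scraped from one URL; `headers` holds the
# column names of the corresponding table.
cells <- list(
  c("Ann", "1990", "Bob", "1995"),
  c("Cy", "Math", "2001")
)
headers <- list(c("name", "year"), c("name", "dept", "retired"))

# Build one named matrix per table from its cell text and its headers.
to_table <- function(cell_text, hdr) {
  m <- matrix(cell_text, ncol = length(hdr), byrow = TRUE)
  colnames(m) <- as.character(hdr)
  m
}

# Map() calls to_table() on matching elements of both lists
# and returns the results as a list, so nothing is overwritten.
tables <- Map(to_table, cells, headers)

# Optionally turn each matrix into a data.frame.
frames <- lapply(tables, as.data.frame, stringsAsFactors = FALSE)

print(tables[[1]])
print(frames[[2]])
```

In the real scraper, `cells` would presumably come from an `lapply()` over `url` that wraps the `read_html()`, `html_nodes()` and `html_text()` calls shown above.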