Looping over URLs in R: Too many open files

Time: 2018-06-07 23:43:26

Tags: r for-loop rstudio k-means rtvs

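I have a file containing URLs. The idea is to download the image from each URL, extract a 6-color palette, get the color names and percentages, and bind them all into a list next to the product number. But I get a "Too many open files" error.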

library(readxl)
library(jpeg)
library(scales)
library(plotrix)
library(gridExtra)
library(dplyr)
library(data.table)
dataset = read_excel("C:/Temp/Product.xlsx", sheet = "All")
datalist = list()
nRowsDf <- nrow(dataset)
avector <- as.vector(dataset$URL)
varenummer <- as.vector(dataset$Varenr)
for (i in 1:nRowsDf) {
  tryCatch({
    # Convert this from data.frame to vector
    Sku <- as.vector(varenummer[[i]])
    download.file(avector[[i]], paste(Sku, ".jpg", sep = ""), mode = "wb")
    painting <- readJPEG(paste(Sku, ".jpg", sep = ""))

    dimension <- dim(painting)
    painting_rgb <- data.frame(
      x = rep(1:dimension[2], each = dimension[1]),
      y = rep(dimension[1]:1, dimension[2]),
      R = as.vector(painting[,, 1]), # slicing array into RGB channels
      G = as.vector(painting[,, 2]),
      B = as.vector(painting[,, 3])
    )

    k_means = kmeans(painting_rgb[, c("R", "G", "B")], algorithm = "Lloyd", centers = 6, iter.max = 300)
    test = (sapply(rgb(k_means$centers), color.id))

    Color = lapply(test, `[[`, 1)
    Values = k_means$size
    Percentage = k_means$size / sum(k_means$size)
    Final = do.call(rbind, Map(data.frame, Color = lapply(test, `[[`, 1), Values = k_means$size, ProductNumber = Sku, Percentage = Percentage))
    Final$i <- i # iteration
    datalist[[i]] <- Final # add iteration to list
    big_data = rbindlist(datalist)
    #grid.table(big_data)
    write.table(big_data, file = "myDF.csv", sep = ",", col.names = TRUE, append = TRUE)

    #R = Final[with(Final, order(-Percentage)),]
  }, error = function(e) { closeAllConnections() })
  closeAllConnections()

}
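
The palette step inside the loop can be tried in isolation, without any downloads. The following is a minimal sketch under stated assumptions: the image is a synthetic array rather than a JPEG, it uses two clusters instead of six, and the `color.id` naming step from plotrix is left out so that only base R is needed. The names `img`, `pixels`, `km`, and `pal` are illustrative.

```r
set.seed(42)  # kmeans() picks its starting centers at random

# Synthetic 20 x 30 RGB "image": left half pure red, right half pure blue.
h <- 20
w <- 30
img <- array(0, dim = c(h, w, 3))
img[, 1:15, 1] <- 1    # red channel, left half
img[, 16:30, 3] <- 1   # blue channel, right half

# One row per pixel, one column per channel.
pixels <- data.frame(
  R = as.vector(img[, , 1]),
  G = as.vector(img[, , 2]),
  B = as.vector(img[, , 3])
)

km <- kmeans(pixels, centers = 2, iter.max = 50)

# Cluster centers become hex colors; cluster sizes become shares of the image.
pal <- data.frame(
  Hex = rgb(km$centers),
  Values = km$size,
  Percentage = km$size / sum(km$size),
  stringsAsFactors = FALSE
)
pal <- pal[order(-pal$Percentage), ]
print(pal)
```

With a real image, `img` would be the array returned by `readJPEG()`, and `centers` would be 6 as in the loop.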

The code stops after downloading about 266 unique JPEG images.

This code only downloads JPG files; if any other file type is returned, it is ignored.

Error:

Error in file(file, ifelse(append, "a", "w")) : 
cannot open the connection
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
cannot open file 'myDF.csv': Too many open files

If I remove the tryCatch, I get this:

Error in download.file(avector[[i]], "image.jpg", mode = "wb") : 
cannot open destfile 'image.jpg', reason 'Too many open files'
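
When chasing an error like this, it can help to see how many connections the R session currently holds. Base R's `showConnections()` lists the user-opened connections. The sketch below only illustrates the idea on a temporary file; it does not reproduce the failing loop, and the variable names are illustrative.

```r
tmp <- tempfile(fileext = ".csv")

before <- nrow(showConnections())   # user connections open right now

con <- file(tmp, open = "w")        # explicitly open a file connection
during <- nrow(showConnections())   # the new connection is listed here

writeLines("a,b", con)
close(con)                          # release the connection
after <- nrow(showConnections())

cat("before:", before, "during:", during, "after:", after, "\n")
```

Printing `nrow(showConnections())` once per loop iteration would show whether the count grows from one image to the next.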

1 Answer:

Answer 0: (score: 0)

The code has a bug, or better said an unnecessary step, that keeps connections open until the limit on open files is reached.

Just remove the iteration step and the rbind of the data list, and it runs perfectly.

Modified version:

library(imager) # assumption: load.image() comes from imager; the answer does not name the package

for (i in 1:nRowsDf) {
  tryCatch({
    # Convert this from data.frame to vector
    Sku <- as.vector(varenummer[[i]]) # for testing use 23406
    download.file(avector[[i]], paste(Sku, ".jpg", sep = ""), mode = "wb")
    # painting <- readJPEG(paste(Sku, ".jpg", sep = ""))

    painting = load.image(paste(Sku, ".jpg", sep = ""))
    dimension <- dim(painting)
    painting_rgb <- data.frame(
      x = rep(1:dimension[2], each = dimension[1]),
      y = rep(dimension[1]:1, dimension[2]),
      R = as.vector(painting[,, 1]), # slicing our array into three channels
      G = as.vector(painting[,, 2]),
      B = as.vector(painting[,, 3])
    )

    k_means = kmeans(painting_rgb[, c("R", "G", "B")], algorithm = "Lloyd", centers = 6, iter.max = 300)
    test = (sapply(rgb(k_means$centers), color.id))

    Color = lapply(test, `[[`, 1)
    Values = k_means$size
    Percentage = k_means$size / sum(k_means$size)
    Final = do.call(rbind, Map(data.frame, Color = lapply(test, `[[`, 1), Values = k_means$size, ProductNumber = Sku, Percentage = Percentage))
    #Final$i <- i # maybe you want to keep track of which iteration produced it?
    #datalist[[i]] <- Final # add it to your list
    #big_data = rbindlist(datalist)
    #grid.table(big_data)
    # Write only this product's rows instead of the growing combined table
    write.table(Final, file = "myDF.csv", sep = ",", col.names = TRUE, append = TRUE)

    #R = Final[with(Final, order(-Percentage)),]
  }, error = function(e) { closeAllConnections() })
  closeAllConnections()

}
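
One side effect of appending with `col.names = TRUE` on every iteration is that the header row is repeated for each product. A possible refinement, sketched here under assumptions, is to write the header only when the file is first created. `make_final()` is a hypothetical stand-in for the per-image work that produces `Final`, and the output path and product numbers are placeholders.

```r
# Hypothetical stand-in for the per-image result (`Final` in the loop).
make_final <- function(sku) {
  data.frame(
    Color = c("gray", "white"),
    Values = c(60L, 40L),
    ProductNumber = sku,
    Percentage = c(0.6, 0.4),
    stringsAsFactors = FALSE
  )
}

out_file <- tempfile(fileext = ".csv")  # placeholder for "myDF.csv"
skus <- c("1001", "1002", "1003")       # placeholder product numbers

for (sku in skus) {
  first_write <- !file.exists(out_file)
  write.table(
    make_final(sku),
    file = out_file,
    sep = ",",
    row.names = FALSE,
    col.names = first_write,  # header only when the file is created
    append = !first_write
  )
}

combined <- read.csv(out_file, stringsAsFactors = FALSE)
print(combined)
```

Reading the file back with `read.csv()` then gives one table with a single header and two rows per product.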