Question

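I am working on an R script that prepares an image directory for the flow_from_directory function of the keras framework. In that directory, the images are separated into one folder per target class, for example a folder for cats and a folder for dogs, as in the following structure.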

data-|
|    +-train
|    |     +-dog
|    |     +-cat
|    +-validation
|    |     +-dog
|    |     +-cat
|    +-test
|    |     +-dog
|    |     +-cat

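My data already sits in the folders train, test and validation. The image classes are digits from 0 to 7 and are stored in txt files. The file name tells which txt file belongs to which image, e.g. 1.png <-> 1.txt -> "0". My first attempt was written straightforwardly with the R I already know. But I read that for loops are not very common in R, so I tried another way. What is still missing in this script is the final loop over each subdirectory.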

# Regular expressions for the file extensions (list.files() treats pattern as a regex)
image_postfix <- "\\.png$"
label_postfix <- "\\.txt$"


# Base directory
data_directory <- "C:/data/deep_learning - Kopie/aufgabe_1"
# Subdirectories with the data
data_subdir <- c("train", "validation", "test")
# Resulting directory that contains only the categorical folders
keras_directory <- "image_category"

# get image files from the train directory
image_files <- list.files(file.path(data_directory,data_subdir[1]), pattern = image_postfix)
# get txt files from the train directory
label_files <- list.files(file.path(data_directory,data_subdir[1]), pattern = label_postfix)
# get the plain file names (without extension)
file_name <- gsub( pattern = "(.*)\\..*", replacement = "\\1", label_files)

# read the first character (the class digit) of each txt file in the directory
labels <- mapply(readChar,file.path(data_directory,data_subdir[1],label_files),nchars = 1)

names(labels) <- file_name

# create a data frame
train_data <- data.frame(image_files, label_files, labels)

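# get a vector with the distinct classes
# (levels() returns NULL when labels is a character column, the default since R 4.0)
classes <- sort(unique(as.character(train_data$labels)))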

# create the subdirectory for keras
dir.create(file.path(data_directory,data_subdir[1],keras_directory), showWarnings = FALSE)
# create the directory for each class
lapply(file.path(data_directory,data_subdir[1],keras_directory, classes), dir.create, showWarnings = FALSE)

# loop over each class
for (idx in classes) {
    # get the data for that class
    train_data_by_label <- subset(train_data, train_data$labels == idx)
    # and copy the image file in the resulting directory  
    file.copy(
         file.path(data_directory,data_subdir[1],train_data_by_label$image_files),
         file.path(data_directory,data_subdir[1],keras_directory,idx,train_data_by_label$image_files)
    )
}

What I have now is the following code.

# Regular expressions for the file extensions (list.files() treats pattern as a regex)
image_postfix <- "\\.png$"
label_postfix <- "\\.txt$"

# Base Directory
data_directory <- "C:/data/deep_learning - Kopie/aufgabe_1"
# Subdirectories for the data
data_subdir <- c("train", "validation", "test")

keras_directory <- "image_category"

# creates a list with each subdirectory as name and the images it contains
# (SIMPLIFY = FALSE keeps the result a list even if all folders hold the same number of files)
image_files <- mapply(list.files,file.path(data_directory,data_subdir), pattern = image_postfix, SIMPLIFY = FALSE)

# creates a list with each subdirectory as name and the txt files it contains
label_files <- mapply(list.files,file.path(data_directory,data_subdir), pattern = label_postfix, SIMPLIFY = FALSE)

# creates a list with each subdirectory as name and the resulting plain file names
file_names <- lapply(X = label_files,FUN = gsub,  pattern = "(.*)\\..*", replacement = "\\1")

# this is where my knowledge of R runs out

# Read each file in label_files, using the list name as path and the files it contains
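
The missing step could be sketched roughly as follows. This is only a sketch under stated assumptions: sort_images_by_label is a hypothetical helper name, every label file N.txt is assumed to have its image N.png in the same folder, and the first character of the label file is assumed to be the class. The helper handles a single subdirectory, so applying it to every entry of data_subdir takes the place of the last loop.

```r
# Hypothetical helper (not part of the original script): sorts the images of
# one subdirectory into <subdir_path>/<keras_directory>/<class>/ folders.
sort_images_by_label <- function(subdir_path, keras_directory = "image_category") {
  # label files such as "1.txt"; the matching image is assumed to be "1.png"
  label_files <- list.files(subdir_path, pattern = "\\.txt$")
  base_names  <- tools::file_path_sans_ext(label_files)
  image_files <- paste0(base_names, ".png")

  # assumption: the first character of each label file is the class (0 to 7)
  labels <- vapply(file.path(subdir_path, label_files), readChar,
                   character(1), nchars = 1, USE.NAMES = FALSE)

  # keep only the labels whose image actually exists
  keep        <- file.exists(file.path(subdir_path, image_files))
  image_files <- image_files[keep]
  labels      <- labels[keep]

  # one target folder per class, created together with the keras directory
  target_dirs <- file.path(subdir_path, keras_directory, labels)
  for (d in unique(target_dirs)) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }

  # file.copy() is vectorised over from/to, so no loop over the classes is needed
  copied <- file.copy(file.path(subdir_path, image_files),
                      file.path(target_dirs, image_files))

  # small summary: which image went to which class, and whether the copy succeeded
  data.frame(image = image_files, label = labels, copied = copied)
}
```

With the variables defined above, it could then be applied to all three folders with lapply(file.path(data_directory, data_subdir), sort_images_by_label). Note that file.copy() does not overwrite by default, so a second run reports copied = FALSE for files that are already in place.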

Data structure for image_files, label_files and file_names

Thanks for any suggestions!

R: Preparing a keras image directory for image_flow_from_directory

0 answers: