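I'm working on an R script that prepares an image directory for the Keras function flow_from_directory. In that directory, the images are separated into one folder per target class, e.g. a folder for cats and a folder for dogs, as in the following structure.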
data
 +-train
 |  +-dog
 |  +-cat
 +-validation
 |  +-dog
 |  +-cat
 +-test
    +-dog
    +-cat
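For illustration, a layout like the one above can be created in a few lines of R. This is only a sketch: the root under tempdir() and the class names dog and cat are placeholders taken from the example tree, not my real data.

```r
# Sketch: build the split/class folder layout under a throw-away root.
# `root`, `splits` and `classes` are illustrative placeholders.
root <- file.path(tempdir(), "data")
splits <- c("train", "validation", "test")
classes <- c("dog", "cat")

# every split/class combination, one row each
layout <- expand.grid(split = splits, class = classes, stringsAsFactors = FALSE)
dirs <- file.path(root, layout$split, layout$class)

# recursive = TRUE also creates the missing parent folders
created <- vapply(dirs, dir.create, logical(1),
                  recursive = TRUE, showWarnings = FALSE)

# list the resulting tree relative to the root
list.dirs(root, full.names = FALSE, recursive = TRUE)
```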
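My data already exists in the folders train, test and validation. The image classes are numbers from 0 to 7 and are stored in txt files. The file name tells which txt file belongs to which image, e.g. 1.png <-> 1.txt -> "0". My first attempt was a straightforward one using the R I already know. But I read that for loops are not very common in R, so I tried another approach. What is missing in this first script is a final loop over each subdirectory.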
image_postfix <- ".png"
label_postfix <- ".txt"
# Base directory
data_directory <- "C:/data/deep_learning - Kopie/aufgabe_1"
# Subdirectories with the data
data_subdir <- c("train", "validation", "test")
# Resulting directory with only the categorical folders
keras_directory <- "image_category"
# get image files from the train directory
image_files <- list.files(file.path(data_directory,data_subdir[1]), pattern = image_postfix)
# get txt files from the train directory
label_files <- list.files(file.path(data_directory,data_subdir[1]), pattern = label_postfix)
# get the plain file names (without extension)
file_name <- gsub(pattern = "(.*)\\..*", replacement = "\\1", label_files)
# read the first character of each txt file in the directory
labels <- mapply(readChar, file.path(data_directory, data_subdir[1], label_files), nchars = 1)
names(labels) <- file_name
# create a data frame
train_data <- data.frame(image_files, label_files, labels)
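# get a vector with the classes (labels is a character column unless
# stringsAsFactors = TRUE, in which case levels() would be needed instead)
classes <- sort(unique(as.character(train_data$labels)))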
# create the subdirectory for keras
dir.create(file.path(data_directory,data_subdir[1],keras_directory), showWarnings = FALSE)
# create the directory for each class
lapply(file.path(data_directory,data_subdir[1],keras_directory, classes), dir.create, showWarnings = FALSE)
# loop over each class
for (idx in classes) {
  # get the data for that class
  train_data_by_label <- subset(train_data, labels == idx)
  # and copy the image files into the resulting directory
  file.copy(
    file.path(data_directory, data_subdir[1], train_data_by_label$image_files),
    file.path(data_directory, data_subdir[1], keras_directory, idx, train_data_by_label$image_files)
  )
}
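One way the missing loop over the subdirectories could look is to wrap the per-directory steps in a function and apply it to each path. The following is a minimal sketch, not a tested solution for my real data: the helper name sort_images_by_label is made up, it assumes every N.txt has a matching N.png, and it runs on throw-away files under tempdir().

```r
# Hypothetical helper: sort the images of ONE directory into class folders.
sort_images_by_label <- function(dir, target = "image_category",
                                 image_ext = "png", label_ext = "txt") {
  label_files <- list.files(dir, pattern = paste0("\\.", label_ext, "$"))
  ids <- tools::file_path_sans_ext(label_files)
  # the first character of each txt file is the class (0 to 7)
  labels <- vapply(file.path(dir, label_files), readChar, character(1),
                   nchars = 1, USE.NAMES = FALSE)
  # assumption: each label file N.txt has an image N.png next to it
  image_files <- paste0(ids, ".", image_ext)
  class_dirs <- file.path(dir, target, labels)
  for (d in unique(class_dirs)) {
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  copied <- file.copy(file.path(dir, image_files),
                      file.path(class_dirs, image_files))
  invisible(data.frame(image = image_files, label = labels, copied = copied))
}

# Demo data; `base` stands in for the real base directory.
base <- file.path(tempdir(), "aufgabe_1_demo")
subdirs <- c("train", "validation", "test")
for (s in subdirs) {
  dir.create(file.path(base, s), recursive = TRUE, showWarnings = FALSE)
  for (i in 1:3) {
    file.create(file.path(base, s, paste0(i, ".png")))
    writeLines(as.character(i %% 2), file.path(base, s, paste0(i, ".txt")))
  }
}

# The "last loop": run the same steps for every subdirectory.
results <- lapply(file.path(base, subdirs), sort_images_by_label)
names(results) <- subdirs
```

Keeping the per-directory logic in one function means the outer step is a single lapply call, which would also work unchanged if more subdirectories were added.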
This is the code I have so far.
image_postfix <- ".png"
label_postfix <- ".txt"
# Base Directory
data_directory <- "C:/data/deep_learning - Kopie/aufgabe_1"
# Subdirectories for the data
data_subdir <- c("train", "validation", "test")
keras_directory <- "image_category"
# creates a list with each subdirectory path as name and the images it contains
# (SIMPLIFY = FALSE keeps a list even if all directories hold the same number of files)
image_files <- mapply(list.files, file.path(data_directory, data_subdir), pattern = image_postfix, SIMPLIFY = FALSE)
# creates a list with each subdirectory path as name and the txt files it contains
label_files <- mapply(list.files, file.path(data_directory, data_subdir), pattern = label_postfix, SIMPLIFY = FALSE)
# creates a list with each subdirectory path as name and the resulting plain file names
file_names <- lapply(X = label_files, FUN = gsub, pattern = "(.*)\\..*", replacement = "\\1")
# now I've run out of R knowledge
# TODO: read each file in label_files, using the list name as the path and the list entries as the file names
[Figure: data structure of image_files, label_files and file_names]
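To make the open step concrete: given a named list like label_files (directory path as name, txt file names as entries), the labels could perhaps be read with Map(), which walks over the names and the entries in parallel. This is a hedged sketch on demo files under tempdir(); the paths and label values are invented for the example.

```r
# Demo data: two label files per subdirectory (values are made up).
base <- file.path(tempdir(), "labels_demo")
subdirs <- c("train", "validation", "test")
for (s in subdirs) {
  dir.create(file.path(base, s), recursive = TRUE, showWarnings = FALSE)
  for (i in 1:2) {
    writeLines(as.character(i + 2), file.path(base, s, paste0(i, ".txt")))
  }
}
dirs <- file.path(base, subdirs)

# named list: one vector of txt file names per directory path
label_files <- sapply(dirs, list.files, pattern = "\\.txt$", simplify = FALSE)

# Map() pairs each directory path with its vector of file names
labels <- Map(function(dir, files) {
  out <- vapply(file.path(dir, files), readChar, character(1),
                nchars = 1, USE.NAMES = FALSE)
  # name each label by its plain file name, e.g. "1" for 1.txt
  names(out) <- tools::file_path_sans_ext(files)
  out
}, names(label_files), label_files)

str(labels)
```

The result keeps the same outer structure as label_files, so labels[[path]]["1"] gives the class of 1.png in that directory.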
Thanks for all your suggestions!