I have a folder full of spectra files. The number of files can vary depending on the measurements and repetitions. So far, this works:
files <- list.files(pattern = "^Q\\d+")
print(files)
print(files) gives:
  [1] "Q010101N.001" "Q010101N.002" "Q010101N.003" "Q010101N.004" "Q010101N.005" "Q010101N.006"
  [7] "Q010101N.007" "Q010101N.008" "Q010101N.009" "Q010101N.010" "Q010101N.011" "Q010101N.012"
 [13] "Q010101N.013" "Q010101N.014" "Q010101N.015" "Q010101N.016" "Q010101N.017" "Q010101N.018"
 [19] "Q010101N.019" "Q010101N.020" "Q010101N.021" "Q010101N.022" "Q010101N.023" "Q010101N.024"
 [25] "Q010101N.025" "Q021101N.001" "Q021101N.002" "Q021101N.003" "Q021101N.004" "Q021101N.005"
 [31] "Q021101N.006" "Q021101N.007" "Q021101N.008" "Q021101N.009" "Q021101N.010" "Q021101N.011"
 [37] "Q021101N.012" "Q021101N.013" "Q021101N.014" "Q021101N.015" "Q021101N.016" "Q021101N.017"
 [43] "Q021101N.018" "Q021101N.019" "Q021101N.020" "Q021101N.021" "Q021101N.022" "Q021101N.023"
 [49] "Q021101N.024" "Q021101N.025" "Q031201N.001" "Q031201N.002" "Q031201N.003" "Q031201N.004"
 [55] "Q031201N.005" "Q031201N.006" "Q031201N.007" "Q031201N.008" "Q031201N.009" "Q031201N.010"
 [61] "Q031201N.011" "Q031201N.012" "Q031201N.013" "Q031201N.014" "Q031201N.015" "Q031201N.016"
 [67] "Q031201N.017" "Q031201N.018" "Q031201N.019" "Q031201N.020" "Q031201N.021" "Q031201N.022"
 [73] "Q031201N.023" "Q031201N.024" "Q031201N.025" "Q041301N.001" "Q041301N.002" "Q041301N.003"
 [79] "Q041301N.004" "Q041301N.005" "Q041301N.006" "Q041301N.007" "Q041301N.008" "Q041301N.009"
 [85] "Q041301N.010" "Q041301N.011" "Q041301N.012" "Q041301N.013" "Q041301N.014" "Q041301N.015"
 [91] "Q041301N.016" "Q041301N.017" "Q041301N.018" "Q041301N.019" "Q041301N.020" "Q041301N.021"
 [97] "Q041301N.022" "Q041301N.023" "Q041301N.024" "Q041301N.025" "Q051401N.001" "Q051401N.002"
[103] "Q051401N.003" "Q051401N.004" "Q051401N.005" "Q051401N.006" "Q051401N.007" "Q051401N.008"
[109] "Q051401N.009" "Q051401N.010" "Q051401N.011" "Q051401N.012" "Q051401N.013" "Q051401N.014"
[115] "Q051401N.015" "Q051401N.016" "Q051401N.017" "Q051401N.018" "Q051401N.019" "Q051401N.020"
[121] "Q051401N.021" "Q051401N.022" "Q051401N.023" "Q051401N.024" "Q051401N.025" "Q061501N.001"
[127] "Q061501N.002" "Q061501N.003" "Q061501N.004" "Q061501N.005" "Q061501N.006" "Q061501N.007"
[133] "Q061501N.008" "Q061501N.009" "Q061501N.010" "Q061501N.011" "Q061501N.012" "Q061501N.013"
[139] "Q061501N.014" "Q061501N.015" "Q061501N.016" "Q061501N.017" "Q061501N.018" "Q061501N.019"
[145] "Q061501N.020" "Q061501N.021" "Q061501N.022" "Q061501N.023" "Q061501N.024" "Q061501N.025"
[151] "Q071601N.001" "Q071601N.002" "Q071601N.003" "Q071601N.004" "Q071601N.005" "Q071601N.006"
[157] "Q071601N.007" "Q071601N.008" "Q071601N.009" "Q071601N.010" "Q071601N.011" "Q071601N.012"
[163] "Q071601N.013" "Q071601N.014" "Q071601N.015" "Q071601N.016" "Q071601N.017" "Q071601N.018"
[169] "Q071601N.019" "Q071601N.020" "Q071601N.021" "Q071601N.022" "Q071601N.023" "Q071601N.024"
[175] "Q071601N.025" "Q081701N.001" "Q081701N.002" "Q081701N.003" "Q081701N.004" "Q081701N.005"
[181] "Q081701N.006" "Q081701N.007" "Q081701N.008" "Q081701N.009" "Q081701N.010" "Q081701N.011"
[187] "Q081701N.012" "Q081701N.013" "Q081701N.014" "Q081701N.015" "Q081701N.016" "Q081701N.017"
[193] "Q081701N.018" "Q081701N.019" "Q081701N.020" "Q081701N.021" "Q081701N.022" "Q081701N.023"
[199] "Q081701N.024" "Q081701N.025" "Q091801N.001" "Q091801N.002" "Q091801N.003" "Q091801N.004"
[205] "Q091801N.005" "Q091801N.006" "Q091801N.007" "Q091801N.008" "Q091801N.009" "Q091801N.010"
[211] "Q091801N.011" "Q091801N.012" "Q091801N.013" "Q091801N.014" "Q091801N.015" "Q091801N.016"
[217] "Q091801N.017" "Q091801N.018" "Q091801N.019" "Q091801N.020" "Q091801N.021" "Q091801N.022"
[223] "Q091801N.023" "Q091801N.024" "Q091801N.025" "Q101901N.001" "Q101901N.002" "Q101901N.003"
[229] "Q101901N.004" "Q101901N.005" "Q101901N.006" "Q101901N.007" "Q101901N.008" "Q101901N.009"
[235] "Q101901N.010" "Q101901N.011" "Q101901N.012" "Q101901N.013" "Q101901N.014" "Q101901N.015"
[241] "Q101901N.016" "Q101901N.017" "Q101901N.018" "Q101901N.019" "Q101901N.020" "Q101901N.021"
[247] "Q101901N.022" "Q101901N.023" "Q101901N.024" "Q101901N.025" "Q112001N.001" "Q112001N.002"
[253] "Q112001N.003" "Q112001N.004" "Q112001N.005" "Q112001N.006" "Q112001N.007" "Q112001N.008"
[259] "Q112001N.009" "Q112001N.010" "Q112001N.011" "Q112001N.012" "Q112001N.013" "Q112001N.014"
[265] "Q112001N.015" "Q112001N.016" "Q112001N.017" "Q112001N.018" "Q112001N.019" "Q112001N.020"
[271] "Q112001N.021" "Q112001N.022" "Q112001N.023" "Q112001N.024" "Q112001N.025" "Q124101N.001"
[277] "Q124101N.002" "Q124101N.003" "Q124101N.004" "Q124101N.005" "Q124101N.006" "Q124101N.007"
[283] "Q124101N.008" "Q124101N.009" "Q124101N.010" "Q124101N.011" "Q124101N.012" "Q124101N.013"
[289] "Q124101N.014" "Q124101N.015" "Q124101N.016" "Q124101N.017" "Q124101N.018" "Q124101N.019"
[295] "Q124101N.020" "Q124101N.021" "Q124101N.022" "Q124101N.023" "Q124101N.024" "Q124101N.025"
[301] "Q134201N.001" "Q134201N.002" "Q134201N.003" "Q134201N.004" "Q134201N.005" "Q134201N.006"
[307] "Q134201N.007" "Q134201N.008" "Q134201N.009" "Q134201N.010" "Q134201N.011" "Q134201N.012"
[313] "Q134201N.013" "Q134201N.014" "Q134201N.015" "Q134201N.016" "Q134201N.017" "Q134201N.018"
[319] "Q134201N.019" "Q134201N.020" "Q134201N.021" "Q134201N.022" "Q134201N.023" "Q134201N.024"
[325] "Q134201N.025" "Q144301N.001" "Q144301N.002" "Q144301N.003" "Q144301N.004" "Q144301N.005"
[331] "Q144301N.006" "Q144301N.007" "Q144301N.008" "Q144301N.009" "Q144301N.010" "Q144301N.011"
[337] "Q144301N.012" "Q144301N.013" "Q144301N.014" "Q144301N.015" "Q144301N.016" "Q144301N.017"
[343] "Q144301N.018" "Q144301N.019" "Q144301N.020" "Q144301N.021" "Q144301N.022" "Q144301N.023"
[349] "Q144301N.024" "Q144301N.025" "Q154401N.001" "Q154401N.002" "Q154401N.003" "Q154401N.004"
[355] "Q154401N.005" "Q154401N.006" "Q154401N.007" "Q154401N.008" "Q154401N.009" "Q154401N.010"
[361] "Q154401N.011" "Q154401N.012" "Q154401N.013" "Q154401N.014" "Q154401N.015" "Q154401N.016"
[367] "Q154401N.017" "Q154401N.018" "Q154401N.019" "Q154401N.020" "Q154401N.021" "Q154401N.022"
[373] "Q154401N.023" "Q154401N.024" "Q154401N.025" "Q164501N.001" "Q164501N.002" "Q164501N.003"
[379] "Q164501N.004" "Q164501N.005" "Q164501N.006" "Q164501N.007" "Q164501N.008" "Q164501N.009"
[385] "Q164501N.010" "Q164501N.011" "Q164501N.012" "Q164501N.013" "Q164501N.014" "Q164501N.015"
[391] "Q164501N.016" "Q164501N.017" "Q164501N.018" "Q164501N.019" "Q164501N.020" "Q164501N.021"
[397] "Q164501N.022" "Q164501N.023" "Q164501N.024" "Q164501N.025" "Q174601N.001" "Q174601N.002"
[403] "Q174601N.003" "Q174601N.004" "Q174601N.005" "Q174601N.006" "Q174601N.007" "Q174601N.008"
[409] "Q174601N.009" "Q174601N.010" "Q174601N.011" "Q174601N.012" "Q174601N.013" "Q174601N.014"
[415] "Q174601N.015" "Q174601N.016" "Q174601N.017" "Q174601N.018" "Q174601N.019" "Q174601N.020"
[421] "Q174601N.021" "Q174601N.022" "Q174601N.023" "Q174601N.024" "Q174601N.025"
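So in this case I get 425 spectra files, with 25 repetitions per sample. However, the total number of files may be different another time, and it can also happen that one sample has 10 repetitions while the rest have, for example, 14.

So I would like to group the files by sample, with the repetitions of each sample forming one subset. In this case I would get 17 subsets.

I need to import the files, which I have already done successfully for all spectra files at once: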
list.data <- list()
# import all spectra files
for (i in seq_along(files)) {
  list.data[[i]] <- read.csv(files[i])
}
Now that I want subsets, I assume the import has to work somewhat differently?
Answer 0 (score: 0)
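You can do this with a helper function and iteration. I used dplyr, purrr, and stringi. This puts all of your files into a single data frame. Afterwards you can manipulate it however you like.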
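library(dplyr)
library(purrr)
library(stringi)

# Read one spectra file and tag each row with its sample and repetition.
read_spectra <- function(file) {
  file_name <- basename(file)
  read.csv(file) %>%
    mutate(
      # Leading letter plus digits, e.g. "Q010101" from "Q010101N.001"
      sample = stri_extract_first_regex(file_name, "([A-Z][0-9]+)(?=.)"),
      # Digits after the dot, e.g. "001"
      repetition = stri_extract_first_regex(file_name, "(?<=\\.)(\\d+)")
    ) %>%
    select(sample, repetition, everything())
}

# Read every file and row-bind the results into one data frame.
full_data <- map_df(files, read_spectra)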
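Helper function: read_spectra reads a single file and adds the sample and repetition parsed from its file name as columns. list.files gets the files. The iteration uses map_df() from purrr to apply read_spectra to each file in files and binds all of the results into one data frame.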
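If separate per-sample subsets are wanted rather than one long data frame, the grouping could also be sketched in base R. This is only a sketch under assumptions that are not stated in the thread: the sample ID is taken to be everything before the first dot in the file name, the demo files are made-up two-column CSVs written to a temporary directory, and demo_dir, sample_id, files_by_sample, and data_by_sample are illustrative names.

```r
# Demo setup (hypothetical data): 3 repetitions for one sample, 2 for another.
demo_dir <- file.path(tempdir(), "spectra_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_names <- c(sprintf("Q010101N.%03d", 1:3), sprintf("Q021101N.%03d", 1:2))
for (nm in demo_names) {
  write.csv(data.frame(wavelength = 1:4, intensity = runif(4)),
            file.path(demo_dir, nm), row.names = FALSE)
}

# Same pattern as in the question; full.names = TRUE makes the paths
# independent of the working directory.
files <- list.files(demo_dir, pattern = "^Q\\d+", full.names = TRUE)

# Assumption: the sample ID is the part of the file name before the first dot.
sample_id <- sub("\\..*$", "", basename(files))

# One character vector of file paths per sample; group sizes may differ.
files_by_sample <- split(files, sample_id)

# Nested list: data_by_sample[["Q010101N"]][[2]] is the 2nd repetition.
data_by_sample <- lapply(files_by_sample, function(f) lapply(f, read.csv))

# Number of repetitions found for each sample.
print(lengths(files_by_sample))
```

Because split() groups by whatever IDs it finds, the number of samples and the number of repetitions per sample do not need to be known in advance. With the data frame from the answer, split(full_data, full_data$sample) would give a comparable list of per-sample data frames.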