Read selected files from a directory based on selection criteria in R

Date: 2015-07-02 15:22:17

Tags: r file concatenation paste

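I want to read only selected .txt files from a folder to build one large table. I have more than 9K files and want to import only those with a chosen distance and building type, both of which are encoded in the file name.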

For example, I first want to select the files whose names contain "_U0" and "_0_Final.txt":

Type = c(0,1)
D3Test = 1
Distance = c(0,50,150,300,650,800)
D2Test = 1;

files <- list.files(path=data.folder, pattern=paste("*U", Type[D3Test],"*_",Distance[D2Test],"_Final.txt",sep=""))

But the result is empty. Is there something wrong with the way I construct the pattern?

 filename <- scan(what="")
 "M10_F1_T1_D1_U0_H1_0_Final.txt"   "M10_F1_T1_D1_U0_H1_150_Final.txt" "M10_F1_T1_D1_U0_H1_300_Final.txt"
 "M10_F1_T1_D1_U0_H1_50_Final.txt"  "M10_F1_T1_D1_U0_H1_650_Final.txt" "M10_F1_T1_D1_U0_H1_800_Final.txt"
 "M10_F1_T1_D1_U0_H2_0_Final.txt"   "M10_F1_T1_D1_U0_H2_150_Final.txt" "M10_F1_T1_D1_U0_H2_300_Final.txt"
 "M10_F1_T1_D1_U0_H2_50_Final.txt"  "M10_F1_T1_D1_U0_H2_650_Final.txt" "M10_F1_T1_D1_U0_H2_800_Final.txt"
 "M10_F1_T1_D1_U0_H3_0_Final.txt"   "M10_F1_T1_D1_U0_H3_150_Final.txt" "M10_F1_T1_D1_U0_H3_300_Final.txt"
 "M10_F1_T1_D1_U0_H3_50_Final.txt"  "M10_F1_T1_D1_U0_H3_650_Final.txt" "M10_F1_T1_D1_U0_H3_800_Final.txt"
 "M10_F1_T1_D1_U1_H1_0_Final.txt"   "M10_F1_T1_D1_U1_H1_150_Final.txt" "M10_F1_T1_D1_U1_H1_300_Final.txt"
 "M10_F1_T1_D1_U1_H1_50_Final.txt"  "M10_F1_T1_D1_U1_H1_650_Final.txt" "M10_F1_T1_D1_U1_H1_800_Final.txt"

3 Answers:

Answer 0 (score: 2)

You should look at the string that is actually passed to pattern:

"*U0*_0_Final.txt"

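That will not pick up any of these file names. In a regular expression the asterisk is a quantifier, not a wildcard: "U0*_" means a "U", then zero or more "0" characters, then an underscore, so nothing can stand in for the "_H1" part between "U0" and "_0_Final.txt". If the T and D in the file names do not stand for "Type" and "Distance", then this builds the correct pattern: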

grep( pattern=paste0("_U", Type[D3Test],".*_", Distance[D2Test],"_Final\\.txt"), filename)
#-----------
#[1]  1  7 13   So matches 3 filenames

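Note that you need to escape (with two backslashes) any period that should match only a literal period, because the period is a special character in regular expressions. You also need to use ".*" to allow for gaps in the pattern.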
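The difference between the two patterns can be checked on a handful of the names from the question. This is a minimal sketch: the four-element `filename` vector is a subset chosen for illustration, and the trailing `$` anchor is an addition that is not part of the answer above.

```r
# A few names copied from the question
filename <- c("M10_F1_T1_D1_U0_H1_0_Final.txt",
              "M10_F1_T1_D1_U0_H1_50_Final.txt",
              "M10_F1_T1_D1_U0_H2_0_Final.txt",
              "M10_F1_T1_D1_U1_H1_0_Final.txt")

Type <- c(0, 1)
D3Test <- 1
Distance <- c(0, 50, 150, 300, 650, 800)
D2Test <- 1

# Pattern as built in the question: "0*" is "zero or more 0", not "anything"
bad <- paste("*U", Type[D3Test], "*_", Distance[D2Test], "_Final.txt", sep = "")

# Corrected pattern: ".*" spans the "_H1" part and "\\." is a literal period.
# The "$" end anchor is an extra assumption added here, not from the answer.
good <- paste0("_U", Type[D3Test], ".*_", Distance[D2Test], "_Final\\.txt$")

bad_hits  <- grep(bad, filename, value = TRUE)
good_hits <- grep(good, filename, value = TRUE)

print(bad_hits)
print(good_hits)
```

The same `good` string can be passed as the `pattern` argument of `list.files()`, since that argument is also a regular expression.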

Answer 1 (score: 2)

Another approach is to use sprintf and grepl:

x <- c("M10_F1_T1_D1_U0_H1_150_Final.txt", "M10_F1_T1_D1_U0_H2_650_Final.txt", "M10_F1_T1_D1_U1_H1_650_Final.txt")

x[grepl(sprintf("U%i_H%i_%i", 1, 1, 650), x)]

[1] "M10_F1_T1_D1_U1_H1_650_Final.txt"
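The sprintf approach can be wrapped in a small function when only the type and distance are known and the H index should be left open. This is a sketch under assumptions: `select_files` is a hypothetical helper name, and the `_H[0-9]+_` part assumes the H field is always one or more digits, as in the names shown in the question.

```r
x <- c("M10_F1_T1_D1_U0_H1_150_Final.txt",
       "M10_F1_T1_D1_U0_H2_650_Final.txt",
       "M10_F1_T1_D1_U1_H1_650_Final.txt")

# Hypothetical helper: keep the names that match a given type and distance
select_files <- function(files, type, distance) {
  # %d inserts the integers; "[0-9]+" accepts any H index;
  # "\\." is a literal period and "$" anchors the end of the name
  pattern <- sprintf("_U%d_H[0-9]+_%d_Final\\.txt$", type, distance)
  files[grepl(pattern, files)]
}

sel <- select_files(x, type = 0L, distance = 650L)
print(sel)
```

Keeping the underscores around each number in the pattern prevents a distance such as 50 from matching inside 650.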

Answer 2 (score: 0)

files <- list.files(path=data.folder, pattern=paste("*U", Type[D3Test], "....",Distance[D2Test], sep=""))

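I modified my code, and this version works! Basically, the idea is to use one dot for each character between Type[D3Test] and Distance[D2Test], since the number of characters between the two is fixed at 4.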
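The fixed-width approach depends on there being exactly four characters between the two values. The sketch below compares it with a variable-width pattern. The third file name, with a two-digit H index, is hypothetical and does not appear in the question, and the leading "*" from the code above is left out here because it has nothing to repeat.

```r
x <- c("M10_F1_T1_D1_U0_H1_0_Final.txt",
       "M10_F1_T1_D1_U0_H1_50_Final.txt",
       "M10_F1_T1_D1_U0_H10_0_Final.txt")  # hypothetical name, not from the question

# Four dots: exactly four arbitrary characters between "U0" and the distance
fixed <- paste("U", 0, "....", 0, sep = "")

# ".*" instead of dots: any number of characters between the two fields
flexible <- "_U0_.*_0_Final\\.txt$"

fixed_hits    <- x[grepl(fixed, x)]
flexible_hits <- x[grepl(flexible, x)]

print(fixed_hits)
print(flexible_hits)
```

As long as every file name keeps the same field widths, both patterns select the same files. The variable-width form only matters if a field can grow.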

Thanks to: http://www.cheatography.com/davechild/cheat-sheets/regular-expressions/