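How do I force MATLAB to read the files in a folder in numerical order?

Date: 2014-08-15 16:56:40

Tags: matlab directory dir

The files in my folder are numbered from writer_1 to writer_20. I wrote code that reads all of them and stores them in a cell array. The problem is that the files are not read in numerical order.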

folders = dir(Path_training);
folders(ismember( {folders.name}, {'.', '..'}) ) = []; %Remove these two from list
training = [];
for i = 1:length(folders)
    current_folder = [Path_training folders(i).name '\'];
.
.
.
.
.

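Here folders(1).name is writer_1 and folders(2).name is writer_10.

I know that dir returns its results the way Explorer does, but is there any way to force it to order them numerically?

I am training an SVM based on these numbers, and this problem is making that very difficult.

4 Answers:

Answer 0: (score: 2)

I don't know of a direct solution to the problem you are having. I did find a solution to a similar problem, here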

List = dir('*.png');
Name = {List.name};
S = sprintf('%s,', Name{:});    % '10.png,100.png,1000.png,20.png, ...'
D = sscanf(S, '%d.png,');       % [10; 100; 1000; 20; ...]
[sortedD, sortIndex] = sort(D); % Sort numerically
sortedName = Name(sortIndex);   % Apply sorting index to original names

The differences are:

  1. You are dealing with directories rather than files
  2. Your directory names contain other text in addition to the numbers
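
Both differences can be absorbed by changing the sscanf format string. The following is a minimal sketch, assuming every name has exactly the form writer_N; the names cell array is a hypothetical stand-in for {folders.name}, not output from a real directory.

```matlab
% Hypothetical folder names standing in for {folders.name} (assumption)
names = {'writer_1', 'writer_10', 'writer_2', 'writer_20'};

% Join the names into one comma-terminated string
S = sprintf('%s,', names{:});      % 'writer_1,writer_10,writer_2,writer_20,'

% The literal prefix 'writer_' is matched and skipped; %d reads the number
D = sscanf(S, 'writer_%d,');       % column vector of numeric suffixes

% Sort numerically and apply the order to the original names
[~, sortIndex] = sort(D);
sortedNames = names(sortIndex);
```

To reorder the struct array itself rather than just the names, the same index could be applied as folders = folders(sortIndex). A name that does not match the writer_N pattern would make sscanf stop early, so this only suits folders with uniform names.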

Answer 1: (score: 1)

Approach #1

%// From your code
folders = dir(Path_training);
folders(ismember( {folders.name}, {'.', '..'}) ) = []

%// Convert folders struct to a cell array with all of the data from dir
folders_cellarr = struct2cell(folders)

%// Get filenames
fn = folders_cellarr(1,:)

%// Get numeral part and sorted indices
num=str2double(cellfun(@(x) strjoin(regexp(x,['\d'],'match'),''), fn(:),'uni',0))
[~,ind] = sort(num)

%// Re-arrange folders based on the sorted indices
folders = folders(ind)

Approach #2

If you want to avoid struct2cell, here is an alternative approach:

%// Get filenames
fn = cellstr(ls(Path_training))
fn(ismember(fn,{'.', '..'}))=[]

%// Get numeral part and sorted indices
num=str2double(cellfun(@(x) strjoin(regexp(x,['\d'],'match'),''), fn(:),'uni',0))
[~,ind] = sort(num)

%// List directory and re-arrange the elements based on the sorted indices
folders = dir(Path_training);
folders(ismember( {folders.name}, {'.', '..'}) ) = []
folders = folders(ind)

Please note that strjoin is a recent addition to MATLAB. If you are on an older version of MATLAB, see the source code link on the MATLAB File Exchange.
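
If strjoin is not available, one possible workaround is to strip the non-digit characters with regexprep instead of joining the matched digits. This is a sketch of a swapped-in technique, not the answer's own code; fn here is a hypothetical list of names standing in for the folder listing.

```matlab
% Hypothetical folder names standing in for the directory listing (assumption)
fn = {'writer_1'; 'writer_10'; 'writer_2'};

% Remove every non-digit character, then convert what remains to a number
num = str2double(regexprep(fn, '\D', ''));

% Sort numerically and apply the order to the names
[~, ind] = sort(num);
sortedFn = fn(ind);
```

Like the strjoin version, this concatenates every digit in a name, so a name containing two separate numbers would be read as a single longer number.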

Answer 2: (score: 1)

Here is a slightly different way (edited to fix a bug and to implement @Divakar's suggestion of eliminating the for loop)

folders = dir(Path_training);
folders(ismember( {folders.name}, {'.', '..'}) ) = [];

%// Get folder numbers as cell array of strings
folder_nums_cell = regexp({folders.name}, '\d*', 'match');

%// Convert cell array to vector of numbers
folder_nums = str2double(vertcat(folder_nums_cell{:}));

%// Sort original folder array
[~,inds] = sort(folder_nums);
folders = folders(inds);

Answer 3: (score: 1)

Stealing a bit from DavidS, and assuming your folders are all of the form "writer_XX", where XX is a number.

folders = dir([pwd '\temp']);
folders(ismember( {folders.name}, {'.', '..'}) ) = [];

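% extract the first run of digits from each name as a cell array of strings
% ('once' makes regexp return strings rather than nested cells, which
% str2double would otherwise convert to NaN)
foldersNumCell = regexp({folders.name}, '\d+', 'match', 'once');

% convert from cell array of strings to double
foldersNumber = str2double(foldersNumCell);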

% get sort order
[garbage,sortI] = sort(foldersNumber);

% rearrange the structure
folders = folders(sortI);

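The advantage of this is that it avoids the for loop. In practice it only makes a difference if you have tens of thousands of folders. (I created 50,000 folders labeled 'writer_1' to 'writer_50000'. The difference in execution time was about 1.2 seconds.)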
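
A comparison along these lines can be approximated without creating any folders. The sketch below is an assumption about how such a measurement might be set up, not the answer's actual benchmark: it times only the number extraction on synthetic names, and the timings will vary by machine and MATLAB version.

```matlab
% Synthetic names standing in for {folders.name}; no folders are created
n = 50000;
names = arrayfun(@(k) sprintf('writer_%d', k), 1:n, 'UniformOutput', false);

% Loop-based extraction: one regexp and str2double call per name
tic;
numsLoop = zeros(1, n);
for k = 1:n
    numsLoop(k) = str2double(regexp(names{k}, '\d+', 'match', 'once'));
end
tLoop = toc;

% Vectorized extraction: a single call over the whole cell array
tic;
numsVec = str2double(regexp(names, '\d+', 'match', 'once'));
tVec = toc;

fprintf('loop: %.3f s, vectorized: %.3f s\n', tLoop, tVec);
```

Both variants should produce the same vector of numbers; only the elapsed time differs.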