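I have many data sets in .csv format that I have organized with a file naming standard, so that I can pick them apart with regular expressions later. However, I've run into a small problem. My data files have names like "2012001_C335_2000MHZ_P_1111.CSV". There are four years of interest, two frequencies, and four different C335-style labels that describe location. I do a fair amount of data processing on each of these files, so I want to read them all into one giant structure and then process different parts of it. This is what I wrote: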
for ix_id = 1:length(ids)
  for ix_year = 1:2:length(ids_years{ix_id})
    for ix_frq = 1:length(frqs)
      st = [ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{ix_id}{ix_frq} '_P_1111.CSV'];
      data.(ids_frqs{ix_id}{ix_frq}).(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}]) = ...
        dlmread(st);
    end
  end
end
All of the ids variables are 1x4 cell arrays, where each cell contains a string.
This produces the errors "error: a cs-list cannot be further indexed" and "error: invalid assignment to cs-list outside multiple assignment".
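I searched the internet for these errors and found a few posts dated between 2010 and 2012, such as this one and this one, where the authors suggest that it is a problem with Octave itself. I can work around it by defining two separate structures: I remove the innermost for loop over ix_frq and replace the lines beginning with "st" and "data" with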
data1500.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}]) = ...
  dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_1500MHZ_P_1111.CSV']);
data2000.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}]) = ...
  dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_2000MHZ_P_1111.CSV']);
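So the trouble seems to arise when I try to create a more deeply nested structure. I'd like to know whether this is unique to Octave or the same in Matlab, and whether there is a slicker workaround than defining two separate structures, since I want this to be as portable as possible. If you have any insight into what the error messages mean, I'd be interested in that too. Thanks!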
Edit: here is the complete script, which now generates some dummy .csv files. It was run on Octave v.3.8.
clear all
%this program tests the creation of various structures. The end goal is to have a structure of the format frequency.beamname.year(1) = matrix of the appropriate file
A = rand(3,2);
csvwrite('2009103_C115_1500MHZ.CSV',A)
csvwrite('2009103_C115_2000MHZ.CSV',A)
csvwrite('2010087_C115_1500MHZ.CSV',A)
csvwrite('2010087_C115_2000MHZ.CSV',A)
csvwrite('2009103_C335_1500MHZ.CSV',A)
csvwrite('2009103_C335_2000MHZ.CSV',A)
csvwrite('2010087_C335_1500MHZ.CSV',A)
csvwrite('2010087_C335_2000MHZ.CSV',A)
data = dir('*.CSV'); %imports all of the files of a directory
files = {data.name}; %cell array of filenames
nfiles = numel(files);
%find all the years
years = unique(cellfun(@(x)x{1},regexp(files,'\d{7}','match'),'UniformOutput',false));
%find all the beam names
ids = unique(cellfun(@(x)x{1},regexp(files,'([C-I]\d{3})|([C-I]\d{1}[C-I]\d{2})','match'),'UniformOutput',false));
%find all the frequencies
frqs = unique(cellfun(@(x)x{1},regexp(files,'\d{4}MHZ','match'),'UniformOutput',false));
%now, vectorize to cover all the beams
for id_ix = 1:length(ids)
  expression_yrs = ['(\d{7})(?=_' ids{id_ix} ')'];
  listl_yrs = regexp(files,expression_yrs,'match');
  ids_years{id_ix} = unique(cellfun(@(x)x{1},listl_yrs(cellfun(@(x)~isempty(x),listl_yrs)),'UniformOutput',false)); %returns the years for data collected with both the 1500 and 2000 MHZ antennas along each of the beams
  expression_frqs = ['(?<=' ids{id_ix} '_)(\d{4}MHZ)'];
  listfrq = regexp(files,expression_frqs,'match'); %finds every frequency that was collected for C115, C335
  ids_frqs{id_ix} = unique(cellfun(@(x)x{1},listfrq(cellfun(@(x)~isempty(x),listfrq)),'UniformOutput',false));
end
%% finally, dynamically generate a structure data.Frequency.Beam.Year
%this works
for ix_id = 1:length(ids)
  for ix_year = 1:length(ids_years{ix_id})
    data1500.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{1}{1} '.CSV']);
    data2000.(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{1}{2} '.CSV']);
  end
end
%this doesn't work
for ix_id=1:length(ids)
  for ix_year=1:length(ids_years{ix_id})
    for ix_frq = 1:numel(frqs)
      data.(['F' ids_frqs{ix_id}{ix_frq}]).(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{ix_id}{ix_frq} '.CSV']);
    end
  end
end
Hopefully this helps clarify the issue. I'm not sure of the etiquette for posting edits and code here.
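Answer 0 (score: 1)
The problem is that by the time you reach the for loop that fails, data already exists and is a struct array (it was assigned the output of dir near the top of the script).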
octave> data
data =
8x1 struct array containing the fields:
name
date
bytes
isdir
datenum
statinfo
When you select a field from a struct array, you get a cs-list (comma-separated list) unless you also index which of the structs in the array you mean. See:
octave> data.name
ans = 2009103_C115_1500MHZ.CSV
ans = 2009103_C115_2000MHZ.CSV
ans = 2009103_C335_1500MHZ.CSV
ans = 2009103_C335_2000MHZ.CSV
ans = 2010087_C115_1500MHZ.CSV
ans = 2010087_C115_2000MHZ.CSV
ans = 2010087_C335_1500MHZ.CSV
ans = 2010087_C335_2000MHZ.CSV
octave> data(1).name
ans = 2009103_C115_1500MHZ.CSV
So when you do this:
data.(...) = dlmread (...);
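you don't get what you want on the left-hand side; you get a cs-list. I'm guessing this is accidental, though, since data currently only holds the file listing, so just create a new empty struct: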
data = struct (); # this will clear your previous data
for ix_id=1:length(ids)
  for ix_year=1:length(ids_years{ix_id})
    for ix_frq = 1:numel(frqs)
      data.(['F' ids_frqs{ix_id}{ix_frq}]).(ids{ix_id}).(['Y' ids_years{ix_id}{ix_year}])=dlmread([ids_years{ix_id}{ix_year} '_' ids{ix_id} '_' ids_frqs{ix_id}{ix_frq} '.CSV']);
    end
  end
end
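If you want to see the difference in isolation, here is a minimal sketch. The variable s and the field names are made up for illustration, and the two-element struct array only stands in for what dir returns. The nested assignment is expected to fail while s is a struct array and to succeed once s has been reset to a scalar struct.

```octave
% Hypothetical minimal reproduction; s and the field names are illustrative only.
s = struct('name', {'a.csv', 'b.csv'});  % 1x2 struct array, similar to the output of dir()

failed = false;
try
  % Selecting a field of a struct array yields a cs-list, so the nested
  % assignment below is expected to raise an error.
  s.F1500MHZ.C115 = 1;
catch
  failed = true;
end

s = struct();            % reset to a scalar (1x1) struct with no fields
s.F1500MHZ.C115 = 1;     % the same nested assignment now works
```

After this runs, failed should be true and s.F1500MHZ.C115 should hold 1.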
I'd also suggest rethinking your current solution. This code looks overly complicated.
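As one possible simplification, and only a sketch, you could loop over the files once and pull the year, beam and frequency out of each name with regexp tokens. The names tmpd, listing and tok are my own, the dummy files are there only to make the example self-contained, and the pattern assumes every beam label starts with a letter so that it is a valid field name.

```octave
% Sketch only: tmpd, listing, tok and the dummy files are illustrative assumptions.
tmpd = tempname();
mkdir(tmpd);
A = rand(3, 2);
csvwrite(fullfile(tmpd, '2009103_C115_1500MHZ.CSV'), A);
csvwrite(fullfile(tmpd, '2010087_C335_2000MHZ.CSV'), A);

listing = dir(fullfile(tmpd, '*.CSV'));  % keep the listing separate from data
data = struct();                         % scalar struct that will hold the matrices
for k = 1:numel(listing)
  fname = listing(k).name;
  % tokens: {year, beam, frequency}, e.g. {'2009103', 'C115', '1500MHZ'}
  tok = regexp(fname, '^(\d{7})_([A-Z][A-Z0-9]*)_(\d{4}MHZ)', 'tokens', 'once');
  if isempty(tok)
    continue;                            % skip names that do not follow the standard
  end
  data.(['F' tok{3}]).(tok{2}).(['Y' tok{1}]) = dlmread(fullfile(tmpd, fname));
end
```

With this layout a matrix is reached as, for example, data.F1500MHZ.C115.Y2009103. Because the pattern is anchored only at the start of the name, it should also accept names with a trailing part such as _P_1111.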