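I have five different .csv files that I need to extract data from. After extracting the data from each file, I want to put that data into a table.
My .csv directory: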
2015.csv
2014.csv
2013.csv
2012.csv
2011.csv
My attempt so far:
csvfiles = dir('.../*.csv');
dataArray = {}
table = table(dataArray{1:end-1}, 'VariableNames', {'WEEK','2015','2014', '2013', '2012', '2011'});
for file = csvfiles
delimiter = ',';
startRow = 2;
formatSpec = '%f%f%f%f%f%f%f%f%f%f%f%[^\n\r]';
fileID = fopen(file,'r'); %
dataArray = {dataArray, textscan(fileID, formatSpec, 'Delimiter', delimiter, 'HeaderLines' ,startRow-1, 'ReturnOnError', false)};
fclose(fileID);
extracting_data = file.column1 + file.column3
end
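However, not only does fopen receive an invalid argument (file) when creating fileID, but I am also unsure how to extract the data and store it in a table. I can use file(1).name to make the fopen call accept its argument, but then textscan() raises the error "Invalid file identifier. Use fopen to generate a valid file identifier."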
Basically, my goal is to:
1. Open each file in the directory.
2. Extract all necessary data from the known columns
3. Put all 52 values inside that file into their own column (one column per file).
Edit 1:
Here is an update of my code. I changed the data structure of dataArray to a matrix.
csvfiles = dir('/.../Data/*.csv');
data_matrix = zeros(52, 5); % Create empty matrix and format the matrix like the table.
iter = 0;
for file = 1:numel(csvfiles)
iter = iter + 1;
delimiter = ',';
startRow = 2;
formatSpec = '%f%f%f%f%f%f%f%f%f%f%f%[^\n\r]';
fileID = fopen(csvfiles(file).name,'r');
data = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'HeaderLines' ,startRow-1, 'ReturnOnError', false);
fclose(fileID);
extracted_data = file.column1 + file.column2; % I make sure to use the column headers.
% Add all 52 values from the extracted data to a single column.
data_matrix[:,iter] = influenza_a;
end
%% Create output table.
Influenza = table(data_matrix{1:end-1}, 'VariableNames',{'WEEK','2015','2014','2013','2012','2011'});
Answer 0 (score: 1)
I have explained the changes I made to the code in the comments %1) to %5); everything should now match.
%1) The path is needed again later, so it is put in a separate variable. Also
%use only two dots here.
directory='/Users/user/folder/folder/MATLAB/folder/Data';
csvfiles = dir(fullfile(directory,'*.csv'));
%2) numel(csvfiles) to avoid unnecessary constants.
data_matrix = zeros(52, numel(csvfiles)); % Create empty matrix and format the matrix like the table.
iter = 0;
for file = 1:numel(csvfiles)
iter = iter + 1;
delimiter = ',';
startRow = 2;
formatSpec = '%f%f%f%f%f%f%f%f%f%f%f%[^\n\r]';
%3) Use absolute path here, otherwise file is not found
filename= fullfile(directory,csvfiles(file).name);
fileID = fopen(filename,'r');
%4) Inserted error check
if fileID<0
error('failed to open file %s',filename)
end
data = textscan(fileID, formatSpec, 'Delimiter', delimiter, 'HeaderLines' ,startRow-1, 'ReturnOnError', false);
fclose(fileID);
% textscan returns a cell array with one cell per column, so index it with braces.
extracted_data = data{1} + data{2};
% Add all 52 values from the extracted data to a single column.
%5) Indexing was wrong, should be right this way:
data_matrix(:,file) = extracted_data;
end
%% Create output table.
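% data_matrix is numeric, so use array2table instead of brace indexing; WEEK is assumed to be 1..52.
% The 'Y' prefix keeps the names valid identifiers; the label order assumes dir lists 2015.csv first.
Influenza = array2table([(1:52)', data_matrix], 'VariableNames',{'WEEK','Y2015','Y2014','Y2013','Y2012','Y2011'});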
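To check the whole approach end to end, here is a minimal self-contained sketch. It is not your real data: the scratch folder, the two tiny CSV files, the two-column format and the 'Y' name prefix are all assumptions made up for the example. It writes the files, reads them back with the same dir / fopen / textscan loop, and takes the column names from the file names so that the labels cannot drift out of step with the order in which dir returns the files.

```matlab
% Hypothetical test data: two small CSV files in a scratch folder.
directory = tempname;                 % made-up temporary folder name
mkdir(directory);
years = [2014 2015];
for k = 1:numel(years)
    fid = fopen(fullfile(directory, sprintf('%d.csv', years(k))), 'w');
    fprintf(fid, 'a,b\n');                            % one header line
    fprintf(fid, '%d,%d\n', [1 2 3; 10 20 30] * k);   % three data rows
    fclose(fid);
end

% Same structure as the code above, with 3 rows instead of 52.
csvfiles = dir(fullfile(directory, '*.csv'));
nRows = 3;
data_matrix = zeros(nRows, numel(csvfiles));
names = cell(1, numel(csvfiles));
for file = 1:numel(csvfiles)
    filename = fullfile(directory, csvfiles(file).name);
    fileID = fopen(filename, 'r');
    if fileID < 0
        error('failed to open file %s', filename)
    end
    data = textscan(fileID, '%f%f', 'Delimiter', ',', 'HeaderLines', 1);
    fclose(fileID);
    % textscan returns one cell per column; add the first two columns.
    data_matrix(:, file) = data{1} + data{2};
    % Build the column name from the file name, e.g. '2014.csv' -> 'Y2014'.
    [~, base] = fileparts(csvfiles(file).name);
    names{file} = ['Y' base];
end

% Prepend a WEEK column (assumed to be 1..nRows) and convert to a table.
Influenza = array2table([(1:nRows)', data_matrix], ...
    'VariableNames', [{'WEEK'}, names]);
disp(Influenza)

rmdir(directory, 's');                % remove the scratch folder again
```

With your real files you would drop the part that writes the test data, set nRows to 52 and use your own formatSpec; the rest should carry over unchanged, assuming every file really contains 52 data rows.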