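Reading large text files into MATLAB

Time: 2015-04-28 18:10:51

Tags: matlab import text-files

I am trying to write a function that reads a large number (1000+) of text files ('.txt') into MATLAB. A snippet of one file is shown below. The actual files have the same columns but about 150,000 rows.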

Start, Serial, DeviceId, RunNumber, Date, Real, Elapsed, X, EcgVal, EcgStatus, CapnoVal, CapnoStatus, P1Val, P1Status, P2Val, P2Status, P3Val, P3Status, Spo2Val, Spo2Status, CprDepth, CprFrequency, CprStatus, CprWaveVal, FiltEcgVal, FiltEcgStatus, Ecg2Val, Ecg2Status, Ecg3Val, Ecg3Status, Ecg4Val, Ecg4Status
2013-01-01 23:51:12, 00017711, TEMS ACP272, , 01-01-2013, 23:51:12.000, 00:00:00.000, 41275.993889, 0.000000, -1, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0, 0.000000, 0.000000, 1, 0.000000, 1, 0.000000, 1, 0.000000, 1
2013-01-01 23:51:12, 00017711, TEMS ACP272, , 01-01-2013, 23:51:12.008, 00:00:00.008, 41275.993889, 0.000000, -1, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0, 0.000000, 0.000000, 1, 0.000000, 1, 0.000000, 1, 0.000000, 1
2013-01-01 23:51:12, 00017711, TEMS ACP272, , 01-01-2013, 23:51:12.016, 00:00:00.016, 41275.993889, 0.000000, -1, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0, 0.000000, 0.000000, 1, 0.000000, 1, 0.000000, 1, 0.000000, 1
2013-01-01 23:51:12, 00017711, TEMS ACP272, , 01-01-2013, 23:51:12.024, 00:00:00.024, 41275.993889, 0.000000, -1, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0.000000, 0, 0, 0.000000, 0.000000, 1, 0.000000, 1, 0.000000, 1, 0.000000, 1

I have tried the obvious approaches (csvread, dlmread, importdata) without success. When I open the file with the importdata function, I get:

þS

followed by 5 blank lines. Using

fid = fopen('TEST.txt','r');
fgetl(fid)

I found that there is an empty line between every pair of data lines, and a space between every pair of characters.
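These symptoms (a stray 'þ' at the start, a gap between every character, an extra blank line after each row) are what one would typically see when a UTF-16 file is read one byte at a time. That is an inference from the output above, not something confirmed for these files. A minimal sketch for checking the first two bytes for a byte-order mark follows; the function name guessUtf16 is made up for illustration.

```matlab
function enc = guessUtf16(fname)
% guessUtf16  Hypothetical helper: infer UTF-16 byte order from the BOM.
% Returns 'UTF-16LE', 'UTF-16BE', or '' when no UTF-16 BOM is present.
fid = fopen(fname, 'r');
if fid == -1
    error('Could not open %s', fname);
end
bom = fread(fid, 2, 'uint8=>double')';   % first two raw bytes, as a row
fclose(fid);

if isequal(bom, [255 254])        % FF FE
    enc = 'UTF-16LE';
elseif isequal(bom, [254 255])    % FE FF
    enc = 'UTF-16BE';
else
    enc = '';                     % no BOM; the encoding stays unknown
end
end
```

Calling guessUtf16('TEST.txt') on one of the files would show whether this guess about the encoding holds before anything else is built on it.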

I have also tried the textscan function, as follows:

fid = fopen('TEST.txt','r');
c = textscan(fid, '%s', 'Delimiter', ',')

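but this returns an empty cell.

One workaround is to open the file in Excel and save it as a CSV file. However, since I need to do this for more than 1000 files, that is not practical.

Any comments or suggestions would be much appreciated. Thanks!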

Update:

The following seems to work:

data = textscanu('TEST.txt');
str=textscan(data{1},'%s','Delimiter',',')

I will try to generalize this to read the whole file, skip the blank lines, and organize all the columns.
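One way this generalization might look, as a rough sketch and not a tested replacement for textscanu. It assumes the files are UTF-16 little-endian (inferred from the symptoms above, not confirmed), that every character fits in a single 16-bit unit, and that all columns can be kept as strings. The function name readUtf16Csv is hypothetical.

```matlab
function cols = readUtf16Csv(fname)
% readUtf16Csv  Hypothetical helper: decode a UTF-16LE text file and split
% it into one cell of strings per comma-separated column.
fid = fopen(fname, 'r');
if fid == -1
    error('Could not open %s', fname);
end
% Read 2 bytes per character, little-endian, straight into a char row
txt = fread(fid, [1 Inf], 'uint16=>char', 0, 'ieee-le');
fclose(fid);

% Drop the byte-order mark (U+FEFF) if the file starts with one
if ~isempty(txt) && txt(1) == char(65279)
    txt(1) = [];
end

% Count the columns from the header line instead of hard-coding them
headerLine = strtok(txt, sprintf('\r\n'));
nCols = numel(strsplit(headerLine, ','));

% Read every field as a string, skipping the header line
fmt = repmat('%s', 1, nCols);
cols = textscan(txt, fmt, 'Delimiter', ',', 'HeaderLines', 1);
end
```

For the file shown, cols = readUtf16Csv('TEST.txt') would be expected to hold one cell per header field; numeric columns could then be converted with str2double. Once the text is decoded properly, the blank lines and the gaps between characters should no longer appear, so no separate step is taken here to skip them.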

1 answer:

Answer 0 (score: 0)

Method #1: Using importdata -

%// Import text data as string cells, assuming file1 is the path to text file
data = importdata(file1,'')

%// Split columns based on the delimiter: ' '
split_data = cellfun(@(x) strsplit(x,' ') , data(2:end),'Uni',0)

%// Gather data into a N x number_of_entries cell array
out_data = vertcat(split_data{:})

%// Remove the commas after each entry (if so desired)
out_data = cellfun(@(x) strrep(x,',','') , out_data,'Uni',0)

%// Remove the sixth column, which held only a stray comma (the empty field)
out_data(:,6) = []

Method #2: Using textscan -

%// Read entire text data into a cell of a cell array, 
%// assuming file1 is the path to text file
fileID = fopen(file1,'r');
onecell_data = textscan(fileID,'%s','Delimiter','\n','HeaderLines',1);
fclose(fileID);

%// Unpack one level of data to have N x 1 sized cell array
data = [onecell_data{:}]

%// Split columns based on the delimiter: ' '
%// (the header was already skipped by 'HeaderLines', so keep every row)
split_data = cellfun(@(x) strsplit(x,' ') , data,'Uni',0)

%// Gather data into a N x number_of_entries cell array
out_data = vertcat(split_data{:})

%// Remove the commas after each entry (if so desired)
out_data = cellfun(@(x) strrep(x,',','') , out_data,'Uni',0)

%// Remove the sixth column, which held only a stray comma (the empty field)
out_data(:,6) = []
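Because the question involves more than 1000 files, whichever reading method ends up working would need to be applied in a loop. A minimal sketch of such a loop follows; the function name readAllTxt and the readFcn argument are placeholders, and the per-file reader is whatever function handle the caller supplies.

```matlab
function allData = readAllTxt(folder, readFcn)
% readAllTxt  Hypothetical batch driver: apply readFcn to every .txt file
% in folder and collect the results in a cell array, one cell per file.
files = dir(fullfile(folder, '*.txt'));
allData = cell(numel(files), 1);          % preallocate one slot per file
for k = 1:numel(files)
    allData{k} = readFcn(fullfile(folder, files(k).name));
end
end
```

For example, allData = readAllTxt(myFolder, @(f) importdata(f, '')) would run the first step of Method #1 on each file, where myFolder stands for the directory that holds the text files.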