I have 30 files in the following format (the data is dummy; only the format matters):
timestamp,id
1,a
2,b
3,a
1,a
5,c
6,b
3,a
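From these 30 files, I only want the timestamps from all files, and I want to store the unique timestamps on a single line in one file. I have already written code for this in Python, but the file size is about 500 MB, so I would like to write it in MATLAB.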
Answer 0 (score: 1)
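See the following code example as one possible solution. I have not tested it, but it should give a brief overview of the functions and the algorithm you can use to solve the problem: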
% save all your filenames in one struct
dataFiles = dir('yourFileNamesHere');
% if you know how many data lines you are going to read, you should
% do a preallocation of your data struct `s` here!
currentDataLine = 1;
% loop to read all your files
for i = 1:length(dataFiles)
    fp = fopen(dataFiles(i).name);
    % skip the header line ("timestamp,id")
    fgetl(fp);
    % read the whole file content
    while ~feof(fp)
        % parse the data from one line
        currentLine = fgetl(fp);
        % stop at end of file; skip blank lines
        if ~ischar(currentLine), break; end
        if isempty(currentLine), continue; end
        % read the data line as two separate strings
        tempData = textscan(currentLine, '%s %s', 'Delimiter', ',');
        % store the data (each cell of tempData holds one string)
        s(currentDataLine).timestamp = tempData{1}{1};
        s(currentDataLine).data = tempData{2}{1};
        currentDataLine = currentDataLine + 1;
    end
    fclose(fp);
end
% when all the data is read you can use the `unique`-function
% to delete entries with an identical timestamp.
% finally store your data in one file
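The last two steps are only described in the comments above, so here is a minimal, untested-in-production sketch of how they might look. It assumes the timestamps are integers stored as character vectors in the `timestamp` field of the struct array `s`; the sample data and the output file name `unique_timestamps.txt` are made up for illustration.

```matlab
% Sample data standing in for the struct array built by the reading loop
% (assumption: each timestamp is stored as a character vector).
s = struct('timestamp', {'1', '2', '3', '1', '5', '6', '3'});

% Convert the timestamp strings to numbers and keep each value once.
% unique also returns the values sorted in ascending order.
timestamps = str2double({s.timestamp});
uniqueTimestamps = unique(timestamps);

% Build one comma-separated line and drop the trailing comma.
% '%d' assumes the timestamps are integer-valued.
lineText = sprintf('%d,', uniqueTimestamps);
lineText = lineText(1:end-1);

% Write the single line to one output file (hypothetical file name).
outName = 'unique_timestamps.txt';
fp = fopen(outName, 'w');
fprintf(fp, '%s\n', lineText);
fclose(fp);
```

Because `unique` sorts its result, the output line is in ascending order rather than in the order the timestamps were read. If the original order matters, `unique(timestamps, 'stable')` keeps the first occurrence of each value in place.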