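Extracting numeric data from the text in a data file in MATLAB

Time: 2017-11-09 13:41:00

Tags: matlab text import cell cell-array

I have a .txt data file that starts with a few lines of text comments, followed by the actual data columns. It looks like this: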

lens (mm): 150
Power (uW): 24.4
Inner circle: 56x56
Outer Square: 256x320
remarks: this run looks good            
2.450000E+1 6.802972E+7 1.086084E+6 1.055582E-5 1.012060E+0 1.036552E+0
2.400000E+1 6.866599E+7 1.088730E+6 1.055617E-5 1.021491E+0 1.039043E+0
2.350000E+1 6.858724E+7 1.086425E+6 1.055993E-5 1.019957E+0 1.036474E+0
2.300000E+1 6.848760E+7 1.084434E+6 1.056495E-5 1.017992E+0 1.034084E+0

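With importdata, MATLAB automatically separates the text from the numeric data. But how do I extract the numeric values from the text part, which is stored as a cell array? What I want to do is:

  1. Extract the numbers (e.g. 150, 24.4)
  2. If possible, extract the names ('lens', 'Power')
  3. If possible, extract the units ('mm', 'uW')

Item 1 is the most important; items 2 and 3 are optional. I am also happy to change the format of the text comments if that simplifies the code.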

1 answer:

Answer 0: (score: 1)

Assuming your sample data is saved as demo.txt, you can do the following:

function q47203382
%% Reading from file:
COMMENT_ROWS = 5;
% Read info rows:
fid = fopen('demo.txt','r'); % open for reading
txt = textscan(fid,'%s',COMMENT_ROWS,'delimiter', '\n'); txt = txt{1};
fclose(fid);
% Read data rows:
numData = dlmread('demo.txt',' ',COMMENT_ROWS,0);
%% Processing:
desc = cell(5,1);
unit = cell(2,1);
quant = cell(5,1);
for ind1 = 1:numel(txt)
  if ind1 <= 2
    [desc{ind1}, unit{ind1}, quant{ind1}] = readWithUnit(txt{ind1});
  else
    [desc{ind1},             quant{ind1}] = readWOUnit(txt{ind1});
  end
end
%% Display:
disp(desc);
disp(unit);
disp(quant);
disp(mat2str(numData));
end

function [desc, unit, quant] = readWithUnit(str)
  tmp = strsplit(str,{' ','(',')',':'});
  [desc, unit, quant] = tmp{:};
end

function [desc, quant] = readWOUnit(str)
  tmp = strtrim(strsplit(str,': '));   
  [desc, quant] = tmp{:};
end

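We read the data in two stages: textscan for the comment lines at the beginning, and dlmread for the numeric data that follows. After that, it is just a matter of splitting the text to obtain the various pieces of information.

Here is the output of the function q47203382: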

>> q47203382
    'lens'
    'Power'
    'Inner circle'
    'Outer Square'
    'remarks'

    'mm'
    'uW'

    '150'
    '24.4'
    '56x56'
    '256x320'
    'this run looks good'

    [24.5 68029720 1086084 1.055582e-05 1.01206  1.036552;
     24   68665990 1088730 1.055617e-05 1.021491 1.039043;
     23.5 68587240 1086425 1.055993e-05 1.019957 1.036474;
     23   68487600 1084434 1.056495e-05 1.017992 1.034084]

(I took the liberty of formatting the output a bit to make it easier to read.)
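The splitting step in the code above hard-codes which header lines carry a unit (`ind1 <= 2`). As a possible alternative, and not part of the original answer, a single regular expression with named tokens could handle both kinds of line. This is only a sketch: the pattern `pat` and the loop variable `k` are illustrative names, and it assumes that an optional named token that does not match is returned as an empty field, so it is worth checking against your MATLAB version.

```matlab
% Header lines, as textscan would return them in the answer above.
txt = {'lens (mm): 150'; 'Power (uW): 24.4'; 'Inner circle: 56x56'; ...
       'Outer Square: 256x320'; 'remarks: this run looks good'};

% desc  : text before the optional "(unit)" and the colon
% unit  : optional text inside parentheses
% quant : everything after the colon, surrounding whitespace removed
pat = '^(?<desc>[^(:]+?)\s*(?:\((?<unit>[^)]*)\))?\s*:\s*(?<quant>.*?)\s*$';

for k = 1:numel(txt)
  s = regexp(txt{k}, pat, 'names', 'once');   % scalar struct with three fields
  fprintf('%-12s | %-2s | %s\n', s.desc, s.unit, s.quant);
end
```

With this approach the unit field is simply empty for lines such as 'Inner circle: 56x56', so no separate branch is needed.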

See also: str2double
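The values in quant are still character vectors, so one more step is needed to get actual numbers. A minimal sketch of how str2double might be applied is shown below; the variable names `vals`, `innerSize` and `outerSize` are illustrative and do not appear in the original answer.

```matlab
% Values as produced by the answer's code (all are character vectors).
quant = {'150'; '24.4'; '56x56'; '256x320'; 'this run looks good'};

% str2double converts each cell; entries that are not a single number become NaN.
vals = str2double(quant);

% Dimension strings such as '56x56' must be split on 'x' before conversion.
innerSize = str2double(strsplit(quant{3}, 'x'));   % 1x2 row vector
outerSize = str2double(strsplit(quant{4}, 'x'));   % 1x2 row vector

disp(vals.');
disp(innerSize);
disp(outerSize);
```

Because str2double returns NaN rather than raising an error, isnan(vals) can be used to tell which header lines held a plain number and which need further parsing.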