How do I read specific information from a .txt file in MATLAB?

Asked: 2016-09-28 12:58:19

Tags: matlab parsing caffe matcaffe

I have a very long text file:

I0927 11:33:18.534551 16932 solver.cpp:244]     Train net output #0: loss = 2.61145 (* 1 = 2.61145 loss)
I0927 11:33:18.534620 16932 sgd_solver.cpp:106] Iteration 20, lr = 0.001
I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027
I0927 11:33:33.221771 16932 solver.cpp:244]     Train net output #0: loss = 0.573027 (* 1 = 0.573027 loss)
I0927 11:33:33.221851 16932 sgd_solver.cpp:106] Iteration 40, lr = 0.001
I0927 11:33:47.883162 16932 solver.cpp:228] Iteration 60, loss = 0.852016
I0927 11:33:47.884717 16932 solver.cpp:244]     Train net output #0: loss = 0.852016 (* 1 = 0.852016 loss)
I0927 11:33:47.884812 16932 sgd_solver.cpp:106] Iteration 60, lr = 0.001
I0927 11:34:02.543320 16932 solver.cpp:228] Iteration 80, loss = 0.385975
I0927 11:34:02.543442 16932 solver.cpp:244]     Train net output #0: loss = 0.385975 (* 1 = 0.385975 loss)
I0927 11:34:02.543514 16932 sgd_solver.cpp:106] Iteration 80, lr = 0.001
I0927 11:34:17.297544 16932 solver.cpp:228] Iteration 100, loss = 0.526758
I0927 11:34:17.297659 16932 solver.cpp:244]     Train net output #0: loss = 0.526758 (* 1 = 0.526758 loss)
I0927 11:34:17.297722 16932 sgd_solver.cpp:106] Iteration 100, lr = 0.001
I0927 11:34:31.962934 16932 solver.cpp:228] Iteration 120, loss = 0.792767

I want to extract the following information:

[ Iteration, Train net output, lr ]

and put it in a cell array in MATLAB.

Could you guide me on how to do this?

2 answers:

Answer 0 (score: 1)

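I removed the first two lines and the last line of the log to make it consistent, so that every Iteration line is followed by a Train net output line and an sgd_solver ... lr = line, as shown below: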

I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027
I0927 11:33:33.221771 16932 solver.cpp:244]     Train net output #0: loss = 0.573027 (* 1 = 0.573027 loss)
I0927 11:33:33.221851 16932 sgd_solver.cpp:106] Iteration 40, lr = 0.001
I0927 11:33:47.883162 16932 solver.cpp:228] Iteration 60, loss = 0.852016
I0927 11:33:47.884717 16932 solver.cpp:244]     Train net output #0: loss = 0.852016 (* 1 = 0.852016 loss)
I0927 11:33:47.884812 16932 sgd_solver.cpp:106] Iteration 60, lr = 0.001
I0927 11:34:02.543320 16932 solver.cpp:228] Iteration 80, loss = 0.385975
I0927 11:34:02.543442 16932 solver.cpp:244]     Train net output #0: loss = 0.385975 (* 1 = 0.385975 loss)
I0927 11:34:02.543514 16932 sgd_solver.cpp:106] Iteration 80, lr = 0.001
I0927 11:34:17.297544 16932 solver.cpp:228] Iteration 100, loss = 0.526758
I0927 11:34:17.297659 16932 solver.cpp:244]     Train net output #0: loss = 0.526758 (* 1 = 0.526758 loss)
I0927 11:34:17.297722 16932 sgd_solver.cpp:106] Iteration 100, lr = 0.001
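
The trimming described above could also be done in code rather than by hand. The following is only a minimal sketch: the variable names (rawLines, trimmed, trimmedTxt) are made up for illustration, the sample is a shortened copy of the log from the question, and it assumes that exactly two leading lines and one trailing line need to go, as in this particular log.

```matlab
% Shortened sample of the log from the question (hypothetical stand-in
% for the real file contents).
rawLines = {
    'I0927 11:33:18.534551 16932 solver.cpp:244]     Train net output #0: loss = 2.61145 (* 1 = 2.61145 loss)'
    'I0927 11:33:18.534620 16932 sgd_solver.cpp:106] Iteration 20, lr = 0.001'
    'I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027'
    'I0927 11:33:33.221771 16932 solver.cpp:244]     Train net output #0: loss = 0.573027 (* 1 = 0.573027 loss)'
    'I0927 11:33:33.221851 16932 sgd_solver.cpp:106] Iteration 40, lr = 0.001'
    'I0927 11:33:47.883162 16932 solver.cpp:228] Iteration 60, loss = 0.852016'
    };
nl  = sprintf('\n');
txt = strjoin(rawLines', nl);   % stands in for txt = fileread('log.txt')

% Split into lines and discard any empty ones (e.g. from a trailing newline).
lines = regexp(txt, '\r?\n', 'split');
lines = lines(~cellfun(@isempty, lines));

% Assumption: the first two lines and the last line are the incomplete ones.
trimmed    = lines(3:end-1);
trimmedTxt = strjoin(trimmed, nl);
```

Here trimmedTxt can be passed to regexp in place of the text returned by fileread. A log that starts or ends at a different point in the three-line cycle would need different indices.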

You can read this file as text with fileread and then run regexp on it with the following code:

txt = fileread('log.txt');
it = regexp(txt,'I0927.*solver.cpp:228]\sIteration\s(.*),.*','tokens','dotexceptnewline')

it =

  1×4 cell array

    {1×1 cell}    {1×1 cell}    {1×1 cell}    {1×1 cell}

net_out = regexp(txt,'I0927.*solver.cpp:244]\s*Train\snet\soutput.*loss\s=\s(\S*).*','tokens','dotexceptnewline');
lr = regexp(txt,'I0927.*sgd_solver.cpp:106]\sIteration.*lr\s=\s(\S*)','tokens','dotexceptnewline');

The outputs need some conditioning before they can be converted to numbers:

% Get outputs out of their cells
it = [it{:}]'; 
net_out = [net_out{:}]';
lr = [lr{:}]';

sim_out = str2double([it net_out lr]);
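
Since the question asks for a cell array, the numeric matrix can be converted with num2cell as a last step. The following self-contained sketch runs the whole pipeline on two iterations copied from the trimmed log. Note that the patterns here are a tightened variant and not the ones used above: they drop the I0927 date prefix and escape the literal dot and bracket, which is an assumption about what is safe for this log format. The name result is also just illustrative.

```matlab
% Two complete iterations from the trimmed log; in practice txt would
% come from fileread('log.txt').
txt = strjoin({ ...
    'I0927 11:33:33.221546 16932 solver.cpp:228] Iteration 40, loss = 0.573027', ...
    'I0927 11:33:33.221771 16932 solver.cpp:244]     Train net output #0: loss = 0.573027 (* 1 = 0.573027 loss)', ...
    'I0927 11:33:33.221851 16932 sgd_solver.cpp:106] Iteration 40, lr = 0.001', ...
    'I0927 11:33:47.883162 16932 solver.cpp:228] Iteration 60, loss = 0.852016', ...
    'I0927 11:33:47.884717 16932 solver.cpp:244]     Train net output #0: loss = 0.852016 (* 1 = 0.852016 loss)', ...
    'I0927 11:33:47.884812 16932 sgd_solver.cpp:106] Iteration 60, lr = 0.001'}, ...
    sprintf('\n'));

% Assumed variant patterns: one capture group per value of interest.
it      = regexp(txt, 'solver\.cpp:228\]\sIteration\s(\d+),', 'tokens');
net_out = regexp(txt, 'Train\snet\soutput\s#0:\sloss\s=\s(\S+)', 'tokens');
lr      = regexp(txt, 'Iteration\s\d+,\slr\s=\s(\S+)', 'tokens');

% Flatten the nested token cells into column cell arrays of char vectors.
it      = [it{:}]';
net_out = [net_out{:}]';
lr      = [lr{:}]';

% Numeric matrix with one row per iteration: [Iteration, Train net output, lr]
sim_out = str2double([it, net_out, lr]);

% Same data as a cell array, one value per cell.
result = num2cell(sim_out);
```

Keeping sim_out as a plain matrix is usually more convenient for plotting; the cell form is only needed if the values will later be mixed with non-numeric data.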

Answer 1 (score: 0)

As Some Guy suggested, you can use regexp:

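fid = fopen('log.txt','r');
output = {};
tline = fgetl(fid);   % "tline" avoids shadowing MATLAB's built-in line()
while ischar(tline)
    l1 = regexp(tline, 'Iteration\s+(\d+),\s+loss\s+=\s+', 'tokens', 'once');
    if ~isempty(l1)
        % We got the first line of an iteration; the next two lines should
        % hold the train net output and the learning rate.
        l2 = {};
        l3 = {};
        tline = fgetl(fid);
        if ischar(tline)
            l2 = regexp(tline, 'Train net output #0: loss = (\S+)', 'tokens', 'once');
            tline = fgetl(fid);
        end
        if ischar(tline)
            l3 = regexp(tline, 'Iteration \d+, lr = (\S+)', 'tokens', 'once');
        end
        % Skip incomplete records, e.g. a log that ends mid-iteration
        if ~isempty(l2) && ~isempty(l3)
            output{end+1} = [str2double(l1{1}), str2double(l2{1}), str2double(l3{1})];
        end
    end
    if ischar(tline)
        tline = fgetl(fid);
    end
end
fclose(fid);
output = vertcat(output{:});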

By the way, are you aware of Caffe's $CAFFE_ROOT/tools/extra/parse_log.py utility?