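I have a messy log file from which I want to extract some useful information. By messy, I mean the file can contain arbitrary lines of characters and numbers. However, the numbers I need to extract are always preceded by the same string: new beta value =
For example, if my input log file is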
3456789FGHJKLcvbnm,.ghjkl
Error!
Warning. GHJKL:6&*()_
new beta value = 1557.01
$%^&*()VGBNM<
GBHNM<
Warning!!!
This is a random line
new beta value = 1101.6
TL:vbNM<>%^UIOP
FGHJKL]\[;/
new beta value = 100
...
I would like to read
1557.01
1101.6
100
...
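into MATLAB.
MATLAB does not seem to have a built-in function for this. How can I achieve it?
Answer 0: (score: 4)
As @excaza suggested, there are several ways to do this. I find it easier and faster to read the whole file at once and use regexp.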
indata = fileread('test.txt');
pattern = 'new beta value =\s+(\d+(?:\.\d+)?)'; %//the literal string "new beta value =", then whitespace, then the number: an integer part followed by an optional decimal point and fractional part
lines = regexp(indata, pattern, 'tokens'); %//output as cell array
result = cell2mat(cellfun(@(x) str2double(x{:}), lines, 'UniformOutput', false)); %//output as Matrix
result =
1557.01 1101.6 100
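The conversion step can also be written without cellfun. The following is a minimal sketch, not the answerer's code: it assumes the log text is already in memory (the sample string built with sprintf is a hypothetical stand-in for the output of fileread) and flattens the nested token cells before passing them to str2double.

```matlab
% Sample text standing in for fileread('test.txt') (assumed content)
indata = sprintf('Error!\nnew beta value = 1557.01\nGBHNM<\nnew beta value = 1101.6\nnew beta value = 100\n');

% Literal prefix, whitespace, then digits with an optional fractional part
pattern = 'new beta value =\s+(\d+(?:\.\d+)?)';

% One token per match; each token is itself a 1x1 cell holding a char array
tokens = regexp(indata, pattern, 'tokens');

% Flatten to a 1xN cell of char arrays, then convert to a numeric row vector
result = str2double([tokens{:}]);
```

str2double accepts a cell array of character vectors and returns a numeric array of the same size, so no explicit loop or cell2mat call is needed here.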
Answer 1: (score: 4)
Use fgetl to read the file line by line:
queryline = 'new beta value';
fID = fopen('test.txt');
mydata = []; % Initialize data
while ~feof(fID) % Loop until we get to the end of the file
    tline = fgetl(fID);
    if ~isempty(strfind(tline, queryline))
        % If we find a match for our query string in the line of the file
        formatspec = sprintf('%s = %%f', queryline); % semicolon suppresses printing on every match
        mydata = [mydata sscanf(tline, formatspec)];
    end
end
fclose(fID);
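One caveat with the loop above: sscanf matches the format from the start of the line, so a line with text in front of the query string yields no number. The following is a minimal sketch of one way to handle that, under the assumption that such lines can occur. The temporary file and its contents are hypothetical and only make the example self-contained.

```matlab
% Write a small hypothetical log to a temporary file
fname = [tempname() '.txt'];
fID = fopen(fname, 'w');
fprintf(fID, 'Error!\nnew beta value = 1557.01\n>> new beta value = 100\n');
fclose(fID);

queryline = 'new beta value';
formatspec = [queryline ' = %f'];   % built once, outside the loop
mydata = [];                        % collected values

fID = fopen(fname, 'r');
while true
    tline = fgetl(fID);
    if ~ischar(tline), break; end   % fgetl returns -1 at end of file
    idx = strfind(tline, queryline);
    if ~isempty(idx)
        % Parse from the start of the match so leading text is ignored
        mydata = [mydata sscanf(tline(idx(1):end), formatspec)];
    end
end
fclose(fID);
delete(fname);                      % remove the temporary file
```

Testing the return value of fgetl with ischar also covers the empty-file case, where the first call already returns -1.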
Answer 2: (score: 3)
Here is another implementation:
fid = fopen('file.txt', 'r');
str = reshape(fread(fid,inf,'*char'),1,[]);
fclose(fid);
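numbers = str2double(regexp(str, '(?<=new beta value =\s+)\d+(\.\d*)?','match')).';
How it works:
str2double is applied to the regexp matches to convert them to a numeric vector.
Assumed format: \d+(\.\d*)? detects numbers of the form 100.34 or 100. It does not detect -100, -100.34, .34, or -.34. If you also need those cases, modify the regular expression accordingly. The \s+ in the pattern requires one or more whitespace characters between = and the number.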
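As an illustration of such a modification, here is a minimal sketch that also accepts an optional sign and a bare fractional part. It is an assumption about what "those cases" would need, not part of the original answer. It uses a capture token rather than the lookbehind above, and the sample string is hypothetical.

```matlab
% Hypothetical sample covering the cases the pattern above misses
str = sprintf('new beta value = 100\nnoise\nnew beta value = -100.34\nnew beta value = .34\nnew beta value = -.34\n');

% Optional sign, then either digits with an optional fraction or a bare fraction
pattern = 'new beta value =\s+([-+]?(?:\d+\.?\d*|\.\d+))';

% Capture the number as a token instead of using a lookbehind
tokens = regexp(str, pattern, 'tokens');

% Flatten the nested cells and convert to a numeric column vector
numbers = str2double([tokens{:}]).';
```

This still does not cover exponent notation such as 1e-3; the pattern would need a further optional group for that.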