Question

I have a problem reading a txt file in MATLAB: the format is not known in advance, but every line in the txt file always starts like this:

2012-11-01 00:00:00.00 XX YY  00.000s

Different quantities are then recorded after that, so the txt file can look different, for example:

Ex1:    2012-11-01 00:00:00.00 XX YY  00.000s  000.00deg  0.00rpm  0.00rpm
Ex2:    2012-11-01 00:00:00.00 XX YY  00.000s  000.00deg  0.00rpm   
Ex3:    2012-11-01 00:00:00.00 XX YY  00.000s  0.00deg 0.00rpm 0.00rpm 0.0deg      
Ex4:    2012-11-01 00:00:00.00 XX YY  00.000s  0.00rpm

I handle this with textscan, using:

fid = fopen('text.txt');
initfrm = {'%s%s%s%s %.3f %s'};
frm = repmat('%.2f %s',1,NCol);
frm = strcat(initfrm, frm);
Tmp = textscan(fid,frm{1});
fclose(fid);

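The number of recorded columns (NCol) is computed elsewhere in the file; that code is not shown here.

But sometimes the text file contains 0.0%, for example: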

Ex1:    2012-11-01 00:00:00.00 XX YY  00.000s 000.00deg   0.00rpm  0.00rpm  0.0%

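Now '%.2f' no longer works, and I don't know in advance what will be logged. Is there a better way to separate floats and strings that are printed together? I only want to collect the data (the floats) so that I can plot it.

How can I get all the float values when the format varies between %.2f and %.1f and the pattern is not known in advance?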

Answer 1

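Importing text like this can be a real pain; usually, it's a good test of your string-manipulation knowledge :)

I believe the following commands will do the job nicely: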

% Read in entire file as string
fid = fopen('yourFile.txt');
    C = textscan(fid, '%s', 'delimiter', '');
fclose(fid);
C = C{1};

% Remove first part (from column 39 onwards in your example; 
% adjust to match your actual data)
C = cellfun(@(x)x(39:end), C, 'UniformOutput',false);

% Remove unwanted junk
% NOTE: this removes all occurrences of 'rpm', 'deg', 
% 's', and the trailing '0.0%'
C = regexprep(C, {'deg' 'rpm' 's' '([0-9]+\.[0-9]+%)$'}, '');

% Tokenize string and convert to double
C = cellfun(@(x)textscan(x, '%f'), C);
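
To see what each step does, the same operations can be applied to a single line. This is only a sketch: the sample line uses made-up non-zero values (the variable names `str`, `tail` and `vals` are mine), and it uses sscanf instead of textscan for the final conversion.

```matlab
% Hypothetical sample line with the same layout as the test file
str = 'Ex1:    2012-11-01 00:00:05.00 XX YY  05.000s 123.40deg   10.50rpm  9.75rpm  0.0%';

% Drop the fixed-width prefix; the numeric fields start at column 39 here
tail = str(39:end);

% Strip the unit suffixes and the trailing percentage value
tail = regexprep(tail, {'deg' 'rpm' 's' '([0-9]+\.[0-9]+%)$'}, '');

% Parse whatever numbers remain into a column vector of doubles
vals = sscanf(tail, '%f');
```

Note that the trailing percentage is discarded here, just as in the commands above; drop the last pattern and add '%' to the list of strings to remove if you want to keep that value.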

I tested this with a yourFile.txt containing:

Ex1:    2012-11-01 00:00:00.00 XX YY  00.000s  000.00deg  0.00rpm  0.00rpm
Ex2:    2012-11-01 00:00:00.00 XX YY  00.000s  000.00deg  0.00rpm   
Ex3:    2012-11-01 00:00:00.00 XX YY  00.000s  0.00deg    0.00rpm  0.00rpm 0.0deg      
Ex3:    2012-11-01 00:00:00.00 XX YY  00.000s  0.00deg    0.00rpm  0.00rpm 0.0deg    0.0%
Ex4:    2012-11-01 00:00:00.00 XX YY  00.000s  0.00rpm
Ex4:    2012-11-01 00:00:00.00 XX YY  00.000s  0.00rpm

After running the commands above, C is a cell array holding one column vector of doubles per line of the file.

Answer 2

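I'm not sure I've interpreted your question correctly. It seems to me that each line of text contains a variable number of tokens, N or N + 1 (or possibly N + m?).

If so, I would suggest an approach based on extracting the tokens from each line.

Consider this: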

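You use fgets to read each line from the file;
You use strtok to split off tokens iteratively (i.e. you tokenize your string, using ' ' as the token delimiter);
Because you have a fixed initial pattern, you may want to re-merge the first N tokens and parse them as you already do. Then you can check whether a token exists at position N + 1 and, if so, parse it.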
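
The steps above could look roughly like the sketch below. It is not a tested solution for your real logs: the temporary file, the sample values and the name `nFixed` are assumptions for illustration, and it assumes each line starts with exactly four non-numeric tokens (date, time, XX, YY). sscanf reads the leading number of each remaining token and ignores the unit suffix.

```matlab
% Write a small hypothetical sample file so the sketch is self-contained
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', '2012-11-01 00:00:05.00 XX YY  05.000s  123.40deg  10.50rpm  0.0%');
fprintf(fid, '%s\n', '2012-11-01 00:00:06.00 XX YY  06.000s  9.75rpm');
fclose(fid);

nFixed = 4;      % assumed number of leading tokens to skip: date, time, XX, YY
data = {};       % one numeric row vector per line

fid = fopen(fname);
tline = fgets(fid);
while ischar(tline)                        % fgets returns -1 at end of file
    vals = [];
    k = 0;
    [tok, rest] = strtok(tline);           % default delimiter is whitespace
    while ~isempty(tok)
        k = k + 1;
        if k > nFixed
            v = sscanf(tok, '%f', 1);      % leading number; unit suffix is ignored
            if ~isempty(v)
                vals(end+1) = v;
            end
        end
        [tok, rest] = strtok(rest);
    end
    data{end+1} = vals;
    tline = fgets(fid);
end
fclose(fid);
delete(fname);
```

Because every token is parsed on its own, it does not matter whether a value was written with one, two or three decimals, or which unit follows it. Lines with different numbers of values simply give vectors of different lengths in `data`.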
