Question

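I am working with a large number of rich text data files (.rtf). The data in each file consists of two columns of numbers laid out like a table. The numbers are either very large or very small, so they need to be stored with very high precision.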

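How can I assign the data in the first column to "A" and the data in the second column to "B"? (Would these be vectors?) My current problem is that the rich text formatting does not cooperate with importing into MatLab, and converting the .rtf file to .txt (and then importing) merges the two columns into a single column of alternating values.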

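Once I have "A", I need to be able to compare a single specified value against the first column of data, find the closest value, and then produce the corresponding value from the second column.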

So say I have this data sample in my file:

1.0E-5      78.29777
1.0625E-5   75.9674
1.125E-5    73.83424
1.1875E-5   71.87197
1.25E-5     70.05895
1.375E-5    66.8116
1.5E-5      63.9797
1.625E-5    61.48167

and my single specified value is 1.123E-5. That value is closest to 1.125E-5, so the desired output would be 73.83424.

How can I do this? I don't know where to start, because I am not familiar with MatLab's data import syntax.

Thanks in advance for your help!!

Answer 1

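You can use low level IO together with regular expressions to read the *.rtf file and get at the data without any conversion. Using your sample data and a *.rtf file, I threw together a clunky parser that gives you your data. If you open the *.rtf file in a text editor, you will notice (at least in my editor) that it has 2 header lines: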

{\rtf1\ansi\ansicpg1252\deff0\nouicompat\deflang1033{\fonttbl{\f0\fnil\fcharset0 Calibri;}}
{\*\generator Riched20 6.3.9600}\viewkind4\uc1

followed by your data with more header information mixed in (possibly just a WordPad quirk):

\pard\sa200\sl276\slmult1\f0\fs22\lang9 1.0E-5      78.29777\par

So we skip the first two lines, handle the third line differently, and then process the remaining lines:

fID = fopen('test.rtf', 'r'); % Open our data file

nheaders = 2; % Number of full header lines
npartialheaders = 1; % Number of header lines with your data mixed in

ii = 1;
mydata = [];
while ~feof(fID) % Loop until we reach the end of the file
    if ii <= nheaders
        % Do nothing
        tline = fgetl(fID); % Read in a line of data, discard it
        ii = ii + 1;
    else
        tline = fgetl(fID); % Read in a line of data
        out = regexp(tline, '([\s\d.E-])', 'match');

        if ~isempty(out) % Our regex found some data
            % The regexp returns every character in a cell, concatenate them
            % and split them along the spaces
            data_str = strsplit([out{:}], ' ');

            if ii > nheaders && ii <= (nheaders + npartialheaders)
                % Header is mixed with your data
                % We should only want the second and third matches
                data_num = str2double(data_str(2:3));
                mydata = [mydata; data_num];
            else
                % Just your data on these lines
                data_num = str2double(data_str(1:2));
                mydata = [mydata; data_num];
            end
        end

        ii = ii + 1;
    end
end

fclose(fID);

Which returns:

mydata =

    1.00000000000000e-05    78.2977700000000
    1.06250000000000e-05    75.9674000000000
    1.12500000000000e-05    73.8342400000000
    1.18750000000000e-05    71.8719700000000
    1.25000000000000e-05    70.0589500000000
    1.37500000000000e-05    66.8116000000000
    1.50000000000000e-05    63.9797000000000
    1.62500000000000e-05    61.4816700000000

Admittedly, this is ugly, inefficient code. I'm sure plenty of changes could make it more robust and efficient, but it should help you get started.

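Now that you have your data, I think you can figure out the second part. If you haven't already, take a look at MATLAB's matrix indexing documentation. As an implementation hint, look at the outputs of min and think about what subtracting a constant from a vector can do.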

% What is this doing? It's a mystery!
[~, matchidx] = min(abs(mydata(:,1) - querypoint));
disp(mydata(matchidx, 2))
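
To make the hint concrete, here is a minimal sketch that applies it to the sample values from the question. It assumes mydata is the two-column matrix built by the parser above (typed in by hand here so the snippet runs on its own); dist, closest_A, and closest_B are illustrative names, not part of the original answer.

```matlab
% Sample data from the question (column 1 = A, column 2 = B)
mydata = [1.0E-5     78.29777;
          1.0625E-5  75.9674;
          1.125E-5   73.83424;
          1.1875E-5  71.87197;
          1.25E-5    70.05895;
          1.375E-5   66.8116;
          1.5E-5     63.9797;
          1.625E-5   61.48167];

querypoint = 1.123E-5;  % the single specified value from the question

% Absolute distance from every entry in column 1 to the query value
dist = abs(mydata(:,1) - querypoint);

% The second output of min is the row index of the smallest distance
[~, matchidx] = min(dist);

closest_A = mydata(matchidx, 1);  % nearest value in the first column
closest_B = mydata(matchidx, 2);  % corresponding value in the second column
disp(closest_B)
```

Subtracting the scalar from the column shifts every entry by the same amount, so the entry closest to the query ends up with the smallest absolute value, and min reports where it sits.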

Answer 2

Here is what I would do: copy the contents into Excel or a Google spreadsheet and save it as a .csv. From there it is easy:

T = readtable('path/to/my/data.csv');

Now T contains your numbers as double-precision floats in a table data type.

A = T{:,1}; % column 1

B = T{:,2}; % column 2
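
Once A and B exist as column vectors, the lookup described in the question can also be done with interp1 and its 'nearest' method. The following is only a sketch: the values are typed in by hand as stand-ins for T{:,1} and T{:,2}, the names query and nearest_B are hypothetical, and it assumes the entries of A are distinct.

```matlab
% Stand-in values for the two columns; with the CSV approach these would
% come from A = T{:,1} and B = T{:,2} instead.
A = [1.0E-5; 1.0625E-5; 1.125E-5; 1.1875E-5; 1.25E-5; 1.375E-5; 1.5E-5; 1.625E-5];
B = [78.29777; 75.9674; 73.83424; 71.87197; 70.05895; 66.8116; 63.9797; 61.48167];

% Both columns are N-by-1 double-precision vectors
is_col_vectors = iscolumn(A) && iscolumn(B) && isa(A, 'double');

query = 1.123E-5;  % hypothetical lookup value, taken from the question

% 'nearest' returns the B value whose A entry lies closest to the query.
% Assumption: the entries of A are distinct.
nearest_B = interp1(A, B, query, 'nearest');
disp(nearest_B)
```

One difference worth knowing: for a query outside the range of A, interp1 with 'nearest' returns NaN by default, whereas a min(abs(A - query)) lookup always returns the closest row.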

Good luck!

Importing .rtf data into MatLab
