Question

I have a .sca file from OMNeT++ (it can be read as a text file). I need to extract some numbers that follow a certain parameter. For example:

scalar SendIntBitRate.host1.udpApp[0]   "packets sent"  1041
scalar SendIntBitRate.host1.udpApp[0]   "packets received"  0
scalar SendIntBitRate.host1.udpApp[0]   sentPk:count    1041
attr interpolationmode  none
attr unit  bps
scalar SendIntBitRate.host2.udpApp[0]   rcvdPk:count    93
attr interpolationmode  none
attr source  rcvdPk

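Now I need to read the keyword/number before and after sentPk:count and rcvdPk:count (here, for example, SendIntBitRate.host1.udpApp[0] and 1041) and write them to a CSV file. Note that multiple lines contain the sentPk:count and rcvdPk:count keywords. I have stored the whole file in the cell array C: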

fid = fopen('1Mbps1000us1250B\BR1MBPS1MS-0.sca','r')
C = textscan(fid, '%s','Delimiter','');
fclose(fid)
C = C{:};

which contains lines such as

scalar SendIntBitRate.host1.udpApp[0] sentPk:count 1041

But now, within these lines, how do I extract the text before and after the keyword?

Answer 1

If you already have each line in a string, say

str = 'scalar SendIntBitRate.host1.udpApp[0]   sentPk:count    1041';

it is easy to extract the parts before and after 'sentPk:count' (delimited by whitespace) using regular expressions with lookaround:

result_before = regexp(str, '\S+(?=\s+sentPk:count)', 'match');
result_before = result_before{1};
result_after = regexp(str, '(?<=sentPk:count\s+)\S+', 'match');
result_after = result_after{1};

In the example, this yields the strings

result_before =
SendIntBitRate.host1.udpApp[0]

result_after =
1041
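
To cover the rest of the question (many lines, two keywords, CSV output), the same idea can be applied to the whole cell array at once. The following is only a minimal sketch: the sample lines stand in for the cell array C from the question, it swaps the lookarounds for capture groups with an alternation, and the output file name results.csv is a hypothetical choice.

```matlab
% Sample lines standing in for the cell array C read from the .sca file
C = {
    'scalar SendIntBitRate.host1.udpApp[0]   "packets sent"  1041'
    'scalar SendIntBitRate.host1.udpApp[0]   sentPk:count    1041'
    'attr interpolationmode  none'
    'scalar SendIntBitRate.host2.udpApp[0]   rcvdPk:count    93'
    };

% Capture the module name, the keyword, and the value on matching lines
pattern = '(\S+)\s+(sentPk:count|rcvdPk:count)\s+(\S+)';
tok = regexp(C, pattern, 'tokens', 'once');   % one cell per line, empty if no match
tok = tok(~cellfun(@isempty, tok));           % keep only the lines that matched

% Write one CSV row per match (results.csv is an assumed output name)
fid = fopen('results.csv', 'w');
for k = 1:numel(tok)
    fprintf(fid, '%s,%s,%s\n', tok{k}{1}, tok{k}{2}, tok{k}{3});
end
fclose(fid);
```

Lines without either keyword, such as the attr lines or the quoted "packets sent" scalar, produce an empty token cell and are dropped before writing.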

Answer 2

I would split on whitespace and do a conventional search:

inp = 'scalar SendIntBitRate.host1.udpApp[0]   sentPk:count    1041';
s=regexp(inp, '\s+','split');

idx = find(strcmp(s, 'sentPk:count'));
if length(idx) == 1
   before = s{idx - 1};
   after = s{idx + 1};
end

A regexp one-liner is also possible, but I don't think it is any clearer:

a=regexp(inp, '(\S+)\s+sentPk:count\s+(\d+)', 'tokens')
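
With 'tokens' and no 'once' option, regexp returns a nested cell array: one outer cell per match, each holding a cell of captured strings. A short sketch of unpacking it follows; the variable names before and after are reused from the split example, and the conversion with str2double is an optional extra step not in the original answer.

```matlab
inp = 'scalar SendIntBitRate.host1.udpApp[0]   sentPk:count    1041';
a = regexp(inp, '(\S+)\s+sentPk:count\s+(\d+)', 'tokens');

if ~isempty(a)
    before = a{1}{1};              % text preceding the keyword
    after  = str2double(a{1}{2});  % digits following the keyword, as a number
end
```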

Extracting the strings before and after a keyword in MATLAB
