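Extracting the strings before and after a keyword in MATLAB

Time: 2015-06-29 11:11:29

Tags: regex matlab

I have a .sca file from OMNeT++ (it can be read as a plain text file). I need to extract some numbers that follow certain parameters. For example: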

scalar SendIntBitRate.host1.udpApp[0]   "packets sent"  1041
scalar SendIntBitRate.host1.udpApp[0]   "packets received"  0
scalar SendIntBitRate.host1.udpApp[0]   sentPk:count    1041
attr interpolationmode  none
attr unit  bps
scalar SendIntBitRate.host2.udpApp[0]   rcvdPk:count    93
attr interpolationmode  none
attr source  rcvdPk

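Now I need to read the keyword and number that come before and after sentPk:count and rcvdPk:count (here, for example, SendIntBitRate.host1.udpApp[0] and 1041) and write them to a CSV file. Note that several lines contain the sentPk:count and rcvdPk:count keywords. I have stored the whole file in the cell array C: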

fid = fopen('1Mbps1000us1250B\BR1MBPS1MS-0.sca','r');
C = textscan(fid, '%s','Delimiter','');
fclose(fid);
C = C{:};

It contains lines such as

scalar SendIntBitRate.host1.udpApp[0] sentPk:count 1041

But within these lines, how do I extract the text before and after the keyword?

2 answers:

Answer 0 (score: 1)

If you already have each line in a string, say

str = 'scalar SendIntBitRate.host1.udpApp[0]   sentPk:count    1041';

then a regular expression with lookaround easily extracts the whitespace-delimited parts before and after 'sentPk:count':

result_before = regexp(str, '\S+(?=\s+sentPk:count)', 'match');
result_before = result_before{1};
result_after = regexp(str, '(?<=sentPk:count\s+)\S+', 'match');
result_after = result_after{1};

For the example, this produces the strings

result_before =
SendIntBitRate.host1.udpApp[0]

result_after =
1041
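The same lookaround patterns can be applied to the whole cell array C and to both keywords, since regexp accepts a cell array of strings. The following is only a sketch under stated assumptions: the sample lines stand in for the real file contents, and the output file name counts.csv and the three-column layout (module, keyword, value) are hypothetical choices, not something the question specifies.

```matlab
% Hypothetical sample lines standing in for the cell array C read from the .sca file
C = {'scalar SendIntBitRate.host1.udpApp[0]   "packets sent"  1041'
     'scalar SendIntBitRate.host1.udpApp[0]   sentPk:count    1041'
     'attr interpolationmode  none'
     'scalar SendIntBitRate.host2.udpApp[0]   rcvdPk:count    93'};

keywords = {'sentPk:count', 'rcvdPk:count'};
rows = cell(0, 3);                       % collected {module, keyword, value} rows

for k = 1:numel(keywords)
    kw = keywords{k};
    % Same lookaround patterns as above, built for the current keyword
    before = regexp(C, ['\S+(?=\s+' kw ')'], 'match', 'once');
    after  = regexp(C, ['(?<=' kw '\s+)\S+'], 'match', 'once');
    hit = find(~cellfun('isempty', before));   % lines that contain the keyword
    for i = hit(:)'
        rows(end+1, :) = {before{i}, kw, after{i}}; %#ok<AGROW>
    end
end

% Write one comma-separated line per match (assumed file name)
fid = fopen('counts.csv', 'w');
for r = 1:size(rows, 1)
    fprintf(fid, '%s,%s,%s\n', rows{r, :});
end
fclose(fid);
```

Rows are grouped by keyword rather than kept in file order; if the original order matters, the line index i could be stored alongside each row and used for sorting before writing.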

Answer 1 (score: 1)

I would split on whitespace and do a conventional search:

inp = 'scalar SendIntBitRate.host1.udpApp[0]   sentPk:count    1041';
s=regexp(inp, '\s+','split');

idx = find(strcmp(s, 'sentPk:count'));
if length(idx) == 1
   before = s{idx - 1};
   after = s{idx + 1};
end

A regexp one-liner is also possible, but I don't think it is any clearer:

a=regexp(inp, '(\S+)\s+sentPk:count\s+(\d+)', 'tokens')
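With 'tokens', regexp returns a nested cell array, so the captured pieces still have to be unpacked. As a minimal sketch, assuming it is acceptable to capture the keyword prefix as well so that one pattern covers both sentPk:count and rcvdPk:count, the 'once' option gives a flat cell array that is easier to index. The variable names module, kind and value are illustrative only.

```matlab
inp = 'scalar SendIntBitRate.host2.udpApp[0]   rcvdPk:count    93';

% Three tokens: text before the keyword, keyword prefix, number after it
tok = regexp(inp, '(\S+)\s+(sentPk|rcvdPk):count\s+(\d+)', 'tokens', 'once');

if ~isempty(tok)
    module = tok{1};               % 'SendIntBitRate.host2.udpApp[0]'
    kind   = tok{2};               % 'rcvdPk'
    value  = str2double(tok{3});   % numeric 93
end
```

Lines without either keyword yield an empty tok, so the isempty check doubles as the filter when this is run inside a loop over all lines.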