How to add/remove numbers or number ranges in a file and regroup the ranges
For example, given the file
$ cat test.in
cn[01-10]
cn01
cn[01,02,07-09]
cn[01-02]
The requirement is to remove cn01 and cn05.
Expected output:
$ cat test.in
cn[02-04,06-10]
cn[02,07-09]
cn[02]
Answer 0 (score: 0)
Here is how to expand lists and ranges of values into individual values:
$ cat tst.awk
function expand(exprStr,valsArr,    i,terms,term,range,val,numVals) {
    gsub(/cn|[][]/,"",exprStr)
    delete valsArr
    # exprStr = 01,02,07-09
    split(exprStr,terms,/,/)
    for (i=1; i in terms; i++) {
        # terms[1]=01, [2]=02, [3]=07-09
        term = terms[i]
        split(term,range,/-/)
        range[2] = (2 in range ? range[2] : range[1])
        for (val=range[1]; val<=range[2]; val++) {
            # range[1]=07, [2]=09
            valsArr[++numVals] = sprintf("%02d",val)
        }
    }
}
{
    print "--------", $0
    expand($0,arr)
    for (i=1; i<=length(arr); i++) {
        print i, "cn"arr[i]
    }
}
$ awk -f tst.awk file
-------- cn[01-10]
1 cn01
2 cn02
3 cn03
4 cn04
5 cn05
6 cn06
7 cn07
8 cn08
9 cn09
10 cn10
-------- cn01
1 cn01
-------- cn[01,02,07-09]
1 cn01
2 cn02
3 cn07
4 cn08
5 cn09
-------- cn[01-02]
1 cn01
2 cn02
Now just remove the unwanted values from the array and do the reverse to recombine what is left into the input format.
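The reverse step is left to the reader in this answer. As a rough sketch of the idea only, written in Python rather than awk, consecutive values can be folded back into ranges in one pass over the sorted list. The name `collapse` and the `prefix` parameter are hypothetical and not part of the awk script above.

```python
def collapse(values, prefix="cn"):
    """Fold integers back into the cn[...] notation (None if nothing is left)."""
    values = sorted(set(values))
    if not values:
        return None  # every value was removed, so there is no line to emit
    runs = []
    start = prev = values[0]
    for v in values[1:]:
        if v == prev + 1:
            prev = v                  # still inside the current run
            continue
        runs.append((start, prev))    # close the run and start a new one
        start = prev = v
    runs.append((start, prev))
    parts = [
        "%02d" % a if a == b else "%02d-%02d" % (a, b)
        for a, b in runs
    ]
    return prefix + "[" + ",".join(parts) + "]"


expanded = [1, 2, 7, 8, 9]   # cn[01,02,07-09] after expansion
unwanted = {1, 5}            # cn01 and cn05
kept = [v for v in expanded if v not in unwanted]
print(collapse(kept))        # cn[02,07-09]
```

Returning `None` for an empty result mirrors the expected output in the question, where the `cn01` line disappears entirely.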
Answer 1 (score: 0)
An example in Python 3:
import re
from itertools import groupby

inp = """cn[01-10]
cn01
cn[01,02,07-09]
cn[01-02]"""
rem = {1, 5}  # values to remove: cn01 and cn05

def parse_lst(lst_str):
    # Expand "01,02,07-09" into the individual integers.
    for group in lst_str.split(','):
        if '-' in group:
            first, last = group.split('-')
            yield from range(int(first), int(last)+1)
        else:
            yield int(group)

def format_range(range_):
    # Consecutive values share the same (index - value) key,
    # so groupby splits the sorted list into runs.
    ranges = []
    for k, g in groupby(enumerate(range_), lambda x: x[0]-x[1]):
        group = [n for i, n in g]
        ranges.append((group[0], group[-1]))
    if not ranges:
        return  # nothing left on this line, so print nothing
    print("cn[" + ','.join(
        '{:02d}'.format(first) if first == last else
        '{:02d}-{:02d}'.format(first, last) for
        first, last in ranges
    ) + ']')

for line in inp.splitlines():
    lst_match = re.search(r'\[(.*)\]', line)
    if lst_match:
        range_ = parse_lst(lst_match.group(1))
    else:
        # Plain entry such as "cn01": strip the two-letter prefix.
        range_ = (int(line[2:]),)
    filtered = sorted(set(range_) - rem)
    format_range(filtered)
This prints:
cn[02-04,06-10]
cn[02,07-09]
cn[02]
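The question also asks about adding values, which the script above does not cover. One possible extension is sketched here, assuming additions are given as plain integers in the same way as the removals. The names `expand` and `update_line` and the `add`/`remove` parameters are hypothetical, and the two-character `cn` prefix is hard-coded as in the original.

```python
import re


def expand(line):
    """Return the set of integers named by 'cn[01,02,07-09]' or 'cn01'."""
    match = re.search(r"\[(.*)\]", line)
    body = match.group(1) if match else line[2:]  # assumes a 'cn' prefix
    values = set()
    for term in body.split(","):
        first, _, last = term.partition("-")
        # A single value such as "02" has no "-", so last is empty.
        values.update(range(int(first), int(last or first) + 1))
    return values


def update_line(line, add=(), remove=()):
    """Apply additions and removals, then rebuild the bracketed form."""
    values = sorted((expand(line) | set(add)) - set(remove))
    if not values:
        return None  # the whole line is dropped
    runs = [[values[0], values[0]]]
    for v in values[1:]:
        if v == runs[-1][1] + 1:
            runs[-1][1] = v       # extend the current run
        else:
            runs.append([v, v])   # start a new run
    parts = ["{:02d}".format(a) if a == b else "{:02d}-{:02d}".format(a, b)
             for a, b in runs]
    return "cn[" + ",".join(parts) + "]"


print(update_line("cn[01,02,07-09]", add={3, 10}, remove={1, 5}))
# cn[02-03,07-10]
```

Removals are applied after additions here, so a value listed in both sets ends up removed; swapping the order of the set operations would reverse that choice.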