Question

I want to find partially matching IPv6 prefixes across two arrays. For example, 2001:db8: in one array should match both 2001:db8:1::/48 and 2001:db8:2::/48 in the other.

I have already implemented this by looping over one array inside a loop over the other:

ru_routes=( $(curl -4 ftp://ftp.ripe.net/ripe/stats/delegated-ripencc-latest | egrep -o '\|RU\|ipv6\|.+?::\|[0-9]+' | cut -d'|' -f4 | sed 's/::$/:/g') );
msk_ix_routes=( $(curl -4 http://www.msk-ix.ru/download/lg/msk_ipv6_pfx.txt.gz | gunzip | egrep -o '\b.*::/[0-9]*') );
routes=();
for item1 in ${msk_ix_routes[@]}; do
    for item2 in ${ru_routes[@]}; do
        if [[ $item1 = $item2* ]]; then
            routes+=( $item1 );
            break
        fi
    done
done

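But it runs rather slowly (~90 seconds) on my MIPS router. I found this useful answer, which runs much faster, but I can't get it to behave the same way as the loop above. I don't think I need the "if" construct, for example, since it would do the same thing twice. My non-working version: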

msk=" ${msk_ix_routes[*]} ";         # add framing blanks

for item in ${ru_routes[@]}; do
  routes+=( egrep -o "$item[\S]*/g" <<< $msk );
done

I guess there is a quoting or escaping problem here, but I can't work it out. Please help :) I'm open to suggestions.

Yes, I used "comm" in the first version and it ran faster, but it only finds exact matches, which is why I started playing with loops:

routes=( $(comm -12 <(printf '%s\n' "${ru_routes[@]}" | LC_ALL=C sort) <(printf '%s\n' "${msk_ix_routes[@]}" | LC_ALL=C sort)) );

Answer 1

Bash scripts are not efficient at all for this kind of work. Try this:

#!/bin/bash

# e. g.: ripencc|RU|ipv6|2001:640::|32|19991115|allocated -> ^2001:640:
awk -v FS='|' \
    '$2 == "RU" && $3 == "ipv6" { sub(/::/, ":", $4); print "^" $4 }' \
    <(curl -4 ftp://ftp.ripe.net/ripe/stats/delegated-ripencc-latest) \
|\
# grep e. g. '^2001:640:' in '2001:640:8000::/33'
grep --basic-regexp --file - \
    <(curl -4 http://www.msk-ix.ru/download/lg/msk_ipv6_pfx.txt.gz | gunzip)
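
If the results still need to end up in a Bash array, as in the question, the same idea can be sketched without the downloads. This is only an illustration under assumptions: the sample values in ru_routes and msk_ix_routes are made up, and mapfile requires Bash 4 or newer, which may not be what is installed on the router. Each prefix becomes an anchored pattern on its own line, and a single grep process tests all of them at once instead of running a nested shell loop:

```shell
#!/bin/bash
# Hypothetical sample data standing in for the two downloaded lists.
ru_routes=( "2001:db8:" "2a00:1450:" )
msk_ix_routes=( "2001:db8:1::/48" "2001:db8:2::/48" "2001:db80::/32" "2a02:6b8::/32" )

# Turn every prefix into an anchored pattern (one per line) and let a single
# grep process check all patterns against every route.
mapfile -t routes < <(
    printf '%s\n' "${msk_ix_routes[@]}" |
        grep -f <(printf '^%s\n' "${ru_routes[@]}")
)

# Print the matching routes, one per line.
printf '%s\n' "${routes[@]}"
```

With this sample data only the two 2001:db8: routes are kept. The trailing colon in each prefix matters: without it, 2001:db8 would also match 2001:db80::/32.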

Replace array iteration with regex
