Question

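The main goal is to find a periodic sequence in an array using bash, for example:

{2,5,7,8,2,6,5,3,5,4,2,5,7,8,2,6,5,3,5,4,2,5,7,8,2,6,5,3,5,4} or {2,5,6,3,4,2,5,6,3,4,2,5,6,3,4,2,5,6,3,4}

For these two examples, the sequences that must be returned are {2,5,7,8,2,6,5,3,5,4} and {2,5,6,3,4}.

I tried using a list and a sublist and comparing the two arrays, without success. I must be missing something in the loop. I think the "tortoise and hare" algorithm is an alternative, but I lack some knowledge of bash commands to implement it.

I'd rather post my second attempt, the one with the tortoise and the hare, because the first one seems to be a useless attempt: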

#!/bin/bash
declare -A array=( 1, 2, 3, 1, 2, 3, 1, 2, 3 )
declare -A found=()
loop="notfound"
tortoise=`echo ${array[0]}`
hare=`echo ${array[0]}`
found[0]=`echo ${array[0]}`
while ( $loop == "notfound" )
do
    for ((i=1;i=`echo ${#array[@]}`;i++))
    do
        if (( `echo ${array[$#]}` == $hare ))
        then
            echo "no loop found"
            exit 0
        fi
        hare=`echo ${array[$i]}`
        if (( `echo ${array[$#]}` == $hare ))
        then
            echo "no loop found"
            exit 0
        fi
        hare=`echo ${array[$(($i+1))]}`
        tortoise=`echo  ${array[$i]}`
        found[$i]=`echo  ${array[$i]}`
        if (( $hare == $tortoise ))
        then
            loop="found"
            printf "$found[@]}"
        fi
    done
done

I am getting an error saying that the associative array requires an index.

Answer 1

Given an array of decimal digits a

a=(2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4)

then use a regex backreference, for example in perl:

printf '%d' "${a[@]}" | perl -lne 'print $1 if /^(\d+)\1+/'
2578265354

Testing with an incomplete sequence:

a=(1 2 3 1 2 3 1 2)
printf '%d' "${a[@]}" | perl -lne 'print $1 if /^(\d+)\1+/'
123

If you only want complete repetitions, add the $ end-of-line anchor to the regex: /^(\d+)\1+$/
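Two caveats about the one-liners above: joining the elements into a single string only works while every element is one digit, and the greedy (\d+) prefers the longest repeating unit, so for the question's second example it reports 2563425634 rather than 25634. As a minimal sketch of the anchored (complete repetitions only) case in plain bash, the function below tries candidate lengths from shortest to longest and prints the first prefix that tiles the whole array. The name find_period and its return convention are assumptions made for illustration; they are not part of the answer.

```bash
#!/bin/bash
# find_period (hypothetical helper): print the shortest prefix whose
# repetition makes up the whole argument list; return 1 if there is none.
find_period() {
  local -a a=("$@")
  local n=${#a[@]} p i ok
  for ((p = 1; p <= n / 2; p++)); do
    (( n % p == 0 )) || continue        # p must divide the length exactly
    ok=1
    for ((i = p; i < n; i++)); do
      # compare as strings so multi-digit elements work too
      if [[ ${a[i]} != "${a[i-p]}" ]]; then ok=0; break; fi
    done
    if (( ok )); then
      printf '%s\n' "${a[*]:0:p}"       # elements joined by single spaces
      return 0
    fi
  done
  return 1
}

find_period 2 5 6 3 4 2 5 6 3 4 2 5 6 3 4 2 5 6 3 4   # prints: 2 5 6 3 4
```

Trying the shortest length first is what keeps the result at 2 5 6 3 4 instead of a multiple of it.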

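Now, if you want to determine the longest subsequence that "most nearly" repeats, it gets a bit trickier. For example, your 250-digit sequence contains a 117-digit subsequence that repeats 2 times (leaving 16 characters over), whereas your expected output is the 13-digit subsequence (repeated 19 times, with 3 digits left over). So you want an algorithm that is "greedy, but not too greedy".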

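One (hopefully not too inefficient) way to do this is to successively remove trailing digits until you get an anchored match, i.e. until the whole remaining sequence s* can be expressed as n x t for some subsequence t. In perl, we can write that as a simple loop:

perl -lne 'while (! s/^(\d+)\1+$/$1/) {chop $_}; print'

Testing with the 250-digit sequence: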

a=( 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 )

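printf '%d' "${a[@]}" | perl -lne 'while (! s/^(\d+)\1+$/$1/) {chop $_}; print'
1102120020222

Note: this fails to terminate if the string is exhausted before a match is found; if that is a possibility, you need to test for it and break out of the while loop.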
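The same "chop until the rest is a whole number of repetitions" idea can be sketched in bash alone, with the termination check built in. This is only an illustration under stated assumptions: the function name nearest_period is hypothetical, elements are compared as strings, and unlike the greedy regex it picks the shortest repeating unit for each trimmed length.

```bash
#!/bin/bash
# nearest_period (hypothetical helper): drop trailing elements until what is
# left is a whole number (>= 2) of repetitions of some prefix, then print the
# shortest such prefix. Returns 1 instead of looping forever when no
# repetition exists.
nearest_period() {
  local -a a=("$@")
  local len p i ok
  for ((len = ${#a[@]}; len >= 2; len--)); do   # "chop" one element per pass
    for ((p = 1; p <= len / 2; p++)); do        # shortest candidate first
      (( len % p == 0 )) || continue
      ok=1
      for ((i = p; i < len; i++)); do
        if [[ ${a[i]} != "${a[i-p]}" ]]; then ok=0; break; fi
      done
      if (( ok )); then
        printf '%s\n' "${a[*]:0:p}"
        return 0
      fi
    done
  done
  return 1
}

nearest_period 1 2 3 1 2 3 1 2   # prints: 1 2 3
```

Called as nearest_period "${a[@]}", it takes the array elements as separate arguments, so no digit-joining step is needed. The three nested loops are cubic in the worst case, which should be acceptable for arrays of a few hundred elements.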

Answer 2

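I have tested this only with the inputs you provided. Assumption: the matching pattern always starts at the beginning of the array and repeats from there.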

#!/bin/bash

#arr=(2 5 7 8 2 6 5 3 5 4  2  5  7  8  2  6  5  3  5  4  2  5  7  8  2  6  5  3  5  4)
arr=(2  5  6  3  4  2  5  6  3  4  2  5  6  3  4  2  5  6  3  4)

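echo "${arr[@]}"
n=${#arr[@]}        # number of elements
match=0             # number of complete repetitions found so far
in_pattern=false    # true while a candidate repetition is being compared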

# Print arr[first..last] on one line
print_array()
{
  local first=$1
  local last=$2
  local i

  for ((i=first; i<=last; i++));do
    printf "%d " "${arr[i]}"
  done
  printf "\n"
}

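i=0            # index into the candidate pattern (the array prefix)
start=0        # the pattern is assumed to start at index 0
end=0          # last index of the candidate pattern
j=$((i+1))     # index into the suspected repetition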

while (( j < n )); do
  #echo "arr[$i] ${arr[i]}  arr[$j] ${arr[j]}"
  if [[ ${arr[i]} -ne ${arr[j]} ]];then
    if [[ $match -ge 1 ]];then 
      echo "arr[$i] != arr[$j]"
      echo "pattern doesn't repeat after match # $match"
      exit 1
    fi
    ((j++))
    i=0
    in_pattern=false
    continue
  fi
  if $in_pattern ; then 
    if [[ $i -eq $end ]];then
      ((match++))
      end_match=$j
      echo "match # $match matched from $start -> $end and $start_match -> $end_match"  
      print_array $start $end
      print_array $start_match $end_match
      ((j++))
      i=0
      in_pattern=false
      continue
    fi
  else
    if [[ $match -eq 0 ]];then
      end=$((j-1))
    fi
    start_match=$j 
    in_pattern=true 
    #echo "trying to match from start $start end $end to start_match $start_match" 
  fi
  ((i++))
  ((j++))
done


output with first array -

./sequence.sh 
2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4 2 5 7 8 2 6 5 3 5 4
match # 1 matched from 0 -> 9 and 10 -> 19
2 5 7 8 2 6 5 3 5 4 
2 5 7 8 2 6 5 3 5 4 
match # 2 matched from 0 -> 9 and 20 -> 29
2 5 7 8 2 6 5 3 5 4 
2 5 7 8 2 6 5 3 5 4 

2nd array -

./sequence.sh 
2 5 6 3 4 2 5 6 3 4 2 5 6 3 4 2 5 6 3 4
match # 1 matched from 0 -> 4 and 5 -> 9
2 5 6 3 4 
2 5 6 3 4 
match # 2 matched from 0 -> 4 and 10 -> 14
2 5 6 3 4 
2 5 6 3 4 
match # 3 matched from 0 -> 4 and 15 -> 19
2 5 6 3 4 
2 5 6 3 4

How to identify a periodic sequence in an array of integers
