Question

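I need to search an array many times within a bash process, and I need to know the fastest and most efficient way to do it. I already know how to do it; the point is how to do it as fast as possible. Right now I do this: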

#!/bin/bash

array_test=("text1" "text2" "text3" "text4" "text5")
text_to_search="text4"

START=$(date +%s.%N)

for item in "${array_test[@]}"; do
    if [ "${item}" = "${text_to_search}" ]; then
        echo "found!!"
        break
    fi
done

END=$(date +%s.%N)
DIFF=$(echo "$END - $START" | bc)
echo $DIFF

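With this code we can measure the time.

Imagine we have 300 or more items in the array. Is there a faster way? I need to improve performance. Thanks.

EDIT: I am using bash 4.2. The real array has line breaks: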

array_test=(
"text1"
"text2"
"text3"
"text4"
"text5"
)

Answer 1

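Fastest version (for large arrays)

Use grep -qF on array[*] (the -x flag in the command below additionally restricts matches to whole lines):

if ( IFS=$'\n'; echo "${array[*]}" ) | grep -qFx "$text_to_search"; then
    echo "found!"
fi

Assumption: neither the array nor the search text contains line breaks.

Search texts with line breaks are also possible, as long as there is a known unused character that can be stored in a bash variable. I found that \x1E (the ASCII control character »Record Separator«) works quite well. The adapted version looks as follows:

export d=$'\x1E' # unused character, here "Record Separator"
if ( IFS="$d"; echo "$d${array[*]}$d" ) | grep -qF "$d$text_to_search$d"; then
    echo "found!"
fi

Testing

I used an array of size 1'000'000, generated as follows:

size=$((10 ** 6))
array_test=($(seq -f 'text%.0f' 1 "$size"))

(By the way: using seq was magnitudes faster than {1..1000000}.)

The search pattern was the last entry of said array:

text_to_search="text$size"

Three search methods were tested:

1. Your method
2. grep -qF on array[@]
3. grep -qF on array[*] (the fastest version)

The following results were obtained, in the same order:

1. 65.5 seconds
2. 59.3 seconds
3. 00.4 seconds (yes, that is a zero before the decimal point)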
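
For repeated use, the grep approach can be wrapped in a function. The following is only a sketch under stated assumptions: the name array_contains is hypothetical, printf '%s\n' is swapped in for the IFS/echo join (it also prints one element per line), and neither the elements nor the search text may contain line breaks.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the needle ($1) equals one of the
# remaining arguments. Assumes no argument contains a line break.
array_contains() {
    local needle=$1
    shift
    # One element per line; -F fixed string, -x whole line, -q quiet.
    # "--" keeps a needle starting with "-" from being read as an option.
    printf '%s\n' "$@" | grep -qFx -- "$needle"
}

array_test=("text1" "text2" "text3" "text4" "text5")
if array_contains "text4" "${array_test[@]}"; then
    echo "found!"
fi
```

Each call still starts a grep process, so for small arrays the process startup may outweigh the gain; measuring on real data remains the only reliable guide.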

Answer 2

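If you care deeply about performance, then perhaps (a) Bash is not the best tool and (b) you should try different approaches and test them on your data. That said, associative arrays may help you.

Try this: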

#!/bin/bash

declare -A array_test=()
array_test=(["text1"]="" ["text2"]="" ["text3"]="" ["text4"]="" ["text5"]="")
text_to_search="text4"

if
  [[ ${array_test[$text_to_search]+found} ]]
then
  echo "Found!"
fi

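Please note that in this case I build the associative array with keys but empty values (there is no real use in setting each value to the same thing as its key and eating up more memory).

Another approach would be to sort the array and use some kind of binary search. That would, of course, involve more code, and if Bash associative arrays are implemented efficiently, it is likely to be slower. But, again, nothing beats testing on actual data to validate performance hypotheses.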
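
The binary search idea could look roughly like the sketch below. The names sorted and binary_search are hypothetical, elements are assumed not to contain line breaks, and LC_ALL=C is set so that sort and the [[ < ]] comparison agree on a byte-wise ordering.

```shell
#!/usr/bin/env bash
# Use the same byte-wise ordering for sort and for [[ < ]].
export LC_ALL=C

array_test=("text3" "text1" "text5" "text2" "text4")

# Sort once; mapfile (bash 4+) reads one element per line.
mapfile -t sorted < <(printf '%s\n' "${array_test[@]}" | sort)

# Hypothetical helper: binary search for $1 in the global array "sorted".
binary_search() {
    local needle=$1 lo=0 hi=$(( ${#sorted[@]} - 1 )) mid
    while (( lo <= hi )); do
        mid=$(( (lo + hi) / 2 ))
        if [[ ${sorted[mid]} == "$needle" ]]; then
            return 0                      # exact match
        elif [[ ${sorted[mid]} < "$needle" ]]; then
            lo=$(( mid + 1 ))             # needle sorts after the midpoint
        else
            hi=$(( mid - 1 ))             # needle sorts before the midpoint
        fi
    done
    return 1
}

binary_search "text4" && echo "Found!"
```

The sort has to be paid for once up front, so this only makes sense when the same array is searched many times.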

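If your associative array holds its information in the keys, you can use an expansion to work with them the same way you would with values: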

for key in "${!array[@]}"
do
  do_something_with_the_key "$key"
done

Also, you can build the array with a loop, which is useful if you read the elements from a file or from command output. As an example:

declare -A array_test=()
while IFS= read -r key
do
  array_test[$key]=
done < <(command_with_output)

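It is worth noting that an element set to a null (empty) value is not the same as an unset element. You can easily see that with the following expansion:

declare -A array_test=()
array_test[existing_key]=
echo ${array_test[existing_key]+found} # Echoes "found"
echo ${array_test[missing_key]+found}  # Echoes nothing

The "${var+value}" expansion is useful because it expands to nothing if the variable is unset, and to value if it is set. It also does not produce an error under set -u, which otherwise catches attempts to expand unset variables.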

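As a small illustration of the set -u point, the check can be wrapped in a helper. This is a sketch only; the function name key_state is hypothetical.

```shell
#!/usr/bin/env bash
set -u   # expanding an unset variable is now an error

declare -A array_test=()
array_test[existing_key]=

# Hypothetical helper: reports whether key $1 is set, even if its value is empty.
key_state() {
    if [[ ${array_test[$1]+found} ]]; then
        echo "set"
    else
        echo "unset"
    fi
}

key_state existing_key   # prints "set"
key_state missing_key    # prints "unset"
```

Because the + expansion never reads the value itself, the lookup of missing_key does not abort the script even though set -u is active.
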
Answer 3

If none of your array elements contain whitespace, you can use pattern matching:

array_test=("text1" "text2" "text3" "text4" "text5")
text_to_search="text4"
[[ " ${array_test[*]} " == *" $text_to_search "* ]] && echo found || echo not found

The quotes and spaces are very important.

In bash, within [[ ]] double brackets, the == operator is a pattern matching operator, not just string equality.
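
Why the quotes and spaces matter can be seen in a short experiment. This is an illustrative sketch; the variable names joined, unpadded, padded, quoted and unquoted are made up for the demonstration.

```shell
#!/usr/bin/env bash
array_test=("text1" "text2" "text3")
joined="${array_test[*]}"          # "text1 text2 text3"

# Without padding spaces, a mere substring counts as a hit (false positive).
[[ $joined == *"text"* ]] && unpadded=match || unpadded=nomatch

# With padding spaces on both sides, only whole elements match.
[[ " $joined " == *" text "* ]] && padded=match || padded=nomatch

# Quoting the search text keeps glob characters literal ...
needle='text?'
[[ " $joined " == *" $needle "* ]] && quoted=match || quoted=nomatch

# ... while an unquoted expansion is treated as a pattern ("?" = any one character).
[[ " $joined " == *" "$needle" "* ]] && unquoted=match || unquoted=nomatch

echo "$unpadded $padded $quoted $unquoted"   # prints "match nomatch nomatch match"
```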

Answer 4

A concrete example based on the helpful answer by @Fred, using an associative array:

script.sh

#!/bin/bash

mapfile -t array_test < <(seq 1 10000)   # one element per line of seq output
text_to_search="9999"

function from_associative_array {
  declare -A array
  for constant in "${array_test[@]}"
  do
      array[$constant]=1
  done
  [[ ${array[$text_to_search]} ]] && echo  "Found in associative array";
}

function from_indexed_array {
  for item in "${array_test[@]}"; do
      if [ "${item}" = "${text_to_search}" ]; then
          echo "Found in indexed array"
          break
      fi
  done
}

time from_indexed_array
time from_associative_array

$ bash script.sh
Found in indexed array

real    0m0.611s
user    0m0.604s
sys     0m0.004s

Found in associative array

real    0m0.297s
user    0m0.296s
sys     0m0.000s
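
Note that the timing of from_associative_array above includes building the associative array. Since the question is about searching many times, the table can be built once and reused, as in this sketch (the names lookup and hits are hypothetical):

```shell
#!/usr/bin/env bash
mapfile -t array_test < <(seq 1 10000)

# Build the lookup table a single time.
declare -A lookup=()
for item in "${array_test[@]}"; do
    lookup[$item]=1
done

# Each search is now one key lookup instead of a loop over the whole array.
hits=0
for needle in 1 5000 9999 10001 abc; do
    if [[ ${lookup[$needle]+found} ]]; then
        hits=$((hits + 1))
    fi
done
echo "$hits of 5 needles found"   # prints "3 of 5 needles found"
```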
