How do I find the x most common strings and their occurrence counts in bash?

Time: 2017-03-11 13:35:05

Tags: arrays bash

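I would like to ask how to find the 5 most frequently occurring strings, and their occurrence counts, in the following situation.

I have a loop in a bash script, and inside that loop there is a variable that is set to some string on every iteration.

I need to be able to save the 5 most common strings into some variable (perhaps an array?) together with their occurrence counts (a second array?), so that I can use them later in the script.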

Here is the code I tried:

last=0 # number of distinct strings stored in the array so far

for i in ...
do

string=... # this changes on each iteration

placed=0 # set to 1 once the string has been found in ARRAY
index=0

    while [ "$placed" -ne 1 ] # scan ARRAY until the string is found or appended
    do
        if [ "$last" -eq "$index" ] ; then # reached the end: the string is not in the array yet, so append it
            ARRAY[index]="$string"
            OCCURENCE[index]=1
            (( last++ ))
            break
        fi

        if [ "$string" == "${ARRAY[index]}" ] ; then
                # the occurrence array shares its indices with ARRAY, so increment the same index there
                (( OCCURENCE[index]++ ))
                placed=1
        fi

        (( index++ ))
    done

done

Suppose the main for loop runs for 10 iterations and the string takes these values:

"hello 1"
"hello 2"
"hello 3"
"hello 1"
"hello 1"
"hello 2"
"hello 4"
"hello 5"
"hello 6"
"hello 2"

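I would like one array containing the distinct strings, as listed below, and a second array holding their occurrence counts: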

"hello 1"
"hello 2"
"hello 3"
"hello 4"
"hello 5"
"hello 6"

2 answers:

Answer 0: (score: 1)

Simply:

#!/usr/bin/env bash

declare -A array

while read -r line
do
    (( array["$line"]++ ))
done<input_file

for i in "${!array[@]}"
do
    echo "$i has count of ${array[$i]}"
done
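
To get from these counts to the five most frequent strings the question asks for, the counts can be sorted and read back into two parallel arrays. The following is a minimal sketch, not part of the original answer. It assumes bash 4 or later (for associative arrays) and strings that contain no tab or newline characters. The function name `top_strings` and the array names `top_strings_out` and `top_counts_out` are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads strings (one per line) from stdin and fills two
# parallel global arrays with the N most frequent strings and their counts.
top_strings() {
    local n=${1:-5} line key count
    local -A counts=()
    top_strings_out=()
    top_counts_out=()

    # Count every non-empty line (an empty string cannot be used as a key).
    while IFS= read -r line; do
        [[ -n $line ]] || continue
        counts["$line"]=$(( ${counts["$line"]:-0} + 1 ))
    done

    # Emit "count<TAB>string", sort by count (descending) and then by string,
    # keep the first N rows and split them back into the two arrays.
    while IFS=$'\t' read -r count key; do
        top_counts_out+=("$count")
        top_strings_out+=("$key")
    done < <(
        for key in "${!counts[@]}"; do
            printf '%s\t%s\n' "${counts[$key]}" "$key"
        done | sort -t $'\t' -k1,1nr -k2,2 | head -n "$n"
    )
}

# Usage with the sample strings from the question.
top_strings 5 <<'EOF'
hello 1
hello 2
hello 3
hello 1
hello 1
hello 2
hello 4
hello 5
hello 6
hello 2
EOF

for i in "${!top_strings_out[@]}"; do
    echo "${top_strings_out[i]} has count of ${top_counts_out[i]}"
done
```

Ties are broken alphabetically here so that the result is deterministic; that tie-breaking rule is a choice made for this sketch, not something the question specifies.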

Answer 1: (score: 0)

I think the problem you are trying to solve is covered in this question.

The solution is to use sort and uniq to get the desired output.

declare -a lines
declare -a count

# assumes $list holds the strings, one per line
while IFS= read -r line
do
    lines+=("$line")
done < <(printf '%s\n' "$list" | sort | uniq) # the distinct lines, sorted

while read -r n _
do
    count+=("$n")
done < <(printf '%s\n' "$list" | sort | uniq -c) # the matching counts, in the same order

for ((i=0; i<${#lines[@]}; i++))
do
    printf '%s\t%s\n' "${lines[i]}" "${count[i]}"
done | sort -t $'\t' -k2,2nr | head -n 5 # sort by the count column, descending, and keep the first 5 rows
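
The same sort and uniq idea can also be written as a single pipeline, so that the strings and their counts come from one `uniq -c` run and cannot get out of step. This is a sketch under a few assumptions: the strings arrive one per line, they have no leading or trailing whitespace, and the helper name `count_top` as well as the array names `strings` and `counts` are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the N most frequent lines of stdin as
# "count string", highest count first, ties in alphabetical order.
count_top() {
    local n=${1:-5}
    sort | uniq -c | sort -k1,1nr -k2 | head -n "$n" | sed 's/^ *//'
}

# Sample input taken from the question, one string per line.
list=$'hello 1\nhello 2\nhello 3\nhello 1\nhello 1\nhello 2\nhello 4\nhello 5\nhello 6\nhello 2'

strings=()
counts=()
# The first word of each row is the count; the rest of the row is the string.
while read -r count string; do
    counts+=("$count")
    strings+=("$string")
done < <(printf '%s\n' "$list" | count_top 5)

for i in "${!strings[@]}"; do
    echo "${strings[i]} ${counts[i]}"
done
```

Because the loop reads from a process substitution rather than a pipe, the two arrays remain available to the rest of the script after the loop ends.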