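I would like to ask how to find the 5 most frequent strings, together with their occurrence counts, in the following situation.
I have a loop in a bash script, and inside that loop a variable is set to some string on every iteration.
I need to save the 5 most common strings in a variable (an array, perhaps?) and their occurrence counts (in a second array?) so that I can use them later in the script.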
This is the code I tried.
Assume the main for loop runs 10 iterations and the variable string takes the sample values listed after the code.
last=0 # index one past the last string stored in ARRAY
for i in ...
do
    string=... # this is changed each iteration
    placed=0 # set to 1 once the string has been found in ARRAY
    index=0
    while [ "$placed" -ne 1 ] # search ARRAY for the string
    do
        if [ "$last" -eq "$index" ] ; then # not in ARRAY yet: append it at the end
            ARRAY[index]="$string"
            OCCURENCE[index]=1
            (( index++ ))
            (( last++ ))
            break
        fi
        if [ "$string" == "${ARRAY[index]}" ] ; then
            # OCCURENCE is a parallel array of counts; increment the same index there
            (( OCCURENCE[index]++ ))
            placed=1
        fi
        (( index++ ))
    done
done
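The 10 strings in that example are
"hello 1"
"hello 2"
"hello 3"
"hello 1"
"hello 1"
"hello 2"
"hello 4"
"hello 5"
"hello 6"
"hello 2"
I would like one array containing the distinct strings
"hello 1"
"hello 2"
"hello 3"
"hello 4"
"hello 5"
"hello 6"
and a second array containing their occurrence counts, which for this input would be 3, 3, 1, 1, 1 and 1.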
Answer 0 (score: 1)
Simply:
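#!/usr/bin/env bash
# Requires bash 4 or later for associative arrays
declare -A array
# Count how often each line of input_file occurs
while IFS= read -r line
do
    (( array["$line"]++ ))
done < input_file
# Print every distinct line together with its count
for i in "${!array[@]}"
do
    echo "$i has count of ${array[$i]}"
done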
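The answer above prints every count but does not pick the five most frequent strings or store them for later use. A minimal sketch of that last step follows, assuming bash 4 or later and keys that contain no tab or newline. The array names top_strings and top_counts are hypothetical, and the sample input from the question is fed through a here-document in place of input_file.

```shell
#!/usr/bin/env bash
# Sketch: pick the five most frequent keys out of an associative array of counts.
# Assumptions: bash 4+, and no key contains a tab or a newline.

declare -A array          # string -> number of occurrences
declare -a top_strings    # hypothetical name: the most frequent strings
declare -a top_counts     # hypothetical name: their counts, at the same indices

# Count the sample strings from the question (stand-in for input_file).
while IFS= read -r line
do
    array["$line"]=$(( ${array["$line"]:-0} + 1 ))
done <<'EOF'
hello 1
hello 2
hello 3
hello 1
hello 1
hello 2
hello 4
hello 5
hello 6
hello 2
EOF

# Emit "count<TAB>string", sort by count (descending, ties by string),
# keep the first five lines and split each one back into the two arrays.
while IFS=$'\t' read -r n s
do
    top_counts+=("$n")
    top_strings+=("$s")
done < <(
    for key in "${!array[@]}"
    do
        printf '%s\t%s\n' "${array[$key]}" "$key"
    done | sort -t $'\t' -k1,1nr -k2,2 | head -n 5
)

# The two arrays can now be used anywhere later in the script.
for i in "${!top_strings[@]}"
do
    printf '%s: %s\n' "${top_strings[i]}" "${top_counts[i]}"
done
```

With the sample data, "hello 1" and "hello 2" come first with a count of 3 each, followed by three strings that occur once. The loop runs in the current shell because it reads from a process substitution rather than a pipe, so the arrays are still set after it finishes.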
Answer 1 (score: 0)
I think the problem you are trying to solve is covered in this question.
The solution is to use sort and uniq to get the desired output.
# Assumes $list holds the strings, one per line
declare -a lines
declare -a count
# uniq -c prefixes each distinct line with its count; sort by that count
# (numeric, descending) and keep the first 5 entries
while read -r n line
do
    count+=("$n")
    lines+=("$line")
done < <(printf '%s\n' "$list" | sort | uniq -c | sort -k1,1nr | head -n 5)
for ((i = 0; i < ${#lines[@]}; i++))
do
    echo "${lines[i]} ${count[i]}"
done
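To connect the sort and uniq approach back to the loop in the question, one option is to collect every string the loop produces and pipe the collection through the pipeline once the loop has finished. The sketch below assumes that no string contains a newline. The helper count_top and the array seen are hypothetical names, and the for loop over the question's ten sample strings stands in for the real loop.

```shell
#!/usr/bin/env bash
# count_top N (hypothetical helper): read lines on stdin and print the N most
# frequent ones as "count string", most frequent first.
count_top() {
    # uniq -c pads the count with leading spaces; sed strips them.
    sort | uniq -c | sed 's/^ *//' | sort -k1,1nr -k2 | head -n "$1"
}

seen=()   # every string produced by the loop, duplicates included

# Stand-in for the real loop: the ten sample strings from the question.
for string in "hello 1" "hello 2" "hello 3" "hello 1" "hello 1" \
              "hello 2" "hello 4" "hello 5" "hello 6" "hello 2"
do
    seen+=("$string")
done

# One string per line into the pipeline, five most frequent out.
printf '%s\n' "${seen[@]}" | count_top 5
```

Counting once after the loop avoids searching an array on every iteration, at the cost of keeping every string in memory until the loop ends.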