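My code checks the files and displays the frequency of every word in them, but I'd like to know how to display only the words whose length is greater than a variable k. Here is my code: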
#!/bin/bash
if [ $# -eq 0 ]; then
    echo "you need an argument"
    exit 2
fi
echo "Insert k"
read k
for file in $@; do
    if ! [ -f $file ]; then
        echo "Not a file"
        exit 2
    fi
    sed -e 's/\s/\n/g' < $file | sort | uniq -c | sort -nr
done
File contents:
ceva
ceva
aiurea
sebi
este
cel
mai
smecher
Output:
2 ceva
1 smecher
1 sebi
1 mai
1 este
1 cel
1 aiurea
Answer 0 (score: 3)
Use awk to count the frequency of the words whose length is greater than a variable:
awk -v k=3 'length() > k { freq[$0]++} END{for (i in freq) print freq[i], i}' file |
sort -rn
2 ceva
1 smecher
1 sebi
1 este
1 aiurea
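The one-liner above measures the whole line (`length()` with no argument is the length of `$0`), so it assumes one word per line, as in the sample file. If a file may hold several words per line, a possible sketch loops over the fields instead. The function name `word_freq_min_len` is made up for illustration and is not part of the original answer.

```shell
#!/usr/bin/env bash
# word_freq_min_len K FILE...
# Hypothetical helper: counts every whitespace-separated word longer than
# K characters, even when a line holds several words.
word_freq_min_len() {
    local k="$1"
    shift
    awk -v k="$k" '
        {
            # Visit each field on the line rather than the line as a whole.
            for (n = 1; n <= NF; n++)
                if (length($n) > k)
                    freq[$n]++
        }
        END {
            for (w in freq)
                print freq[w], w
        }
    ' "$@" | sort -rn
}
```

It would be called as `word_freq_min_len 3 file`, with the threshold first and the file names after it.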
Full script:
#!/usr/bin/env bash
if [[ $# -eq 0 ]]; then
    echo "you need an argument"
    exit 2
fi
read -rp "Insert k: " k
for file in "$@"; do
    if [[ ! -f $file ]]; then
        echo "$file is not a file"
        exit 2
    fi
    echo "$file:"
    # Quote "$k" so an empty or space-containing reply cannot split the awk arguments.
    awk -v k="$k" 'length() > k { freq[$0]++ } END { for (i in freq) print freq[i], i }' "$file" | sort -rn
done
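Another option, sketched here under a few assumptions, keeps the pipeline from the question and adds a single filtering stage before the counting. The function name `long_word_freq` is hypothetical, and `tr -s '[:space:]' '\n'` is swapped in for the question's `sed` call so that runs of whitespace do not produce empty lines.

```shell
#!/usr/bin/env bash
# long_word_freq K FILE
# Hypothetical helper: the question's pipeline plus one filtering stage.
long_word_freq() {
    local k="$1" file="$2"
    # Put one word on each line, squeezing repeated whitespace.
    tr -s '[:space:]' '\n' < "$file" |
        # Keep only the lines (words) longer than k characters.
        awk -v k="$k" 'length($0) > k' |
        sort | uniq -c | sort -nr
}
```

Because the filter runs before `uniq -c`, short words are dropped before they are ever counted, and the rest of the pipeline stays as it was.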
Answer 1 (score: 1)
You can also do it this way.
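#!/bin/bash
file=$1   # assumed: file to scan (the original snippet never set $file)
k=$2      # assumed: length threshold, passed as the second argument
arr=()
# Collect "word count" pairs, one per distinct word.
while read -r line; do
    arr+=("$line")
done < <(tr -s ' ' '\n' < "$file" | sort | uniq -c | awk '{print $2" "$1}')
for a in "${arr[@]}"; do
    word=${a% *}   # drop the trailing count to get the word itself
    # Compare the word's length with k; the original compared the count with 2.
    if (( ${#word} > k )); then
        echo "$a"
    fi
done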
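A possible variant of the approach above lets `read` split each `uniq -c` line into the count and the word directly, so neither an array nor a second pass is needed. This is only a sketch: the function name `print_long_words` and its argument order are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# print_long_words K FILE
# Hypothetical helper: prints "word count" for each word longer than K.
print_long_words() {
    local k="$1" file="$2"
    tr -s '[:space:]' '\n' < "$file" | sort | uniq -c |
        while read -r count word; do
            # read has already split "count word", so ${#word} is the word length.
            if [ "${#word}" -gt "$k" ]; then
                echo "$word $count"
            fi
        done
}
```

For example, `print_long_words 3 file` would list each word longer than three characters followed by its count.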