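I am currently writing a bash script to find the hourly average memory usage of a server; the script writes its output to a .csv file. The script will run every 10 minutes, so after six runs in an hour my .csv file will contain 6 different values for that hour, and so on.
What I want to do is use a script to find the average for each hour.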
#date(YYYYMMDDHHmm) total used
201811270000 10 3
201811270010 10 4
201811270020 10 5
201811270030 10 9
201811270040 10 8
201811270050 10 2
201811270100 10 5
201811270110 10 1
201811270120 10 7
201811270130 10 6
201811270140 10 5
201811270150 10 2
201811270200 10 1
Given the output above, does anyone know a way I can find the average for each hour? For example:
The average of hour 201811270000: 5.166666666666667
The average of hour 201811270100: 4.333333333333333
How should I go about this?
Is it possible to do this?
Answer 0 (score: 2)
Using awk:
awk '
function calc() {
if (count) print "The average of hour " date ": " (sum/count);
count=0; sum=0; date=$1;
}
/^#/ {next} # throw away comment lines
$1~/00$/ {calc()} # full hour, time to calculate/reset variables
END {calc()} # end of file, ditto
{count+=1; sum+=$3;} # update variables at each line
' < file.txt
Doing this in pure bash is cumbersome, because you would first have to implement floating-point arithmetic yourself. :)
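To give an idea of what that workaround looks like, here is a minimal sketch of fixed-point division using only the shell's integer arithmetic. The helper name fixed_avg and its arguments are made up for illustration, and it assumes non-negative integer sums and a count greater than zero.

```shell
# fixed_avg SUM COUNT [DECIMALS]: print SUM/COUNT rounded to DECIMALS places,
# using integer arithmetic only. Hypothetical helper, not part of the answer above.
# Assumes SUM >= 0, COUNT > 0 and DECIMALS >= 1.
fixed_avg() {
    local sum=$1 count=$2 decimals=${3:-3}
    local scale=1 i=0 scaled

    # Build 10^decimals with a loop so only basic arithmetic is needed.
    while [ "$i" -lt "$decimals" ]; do
        scale=$((scale * 10))
        i=$((i + 1))
    done

    # Scale up before dividing; adding count/2 rounds to the nearest step.
    scaled=$(( (sum * scale + count / 2) / count ))

    # Integer part, then the fractional part zero-padded to DECIMALS digits.
    printf '%d.%0*d\n' "$((scaled / scale))" "$decimals" "$((scaled % scale))"
}

fixed_avg 31 6    # sum and count for hour 2018112700 in the question: prints 5.167
fixed_avg 26 6    # sum and count for hour 2018112701 in the question: prints 4.333
```

This only covers the division step; the awk version above is still far shorter once parsing and grouping are included.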
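Answer 1 (score: 0)
I would use "tr" to squeeze each line into single-space-delimited chunks and "cut" to pick out the parts we need to compute the average. If the format gets more complex, you can always enhance the getFieldAtPosition function.
I don't have a full bash available atm, so I iterate over an array instead of reading from file input. For a way to read a file line by line, you can look at the following answer:
Bash-only version: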
function average {
local sum=$1
local count=$2
local floatingPointUnits=2
# https://linux.die.net/man/1/dc
echo "${floatingPointUnits}k" "$sum" "$count" /p | dc
}
function getFieldAtPosition {
local line=$1
local position=$2
echo "$line" | tr -s ' ' | cut -d ' ' -f $position
}
function parseHourFromDate {
local date=$1
local positionOfHour=4+2+2
local lengthOfHour=2
echo ${date:positionOfHour:lengthOfHour}
}
lines=('201811270000 10 3 ' \
'201810270020 7 2 ' \
'201811270100 10 3 ' \
'201810270140 22 2 ' \
'201811271000 33 3 ' )
sum=0
count=0
declare -A HOURS
for line in "${lines[@]}"; do
date=`getFieldAtPosition "$line" 1`
number=`getFieldAtPosition "$line" 3` # field 3 is the "used" column the question wants averaged
hour=`parseHourFromDate "$date"`
# new hour, reset
if [ "$hour" != "$previousHour" ]; then
sum=0
count=0
fi
sum=$((sum+number))
count=$((count+1))
# save average in associative array
HOURS[$hour]=`average $sum $count`
previousHour=$hour
done
# print results
for key in "${!HOURS[@]}"; do
echo "Average of $key: ${HOURS[$key]}"
done
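For reading real input instead of a hard-coded array, something along these lines might work. It is only a sketch: hourly_avg and print_avg are hypothetical names, awk is swapped in for dc to do the division, and the input is assumed to be sorted by timestamp as in the question. It groups on the full YYYYMMDDHH prefix rather than the hour alone, so the same hour on different days is kept apart.

```shell
# print_avg HOUR SUM COUNT: hypothetical helper; uses awk for the division.
print_avg() {
    awk -v h="$1" -v s="$2" -v c="$3" 'BEGIN { printf "Average of %s: %.2f\n", h, s / c }'
}

# hourly_avg: read "YYYYMMDDHHmm total used" lines on stdin and print one
# average of the "used" column per hour. Assumes the input is sorted by time.
hourly_avg() {
    local date total used hour prev="" sum=0 count=0
    while read -r date total used; do
        case $date in ''|'#'*) continue ;; esac   # skip blank and comment lines
        hour=${date%??}                            # drop the minutes: YYYYMMDDHH
        if [ -n "$prev" ] && [ "$hour" != "$prev" ]; then
            print_avg "$prev" "$sum" "$count"      # hour changed: flush and reset
            sum=0
            count=0
        fi
        sum=$((sum + used))
        count=$((count + 1))
        prev=$hour
    done
    if [ -n "$prev" ]; then
        print_avg "$prev" "$sum" "$count"          # flush the last hour
    fi
}

hourly_avg <<'EOF'
#date(YYYYMMDDHHmm) total used
201811270000 10 3
201811270010 10 4
201811270100 10 5
EOF
```

Letting read split the fields avoids the tr and cut calls per line, at the cost of assuming the three-column layout from the question.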
Answer 2 (score: 0)
Using Perl:
> cat ivan.txt
201811270000 10 3
201811270010 10 4
201811270020 10 5
201811270030 10 9
201811270040 10 8
201811270050 10 2
201811270100 10 5
201811270110 10 1
201811270120 10 7
201811270130 10 6
201811270140 10 5
201811270150 10 2
201811270200 10 1
> perl -F'/\s+/' -lane ' { $F[0]=~s/..$//g;push @{$datekv{$F[0]}},$F[2];} END { for my $x (sort keys %datekv){ $total=0;$z=0; foreach(@{$datekv{$x}}) {$total+=$_;$z++ } print $x,"\t",$total/$z }}' ivan.txt
2018112700 5.16666666666667
2018112701 4.33333333333333
2018112702 1
>
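The same idea (strip the minutes from the timestamp, collect by the remaining key, then average) could probably be written in awk as well. This is a rough sketch rather than a translation of the one-liner above; hour_means is a made-up wrapper name, and it keeps running sums instead of storing every value.

```shell
# hour_means [FILE...]: hypothetical wrapper. Groups on the first 10 characters
# of field 1 (YYYYMMDDHH) and prints the mean of field 3 for each group.
hour_means() {
    awk '!/^#/ && NF >= 3 {
             key = substr($1, 1, 10)   # timestamp without the minutes
             sum[key] += $3
             n[key]++
         }
         END {
             # "for (key in sum)" visits keys in no particular order,
             # so the result is piped through sort below.
             for (key in sum) printf "%s\t%s\n", key, sum[key] / n[key]
         }' "$@" | sort
}

hour_means <<'EOF'
201811270000 10 3
201811270010 10 4
201811270100 10 5
EOF
```

Because the grouping is by key rather than by position, this does not rely on the lines being in time order.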
Answer 3 (score: 0)
Using bash and bc for the calculation:
PROCESS_FILE="file.txt"
PROCESSED_DATE=""
while read -r line; do
if [[ $line =~ ^# ]]; then
continue;
fi
LINE_DATE=${line:0:10}
if [[ $PROCESSED_DATE != *"$LINE_DATE"* ]]; then
PROCESSED_DATE+=",$LINE_DATE"
# anchor the match at the start of the line; tr -s turns each run of spaces into one comma
USED_LIST=$(grep "^$LINE_DATE" "$PROCESS_FILE" | tr -s ' ' ',' | cut -d ',' -f3 | tr '\n' ' ')
COUNT=0;
SUM=0;
for USED in $USED_LIST; do
COUNT=$(echo "$COUNT + 1" | bc -l);
SUM=$(echo "$SUM + $USED" | bc -l);
done
if [ $COUNT -ne 0 ]; then
AVG=$(echo "$SUM/$COUNT" | bc -l)
fi
echo "The average of hour $LINE_DATE: $AVG"
fi
done < $PROCESS_FILE
Answer 4 (score: -1)
In bash, here is a short way of doing it (a bit brute-force):
calc() {
awk "BEGIN { print "$*" }";
}
IFS=$'\r\n' GLOBIGNORE='*' command eval 'memory=($(<'$1'))'
for (( i = 0; i < ${#memory[@]}; i++ )); do
echo "${memory[i]}" | awk '{print $1" "$3}' >> values.txt
total=$(awk '{ (Values += $2) } END { printf "%0.0f", Values }' values.txt)
length=$(awk '{print $2}' values.txt | wc -l)
echo "The average of hour $(awk '{print $1}' values.txt | tail -n1): $(calc ${total}/${length})"
done
rm values.txt
The result when executed is as follows:
ivo@spain-nuc-03:~/Downloads/TestStackoverflow$ ./processing.sh test.csv
The average of hour 201811270000: 3
The average of hour 201811270010: 3.5
The average of hour 201811270020: 4
The average of hour 201811270030: 5.25
The average of hour 201811270040: 5.8
The average of hour 201811270050: 5.16667
The average of hour 201811270100: 5.14286
The average of hour 201811270110: 4.625
The average of hour 201811270120: 4.88889
The average of hour 201811270130: 5
The average of hour 201811270140: 5
The average of hour 201811270150: 4.75
The average of hour 201811270200: 4.46154
ivo@spain-nuc-03:~/Downloads/TestStackoverflow$
You can later change the output to redirect it to a file. There are more elegant ways of doing this for experienced bash users.
For Paul Hodges:
The awk call is there to pick out the specific column in question, since we don't know whether that column has the same length as the rest of the file (still applies).
tr -d is necessary because the value of the variable has to be an integer and not a string (only at the time of the command):
This is a string:
ivo@spain-nuc-03:~/Downloads/ScriptsClientes/BashReports/Tools/TextProcessing$ cat values.txt | wc -l
13
ivo@spain-nuc-03:~/Downloads/ScriptsClientes/BashReports/Tools/TextProcessing$
This is an integer:
ivo@spain-nuc-03:~/Downloads/ScriptsClientes/BashReports/Tools/TextProcessing$ cat values.txt | wc -l | tr -d '\n'
13ivo@spain-nuc-03:
Also, just running wc -l on the file (still applies) is not at all suitable for the task at hand, as it would force you to filter out the file name.
Please make sure before criticising.
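For comparison, a minimal sketch of one way to get a bare line count without having to filter out a file name: redirect the file into wc, so it is never given a name to print. count_lines is a hypothetical helper and the temporary file exists only for the demonstration.

```shell
# count_lines FILE: print the number of lines in FILE as a bare integer.
# Hypothetical helper for illustration.
count_lines() {
    local n
    n=$(wc -l < "$1")     # input comes from stdin, so wc prints no file name
    echo "$(( $n + 0 ))"  # arithmetic expansion drops padding some wc builds add
}

tmp=$(mktemp)
printf '%s\n' a b c > "$tmp"
count_lines "$tmp"
rm -f "$tmp"
```

Command substitution already strips the trailing newline, so the result can be used directly in arithmetic.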