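Shell: How to create a text-mode bar chart from parsed data (numbers)?

Asked: 2015-06-19 02:33:34

Tags: bash charts

I am developing a Bash shell script for Linux that extracts data from a text file, leaving only the numbers. This is my sample parsed data: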

3
4
4
5
6
7
8
8
9
11

I would like to create a simple text-mode bar chart like this one, but corresponding to these values:

Bar chart

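Details:

  • I need the chart bars to be vertical
  • The first number should appear on the left, the last one on the right
  • A column n (the parsed number) characters high is fine for me. So in my example the first bar on the left should be 3 characters high, the second 4, the third 4, the fourth 5, and so on.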

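More precisely, for this example, something (using the █ character) like:

         █
         █
        ██
      ████
     █████
    ██████
   ███████
 █████████
██████████
██████████
██████████

Note that the first (left) column is 3 characters high and the last (right) column is 11 characters high. The same example using $ characters, to make it more readable:

         $
         $
        $$
      $$$$
     $$$$$
    $$$$$$
   $$$$$$$
 $$$$$$$$$
$$$$$$$$$$
$$$$$$$$$$
$$$$$$$$$$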

The closest thing I have is the progress-bar approach that I have used until now in other scripts:

printf "\033[48;5;21m"   # Blue background color
for i in $(seq 1 $n); do printf " "; done   # Create bar using blue spaces

That is: each bar is filled by printing a run of n spaces on one line. But this bar is horizontal, so it is not suitable in this case.
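A minimal sketch of that horizontal approach applied to several values, one bar per line, might look like the following. The helper name `hbar` and the `#` plotting character are illustrative assumptions, chosen so the output is visible without color support.

```shell
#!/bin/sh
# Hypothetical helper: print one horizontal bar made of n '#' characters.
hbar() {
    n=$1
    bar=''
    while [ "$n" -gt 0 ]; do
        bar="${bar}#"
        n=$((n - 1))
    done
    printf '%s\n' "$bar"
}

# One bar per value: the chart grows to the right instead of upwards.
for v in 3 4 4 5; do
    hbar "$v"
done
```

Each value only needs one pass here, which is why the horizontal form is easy; the vertical form needs every value to be examined for every output row.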

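I am asking for some core-loop example ideas to create this bar chart.

Following user Boardrider's suggestion, solutions based on any Unix-like tools are accepted. Solutions for the Linux shell based on scripting languages (such as Perl or Python) are accepted too, as long as those languages are commonly available on many devices.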

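5 answers:

Answer 0 (score: 4)

This is a first naive attempt... It is not a very efficient solution, since the data is parsed many times, but it may help. In a way, it is the first loop idea suggested by @Walter_A.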

#!/bin/sh
#
## building a vertical bar graph of data file
## https://stackoverflow.com/q/30929012
##
## 1. required. Data file with one value per line and nothing else!
## /!\ provide the (relative or absolute) file path, not file content
: ${1:?" Please provide a file name"}
test -e "$1" || { echo "Sorry, can't find $1" 1>&2 ; exit 2 ; }
test -r "$1" || { echo "Sorry, can't access $1" 1>&2 ; exit 2 ; }
test -f "$1" || { echo "Sorry, bad format file $1" 1>&2 ; exit 2 ; }
## reject the file if any line is not a plain non-negative integer
test "$( grep -cv '^[0-9][0-9]*$' "$1" 2>/dev/null )" -eq 0 || { echo "Sorry, bad data in $1" 1>&2 ; exit 3 ; }
# setting characters
## 2. optional. Plotting character (default is Dollar sign)
## /!\ for blank color use "\033[48;5;21m \033[0m" or you'll mess...
c_dot="$2"
: ${c_dot:='$'}
## 3. optional. Separator character (default is Dash sign)
## /!\ as Space is not tested there will be extra characters...
c_sep="$3"
: ${c_sep:='-'}
# init...
len_w=$(wc -l "$1" | cut -d ' ' -f 1 )
l_sep=''
while test "$len_w" -gt 0
do
        l_sep="${l_sep}${c_sep}";
        len_w=$(($len_w-1))
done
unset len_w
# part1: chart
echo ".${c_sep}${l_sep}${c_sep}."
len_h=$(sort -n "$1" | tail -n 1)
nbr_d=${#len_h}
while test "$len_h" -gt 0
do
        printf '| '
        for a_val in $(cat "$1")
        do
                test "$a_val" -ge "$len_h" && printf "$c_dot" || printf ' '
        done
        echo ' |'
        len_h=$(($len_h-1))
done
unset len_h
# part2: legend
echo "|${c_sep}${l_sep}${c_sep}|"
while test "$nbr_d" -gt 0
do
        printf '| '
        for a_val in $(cat "$1")
        do
                printf "%1s" $(echo "$a_val" | cut -c "$nbr_d")
        done
        echo ' |'
        nbr_d=$(($nbr_d-1))
done
unset nbr_d
# end
echo "'${c_sep}${l_sep}${c_sep}'"
unset c_sep
exit 0

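Edit 1: Here is a rework of the script. It corrects the separator handling (just try ' |' as the third argument to see), but it may look less readable because I use parameter numbers instead of additional variables.

Edit 2: It also handles negative integers... and you can change the ground level (fifth parameter).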

#!/bin/sh
#
## building a vertical bar graph of data file
## https://stackoverflow.com/q/30929012
##
## 1. required. Data file with one value per line and nothing else!
## /!\ provide the (relative or absolute) file path, not file content
: ${1:?" Please provide a file name"}
[ -e "$1" ] || { echo "Sorry, can't find $1" 1>&2 ; exit 2 ; }
[ -r "$1" ] || { echo "Sorry, can't access $1" 1>&2 ; exit 2 ; }
[ -f "$1" ] || { echo "Sorry, bad format file $1" 1>&2 ; exit 2 ; }
## reject the file if any line is not a plain (possibly negative) integer
[ "$( grep -cv '^[-0-9][0-9]*$' "$1" 2>/dev/null )" -eq 0 ] || { echo "Sorry, bad data in $1" 1>&2 ; exit 3 ; }
## /!\ following parameters should result to a single character
## /!\ for blank color use "\033[48;5;21m \033[0m" or you'll mess...
## 2. optional. Plotting character (default is Dollar sign)
## 3. optional. Horizontal border character (default is Dash sign)
## 4. optional. Columns separator character (default is Pipe sign)
## (!) however, when no arg provided the graph is just framed in a table
## 5. optional. Ground level integer value (default is Zero)
test "${5:-0}" -eq "${5:-0}" 2>/dev/null || { echo "oops, bad parameter $5" 1>&2 ; exit 3 ; }
# init...
_long=$(wc -l < "$1" ) # width : number of data/lines in file
if [ -n "$4" ]
then
        _long=$((_long*2-3))
fi
_line=''
while [ "$_long" -gt 0 ]
do
        _line="${_line}${3:--}"
        _long=$((_long-1))
done
unset _long
_from=$(sort -n "$1" | tail -n 1 ) # max int
_stop=$(sort -n "$1" | head -n 1 ) # min int

This rework comes in two flavors. The first one produces output similar to the previous script.

# begin
echo "${4-.}${3:--}${_line}${3:--}${4-.}"
# upper/positive
if [ $_from -gt ${5:-0} ]
then
        while [ $_from -gt ${5:-0} ]
        do
                printf "${4:-| }"
                for _cint in $(cat "$1" )
                do
                        if [ $_cint -ge $_from ]
                        then
                                printf "${2:-$}$4"
                        else
                                printf " $4"
                        fi
                done
                echo " ${4:-|}"
                _from=$((_from-1))
        done
        echo "${4-|}${3:--}${_line}${3:--}${4-|}"
fi
unset _from
# center/legend
_long=$(wc -L < "$1" ) # height : number of characters in longest line...
while [ $_long -gt 0 ] # cut -c positions start at 1, so stop before 0
do
        printf "${4:-| }"
        for _cint in $(cat "$1" )
        do
                printf "%1s$4" $(echo "$_cint" | cut -c "$_long" )
        done
        echo " ${4:-|}"
        _long=$((_long-1))
done
unset _long
# lower/negative
if [ $_stop -lt ${5:-0} ]
then
        _from=${5:-0}
        echo "${4-|}${3:--}${_line}${3:--}${4-|}"
        while [ $_from -gt $_stop ]
        do
                printf "${4:-| }"
                for _cint in $(cat "$1" )
                do
                        if [ $_cint -lt $_from ]
                        then
                                printf "${2:-$}$4"
                        else
                                printf " $4"
                        fi
                done
                echo " ${4:-|}"
                _from=$((_from-1))
        done
fi
unset _stop
# end
echo "${4-'}${3:--}${_line}${3:--}${4-'}"
exit 0

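Note: two checks are needed to avoid extra loops when all values are positive (above the ground) or negative (below the ground)! Well, maybe I should always put the "center/legend" part at the bottom? It looks a bit ugly when there are both positive and negative values, and when there are only negative integers it looks strange that the labels do not read in the opposite direction and carry an unpleasant minus sign. Also note that wc -L is not POSIX... so another loop may be needed.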
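As one possible way around the non-POSIX `wc -L`, the longest line length could be computed with awk instead of an extra shell loop. This is only a sketch: the helper name `longest_line` is hypothetical, and it counts characters rather than display columns, which is equivalent for files that contain only digits and minus signs.

```shell
#!/bin/sh
# Hypothetical helper: length of the longest line in the file given as $1,
# computed with awk instead of the GNU-specific `wc -L`.
longest_line() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }' "$1"
}

# Possible use in the script above (not executed here):
#   _long=$(longest_line "$1")
```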

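Here is another variant, with the legend numbers on the right side instead of at the bottom. Doing so saves an extra loop, but I do not really like the output (I prefer the values on the left rather than on the right, but that is a matter of taste, isn't it?)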

# begin
printf "${4-.}${3:--}${_line}${3:--}${4-.}"
# upper/positive
if [ $_from -gt ${5:-0} ]
then
        echo ""
        while [ $_from -gt ${5:-0} ]
        do
                _ctxt=''
                printf "${4:-| }"
                for _cint in $(cat "$1" )
                do
                        if [ $_cint -ge $_from ]
                        then
                                printf "${2:-$}$4"
                        else
                                printf " $4"
                        fi
                        if [ $_cint -eq $_from ]
                        then
                                _ctxt="_ $_from"
                        fi
                done
                echo " ${4:-|}${_ctxt}"
                _from=$((_from-1))
        done
        _from=$((_from+1))
else
        echo "_ ${1}"
fi
# center/ground
if [ $_stop -lt ${5:-0} ] && [ $_from -gt ${5:-0} ]
then
        echo "${4-|}${3:--}${_line}${3:--}${4-|}_ ${1}"
fi
# lower/negative
if [ $_stop -lt ${5:-0} ]
then
        _from=${5:-0}
        while [ $_from -gt $_stop ]
        do
                _ctxt=''
                printf "${4:-| }"
                for _cint in $(cat "$1" )
                do
                        if [ $_cint -lt $_from ]
                        then
                                printf "${2:-$}$4"
                        else
                                printf " $4"
                        fi
                        if [ $_cint -eq $((_from-1)) ]
                        then
                                _ctxt="_ $_cint"
                        fi
                done
                echo " ${4:-|}${_ctxt}"
                _from=$((_from-1))
        done
fi
# end
unset _from
printf "${4-'}${3:--}${_line}${3:--}${4-'}"
if [ $_stop -lt ${5:-0} ]
then
        echo ""
else
        echo "_ ${1}"
fi
unset _stop
exit 0

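Edit 3: There are some extra checks so that no extra ground line is added when there are only positive or only negative numbers.

Finally, I think the final solution is a mix of both, where the values are displayed on the side and the positions of the values in the center. It is then closer to GNU Plot's output.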

# init...
_long=$(wc -l < "$1" )
if [ -n "$4" ]
then
        _long=$((_long*2-3))
fi
_line=''
while [ $_long -gt 0 ]
do
        _line="${_line}${3:--}"
        _long=$((_long-1))
done
unset _long
_from=$(sort -n "$1" | tail -n 1 ) # max int
_stop=$(sort -n "$1" | head -n 1 ) # min int
# begin
echo "${4-.}${3:--}${_line}${3:--}${4-.}"
# upper/positive
if [ $_from -gt ${5:-0} ]
then
        while [ $_from -gt ${5:-0} ]
        do
                _ctxt=''
                printf "${4:-| }"
                for _cint in $(cat "$1" )
                do
                        if [ $_cint -ge $_from ]
                        then
                                printf "${2:-$}$4"
                        else
                                printf " $4"
                        fi
                        if [ $_cint -eq $_from ]
                        then
                                _ctxt="_ $_from"
                        fi
                done
                echo " ${4:-|}$_ctxt"
                _from=$((_from-1))
        done
        echo "${4-|}${3:--}${_line}${3:--}${4-|}"
fi
# center/ground
_size=$(wc -l < "$1" ) # width : number of data/lines in file
##_long=${#_size} # height : number of chararcters in long
#_long=1
##while [ $_long -gt 0 ]
#while [ $_long -le ${#_size} ]
#do
       #_rank=1
       #printf "${4:-| }"
       #while [ $_rank -le $_size ]
       #do
               #printf "%1s$4" $( printf "%0${#_size}d" $_rank  | cut -c $_long )
               #_rank=$((_rank+1))
       #done
       #printf " ${4:-|}"
       ##_long=$((_long-1))
       #_long=$((_long+1))
       ##if [ $_long -eq 0 ]
       #if [ $_long -eq ${#_size} ]
       #then
               #printf "_ ${1}"
       #fi
       #echo ''
#done
_rank=1
printf "${4:-| }"
while [ $_rank -le $_size ]
do
        printf "%1s$4" $( expr "$_rank" : '.*\(.\)$' )
        _rank=$((_rank+1))
done
echo " ${4:-|}_ ${1}"
# lower/negative
if [ $_stop -lt ${5:-0} ]
then
        echo "${4-|}${3:--}${_line}${3:--}${4-|}"
        while [ $_from -gt $_stop ]
        do
                _ctxt=''
                printf "${4:-| }"
                for _cint in $(cat "$1" )
                do
                        if [ $_cint -lt $_from ]
                        then
                                printf "${2:-$}${4}"
                        else
                                printf " $4"
                        fi
                        if [ $_cint -eq $((_from-1)) ]
                        then
                                _ctxt="_ $_cint"
                        fi
                done
                echo " ${4:-|}$_ctxt"
                _from=$((_from-1))
        done
fi
unset _from
unset _stop
# end
echo "${4-'}${3:--}${_line}${3:--}${4-'}"
exit 0

One last improvement would be the ability to scale...
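Such scaling is not implemented in the scripts above. A minimal sketch of how it might work, assuming non-negative integers and a hypothetical helper named `scale`, is to map each value onto a fixed number of rows with rounded integer arithmetic:

```shell
#!/bin/sh
# Hypothetical helper: scale a value to a chart of limited height.
#   $1 = value, $2 = maximum value in the data, $3 = chart height in rows
# Adding half of the divisor before dividing rounds to the nearest row.
scale() {
    echo $(( ($1 * $3 + $2 / 2) / $2 ))
}

# Example: squeeze the sample data (maximum 11) into 5 rows.
for v in 3 4 4 5 6 7 8 8 9 11; do
    scale "$v" 11 5
done
```

The scaled numbers could then be fed to the drawing loops in place of the raw values, while the legend keeps showing the original ones.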

Answer 1 (score: 2)

You could install a dedicated tool for plotting... I only know GNU-Plot, but it is big (it has many dependencies, including ImageMagick)

root@localhost: # apt-get install gnuplot-nox
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  fontconfig fontconfig-config fonts-droid fonts-liberation ghostscript groff
  gsfonts hicolor-icon-theme imagemagick imagemagick-common libcairo2 libcroco3
  libcupsimage2 libdatrie1 libdjvulibre-text libdjvulibre21 libexiv2-12 libffi5
  libfontconfig1 libgd2-noxpm libgdk-pixbuf2.0-0 libgdk-pixbuf2.0-common
  libglib2.0-0 libglib2.0-data libgs9 libgs9-common libice6 libijs-0.35
  libilmbase6 libjasper1 libjbig0 libjbig2dec0 libjpeg8 liblcms1 liblcms2-2
  liblensfun-data liblensfun0 liblqr-1-0 libltdl7 liblua5.1-0 libmagickcore5
  libmagickcore5-extra libmagickwand5 libnetpbm10 libopenexr6 libpango1.0-0
  libpaper-utils libpaper1 libpixman-1-0 libpng12-0 librsvg2-2 librsvg2-common
  libsm6 libthai-data libthai0 libtiff4 libwmf0.2-7 libxaw7 libxcb-render0
  libxcb-shm0 libxft2 libxmu6 libxpm4 libxrender1 libxt6 netpbm poppler-data
  psutils shared-mime-info ttf-dejavu-core ufraw-batch x11-common
Suggested packages:
  ghostscript-cups ghostscript-x hpijs gnuplot-doc imagemagick-doc autotrace
  cups-bsd lpr lprng curl enscript ffmpeg gimp gnuplot grads hp2xx html2ps
  libwmf-bin mplayer povray radiance sane-utils texlive-base-bin transfig
  xdg-utils exiv2 libgd-tools libjasper-runtime liblcms-utils liblcms2-utils
  ttf-baekmuk ttf-arphic-gbsn00lp ttf-arphic-bsmi00lp ttf-arphic-gkai00mp
  ttf-arphic-bkai00mp librsvg2-bin poppler-utils fonts-japanese-mincho
  fonts-ipafont-mincho fonts-japanese-gothic fonts-ipafont-gothic
  fonts-arphic-ukai fonts-arphic-uming fonts-unfonts-core ufraw
The following NEW packages will be installed:
  fontconfig fontconfig-config fonts-droid fonts-liberation ghostscript
  gnuplot-nox groff gsfonts hicolor-icon-theme imagemagick imagemagick-common
  libcairo2 libcroco3 libcupsimage2 libdatrie1 libdjvulibre-text libdjvulibre21
  libexiv2-12 libffi5 libfontconfig1 libgd2-noxpm libgdk-pixbuf2.0-0
  libgdk-pixbuf2.0-common libglib2.0-0 libglib2.0-data libgs9 libgs9-common
  libice6 libijs-0.35 libilmbase6 libjasper1 libjbig0 libjbig2dec0 libjpeg8
  liblcms1 liblcms2-2 liblensfun-data liblensfun0 liblqr-1-0 libltdl7 liblua5.1-0
  libmagickcore5 libmagickcore5-extra libmagickwand5 libnetpbm10 libopenexr6
  libpango1.0-0 libpaper-utils libpaper1 libpixman-1-0 libpng12-0 librsvg2-2
  librsvg2-common libsm6 libthai-data libthai0 libtiff4 libwmf0.2-7 libxaw7
  libxcb-render0 libxcb-shm0 libxft2 libxmu6 libxpm4 libxrender1 libxt6 netpbm
  poppler-data psutils shared-mime-info ttf-dejavu-core ufraw-batch x11-common
0 upgraded, 73 newly installed, 0 to remove and 0 not upgraded.
Need to get 38.3 MB of archives.
After this operation, 111 MB of additional disk space will be used.
Do you want to continue [Y/n]?

(Asking for "gnuplot" offers the same packages plus "gnuplot-nox".) Once it is installed, I test it interactively with a data file named data:

Terminal type set to 'unknown'
gnuplot>  set term dumb
Terminal type set to 'dumb'
Options are 'feed size 79, 24'
gnuplot> plot data
undefined variable: data
gnuplot> plot "data"
gnuplot> # nice ASCII-chart not shown here, curve of "A"
gnuplot> set style data histogram
gnuplot> plot "data"
gnuplot> # nice ASCII-chart not shown here, bars of "**"
gnuplot> quit

For a quick start, look at a gnuplot introduction, then the lowrank.net page for usage and the histograms demo, plus a guide for scripting the beast. For example:

#!/bin/sh
#
_max=$(sort -n "$1" | tail -n 1)
_min=$(sort -n "$1" | head -n 1)
_nbr=$(wc -l < "$1" )
gnuplot << EOF
set terminal dumb ${COLUMNS:-$(tput cols)},${LINES:-$(tput lines)}
set style data histogram
set xrange[-1:$((_nbr+1))]
set yrange[$((_min-3)):$((_max+2))]
set xlabel "something to display below horizontal axis"
plot "$1" title "a nice title in the corner instead of filename"
EOF

Just change the options to output a real graphic into a file. When there are only a few options to set:

#!/bin/sh
#
gnuplot -e "set terminal dumb; set style data histogram; plot '$1' "

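If there is only a little data to plot, this may look like overkill compared to a shell script. But using such a tool becomes really useful for large amounts of data (it runs faster) or when there are variations (GNUPlot skips blank lines, plots positive and negative integers and decimals, works with multi-field files, and merges many files or columns in the same chart).
Finally, there is a Ruby wrapper to pipe data into, and a Perl front-end that may look easier: eplot

Edit: by default, it uses the screen size minus margins...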

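Answer 2 (score: 0)

Loop idea 1: Search for the highest number and remember the number of lines. Start with value = highest number and check all values with -ge ${value}. Those are filled, the others are spaces. For the next line use (( value = value - 1 )). This is not efficient: you go through the parsed data many times.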
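Loop idea 1 could be sketched roughly as follows. The function name `vchart`, the `x` plotting character, and passing the values as arguments rather than reading them from a file are all assumptions made for the sake of a short example.

```shell
#!/bin/sh
# Sketch of loop idea 1 (hypothetical helper): one output row per level,
# from the highest value down to 1, re-scanning every value for each row.
vchart() {
    # search for the highest number
    max=0
    for v in "$@"; do
        [ "$v" -gt "$max" ] && max=$v
    done
    value=$max
    while [ "$value" -gt 0 ]; do
        row=''
        for v in "$@"; do
            if [ "$v" -ge "$value" ]; then
                row="${row}x"      # this column reaches the current level
            else
                row="${row} "      # this column is too short: leave a gap
            fi
        done
        printf '%s\n' "$row"
        value=$((value - 1))
    done
}

vchart 3 4 4 5 6 7 8 8 9 11
```

The inner loop runs once per row, which is the inefficiency mentioned above: the cost is the number of values times the highest value.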

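Loop idea 2: Create strings like "xxx" and "xxxxxxxx" from the data (and remember the maximum). You now have your chart horizontally. Use printf formatting to pad each line with spaces (you get a file in which all lines have the same length). Then search for a way to rotate the file.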
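A rough sketch of loop idea 2 follows, assuming positive integers passed as arguments. The rotation step uses awk, which is one possible choice and not something this answer prescribes; the function name `hrotate` is hypothetical.

```shell
#!/bin/sh
# Sketch of loop idea 2 (hypothetical helper): build padded horizontal
# bars, then rotate them so that each bar becomes a column.
hrotate() {
    max=0
    for v in "$@"; do
        [ "$v" -gt "$max" ] && max=$v
    done
    for v in "$@"; do
        # a bar of v "x" characters ...
        bar=$(printf "%${v}s" '' | tr ' ' 'x')
        # ... padded with spaces so that all lines have the same length
        printf "%-${max}s\n" "$bar"
    done |
    awk -v w="$max" '
        { line[NR] = $0 }                  # remember every padded bar
        END {
            for (r = w; r >= 1; r--) {     # last character becomes top row
                out = ""
                for (i = 1; i <= NR; i++)
                    out = out substr(line[i], r, 1)
                print out
            }
        }'
}

hrotate 3 4 4 5 6 7 8 8 9 11
```

Here the data is scanned only once by the shell; the repeated work moves into awk, which holds the padded lines in memory.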

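Loop idea 3: When you have k values with m as the highest value, first create a file with m lines of k consecutive numbers (1 2 3 ...), ending each line with a space. Loop through the values and replace the numbers with "x" on the right lines. For value 6 in line 3 this would be something like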

(( replace_from = m - 5 ))
sed -i ${replace_from}',$ s/ 6 / x /g' myfile

After the loop, replace all remaining numbers with spaces.
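Put together, loop idea 3 might look like the sketch below. Several details are assumptions: the function name `xchart` is hypothetical, the values are positive integers passed as arguments, a temporary file replaces `sed -i` (which is not portable), and each position number is replaced by as many "x" characters as it has digits so that the columns stay aligned.

```shell
#!/bin/sh
# Sketch of loop idea 3 (hypothetical helper).
xchart() {
    k=$#                                   # number of values
    m=0                                    # highest value
    for v in "$@"; do
        [ "$v" -gt "$m" ] && m=$v
    done
    f=$(mktemp) || return 1
    # template line " 1 2 3 ... k " (leading space so " 1 " can match)
    line=' '
    i=1
    while [ "$i" -le "$k" ]; do
        line="${line}${i} "
        i=$((i + 1))
    done
    # the file holds m copies of the template line
    r=1
    while [ "$r" -le "$m" ]; do
        printf '%s\n' "$line"
        r=$((r + 1))
    done > "$f"
    # value number i fills position i on the bottom v lines
    i=1
    for v in "$@"; do
        from=$((m - v + 1))
        rep=$(echo "$i" | sed 's/./x/g')   # same width as the number
        sed "${from},\$ s/ ${i} / ${rep} /" "$f" > "$f.new" && mv "$f.new" "$f"
        i=$((i + 1))
    done
    # after the loop, blank out every digit that is left
    sed 's/[0-9]/ /g' "$f"
    rm -f "$f" "$f.new"
}

xchart 3 4 4 5 6 7 8 8 9 11
```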

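Answer 3 (score: 0)

There are many graphing libraries for Python. StackOverflow:915940 lists most of them. However, I think they are X-oriented... But some people have written alternative solutions for the console/terminal. I found two:
- pysparklines and its elder spark use Unicode characters to draw a small histogram of the input (so it is nice but limited, since it is a single line, and it renders better with UTF-8 under X)
- bashplotlib, despite its name, is a pythonic collection that can draw histograms and scatter plots in a plain terminal/console, in a way similar to Gnu Plot.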

Answer 4 (score: 0)

It can be made simpler...

#!/bin/bash

# Assume the data is in a textfile "/tmp/testdata.txt". One value per line.

MAXVALUE=$( sort -nr /tmp/testdata.txt | head -n1 )     # get the highest value
CHARTHEIGHT=$MAXVALUE                                   # use it for height of chart 
                                                        # (can be other value)
CHARTLINE=$CHARTHEIGHT                                  # variable used to parse the lines

while [ $CHARTLINE -gt 0 ]; do                          # start the first line
    (( CHARTLINE-- ))                                    
    REDUCTION=$(( $MAXVALUE*$CHARTLINE/$CHARTHEIGHT ))  # subtract this from the VALUE
    while read VALUE; do
        VALUE=$(( $VALUE-$REDUCTION ))
        if [ $VALUE -le 0 ]; then                       # check new VALUE
             echo -n "    " 
        else
             echo -n "▓▓▓ "
        fi
    done < /tmp/testdata.txt
    echo
done
echo
exit

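This script parses the data once for every output line, reduces each value read, and checks whether anything is left. If so, it displays a block; if not, it displays spaces. This is repeated for every line with a different REDUCTION value. The script can be extended to include a legend, colors, half/quarter blocks, etc.

Apart from sort and head, everything is done with Bash built-ins.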