Sorting data with gnuplot

Time: 2019-01-11 09:51:36

Tags: sorting gnuplot columnsorting

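Occasionally you may need to sort data. Unfortunately, as far as I know, gnuplot does not offer this. Of course, you can use external tools such as awk, Perl, or Python. However, for maximum platform independence, to avoid installing additional programs and the complications that come with them, and also out of curiosity, I was interested in whether gnuplot itself can do some kind of sorting. I would appreciate comments on improvements and limitations.

Does anyone know how to sort alphanumeric data using only gnuplot?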

### Sorting with gnuplot
reset session

# generate some random example data
N = 10
set samples N
RandomNo(n) = sprintf("%.02f",rand(0)*n)
set table $Data
    plot '+' u (RandomNo(10)):(RandomNo(10)):(RandomNo(10)) w table
unset table
print $Data

# Settings for sorting
ColNo = 2   # ColumnNo for sorting
stats $Data nooutput      # get the number of rows if data is from file
RowCount = STATS_records  # with the example data above, of course RowCount=N

# create the sortkey and put it into an array
array SortKey[RowCount]
set table $Dummy
    plot $Data u (SortKey[$0+1] = sprintf("%.06f%02d",column(ColNo),$0+1)) w table
unset table
# print $Dummy

# get lines as whole into array
set datafile separator "\n"
array DataSeq[RowCount]
set table $Dummy2
    plot $Data u (SortKey[$0+1]):(DataSeq[$0+1] = stringcolumn(1)) with table
unset table
print $Dummy2
set datafile separator whitespace

# do the actual sorting with 'smooth unique'
set table $Dummy3
    plot $Dummy2 u 1:0 smooth unique
unset table
# print $Dummy3

# extract the sorted sortkeys
set table $Dummy4
    plot $Dummy3 u (SortKey[$0+1]=$2) with table
unset table
# print $Dummy4

# create the table with sorted lines
set table $DataSorted
    plot $Data u (DataSeq[SortKey[$0+1]+1]) with table
unset table
print $DataSorted
### end of code
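  • First datablock: the unsorted data
  • Second datablock: the intermediate data with the sort key prepended
  • Third datablock: the data sorted by the second column

Output: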

 5.24    6.68    3.09   
 1.64    1.27    9.82   
 6.44    9.23    7.03   
 8.14    8.87    3.82   
 4.27    5.98    0.93   
 7.96    3.64    6.15   
 6.21    6.28    6.17   
 1.52    3.17    3.58   
 4.24    2.16    8.99   
 8.73    6.54    1.13   

 6.68000001      5.24    6.68    3.09
 1.27000002      1.64    1.27    9.82
 9.23000003      6.44    9.23    7.03
 8.87000004      8.14    8.87    3.82
 5.98000005      4.27    5.98    0.93
 3.64000006      7.96    3.64    6.15
 6.28000007      6.21    6.28    6.17
 3.17000008      1.52    3.17    3.58
 2.16000009      4.24    2.16    8.99
 6.54000010      8.73    6.54    1.13

 1.64    1.27    9.82
 4.24    2.16    8.99
 1.52    3.17    3.58
 7.96    3.64    6.15
 4.27    5.98    0.93
 6.21    6.28    6.17
 8.73    6.54    1.13
 5.24    6.68    3.09
 8.14    8.87    3.82
 6.44    9.23    7.03 
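
The sort key in the intermediate block combines the value of the sort column with the row number. A minimal sketch of that construction follows. `MakeKey` is a hypothetical helper name that does not appear in the script above, and the two-digit suffix assumes no more than 99 rows and non-negative values.

```gnuplot
# Hypothetical helper (not part of the original script):
# value printed with 6 decimals, followed by a 2-digit row number
MakeKey(v, r) = sprintf("%.06f%02d", v, r)

print MakeKey(6.68, 1)    # 6.68000001
print MakeKey(6.68, 2)    # 6.68000002, equal values still give distinct keys
```

The suffix appears to matter because `smooth unique` replaces points that share the same x value with a single averaged point. Distinct keys keep rows with equal values in the sort column apart.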

1 Answer:

Answer 0 (score: 0)

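Out of curiosity, I wanted to know whether alphanumeric sorting can be implemented with gnuplot code only. This avoids the need for external tools and ensures maximum platform compatibility. I have not yet heard of an external tool that can assist gnuplot and runs under Windows, Linux, and macOS alike. I would welcome comments and suggestions regarding bugs, simplifications, improvements, performance comparisons, and limitations.

The first step towards alphanumeric sorting is alphanumeric string comparison, which, as far as I know, does not exist directly in gnuplot. The first part, Compare.plt, is therefore about comparing strings.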
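
Before the full listing, the underlying idea can be sketched in a few lines. gnuplot's string operators `eq` and `ne` only test equality, so an ordering has to be derived from character positions in a reference string. `Alphabet` and `rank` are hypothetical names used only for this illustration, and the reference string is limited to uppercase letters.

```gnuplot
# eq / ne test equality only; they do not give an ordering
print "Tiger" eq "Tiger"    # 1
print "Tiger" ne "Zebra"    # 1

# The position in a reference string serves as a character rank
Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"    # illustration only, uppercase letters
rank(c) = strstrt(Alphabet, c)             # 0 if c is not in Alphabet
print rank("A"), rank("T")                 # 1 20
print sgn(rank("A") - rank("T"))           # -1, i.e. "A" sorts before "T"
```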

### compare function for strings 
# Compare.plt
# function cmp(a,b,cs) returns a<b:-1, a==b:0, a>b:+1
# cs=0: case-insensitive, cs=1: case-sensitive
reset session

ASCII =  ' !"' . "#$%&'()*+,-./0123456789:;<=>?@".\
         "ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_\`".\
         "abcdefghijklmnopqrstuvwxyz{|}~"

ord(c) = strstrt(ASCII,c)>0 ? strstrt(ASCII,c)+31 : 0

# comparing char: case-sensitive
cmpcharcs(c1,c2) = sgn(ord(c1)-ord(c2))

# comparing char: case-insensitive
cmpcharci(c1,c2) = sgn(( cmpcharci_o1=ord(c1), ((cmpcharci_o1>96) && (cmpcharci_o1<123)) ?\
    cmpcharci_o1-32 : cmpcharci_o1) - \
    ( cmpcharci_o2=ord(c2), ((cmpcharci_o2>96) && (cmpcharci_o2<123)) ?\
    cmpcharci_o2-32 : cmpcharci_o2) )

# function cmp returns a<b:-1, a==b:0, a>b:+1
# cs=0: case-insensitive, cs=1: case-sensitive
cmp(a,b,cs) = ((cmp_r=0, cmp_flag=0, cmp_maxlen=strlen(a)>strlen(b) ? strlen(a) : strlen(b)),\
    (sum[cmp_i=1:cmp_maxlen] \
      ((cmp_flag==0 && (cmp_c1 = substr(a,cmp_i,cmp_i), cmp_c2 = substr(b,cmp_i,cmp_i), \
        (cmp_r = (cs==0 ?  cmpcharci(cmp_c1,cmp_c2) : cmpcharcs(cmp_c1,cmp_c2) ) )!=0 ? \
        (cmp_flag=1, cmp_r) : 0)), 1 )), cmp_r)

cmpsymb(a,b,cs) = (cmpsymb_r = cmp(a,b,cs))<0 ? "<" : cmpsymb_r>0 ? ">" : "="
### end of code

Example:

### example compare strings
load "Compare.plt"

a="Alligator"
b="Tiger"
print sprintf("% 2d: % 9s% 2s% 6s", cmp(a,b,0), a, cmpsymb(a,b,0), b)

a="Tiger"
print sprintf("% 2d: % 9s% 2s% 6s", cmp(a,b,0), a, cmpsymb(a,b,0), b)

a="Zebra"
print sprintf("% 2d: % 9s% 2s% 6s", cmp(a,b,0), a, cmpsymb(a,b,0), b)
### end of code

Result:

-1: Alligator < Tiger
 0:     Tiger = Tiger
 1:     Zebra > Tiger
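
The effect of the third argument `cs` shows up when the strings differ in case. The short sketch below assumes that Compare.plt from above is in the working directory. Because that file starts with `reset session`, it is loaded before anything else is defined.

```gnuplot
load "Compare.plt"

# case-insensitive: "a" is treated like "A" (65), which is below "B" (66)
print cmp("apple", "Banana", 0)    # -1

# case-sensitive: "a" (97) is above "B" (66) in ASCII order
print cmp("apple", "Banana", 1)    # 1
```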

The second part uses this comparison for sorting.

### alpha-numerical sort with gnuplot
reset session
load "Compare.plt"

$Data <<EOD
1   0.123   Orange
2   0.456   Apple
3   0.789   Peach
4   0.987   Pineapple
5   0.654   Banana
6   0.321   Raspberry
7   0.111   Lemon
EOD

stats $Data u 0 nooutput
RowCount = STATS_records
ColSort = 3

array Key[RowCount]
array Index[RowCount]

set table $Dummy
    plot $Data u (Key[$0+1]=stringcolumn(ColSort),Index[$0+1]=$0+1) w table
unset table

# Bubblesort
do for [n=RowCount:2:-1] {
    do for [i=1:n-1] {
        if ( cmp(Key[i],Key[i+1],0) > 0) { 
            tmp=Key[i]; Key[i]=Key[i+1]; Key[i+1]=tmp
            tmp2=Index[i]; Index[i]=Index[i+1]; Index[i+1]=tmp2
        }
    }
}

set datafile separator "\n"
set table $Dummy    # and reuse Key-array
    plot $Data u (Key[$0+1]=stringcolumn(1)) with table
unset table
set datafile separator whitespace

set table $DataSorted
    plot $Data u (Key[Index[$0+1]]) with table
unset table

print $DataSorted
set grid xtics,ytics
plot [-0.5:RowCount-0.5][0:1.1] $DataSorted u 0:2:xtic(3) w lp lt 7 lc rgb "red"
### end of code

Input:

1   0.123   Orange
2   0.456   Apple
3   0.789   Peach
4   0.987   Pineapple
5   0.654   Banana
6   0.321   Raspberry
7   0.111   Lemon

Output:

 2      0.456   Apple   
 5      0.654   Banana  
 7      0.111   Lemon   
 1      0.123   Orange  
 3      0.789   Peach   
 4      0.987   Pineapple       
 6      0.321   Raspberry  
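
The same index-permutation approach should also work for a numeric column if the string comparison is replaced by `>`. The sketch below is self-contained and uses made-up values. It assumes a gnuplot version with array initializers, and the names `Key`, `Index`, and `tmp` simply mirror the script above.

```gnuplot
# Numeric variant of the bubble sort above (illustrative values)
N = 4
array Key[4]   = [0.789, 0.456, 0.987, 0.123]
array Index[4] = [1, 2, 3, 4]

do for [n=N:2:-1] {
    do for [i=1:n-1] {
        if (Key[i] > Key[i+1]) {
            # swap the keys and carry the original row numbers along
            tmp = Key[i];   Key[i]   = Key[i+1];   Key[i+1]   = tmp
            tmp = Index[i]; Index[i] = Index[i+1]; Index[i+1] = tmp
        }
    }
}

# Index now lists the original row numbers in ascending key order
print Index[1], Index[2], Index[3], Index[4]    # 4 2 1 3
```

Keeping a separate `Index` array means only the keys and row numbers are swapped, and the complete lines are looked up once at the end.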

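And the output plot:

(Image not preserved: column 2 plotted as red linespoints, with the alphabetically sorted names from column 3 as x tick labels.)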