我有以下shell脚本,它从命令行输入的文件中读取数据。该文件是一个数字矩阵,我需要按列分隔文件,然后对列进行排序。现在我可以读取文件并输出各列,但我对如何排序感到迷茫。我输入了一个排序语句,但它只排序第一列。
修改 我已经决定采用另一种方法并实际转置矩阵以将列转换为行。因为我必须稍后计算平均值和中位数,并且已经在脚本中早先成功地为行文件执行此操作 - 建议我尝试"旋转"如果要将列转换为行,则使用矩阵。
这是我的更新代码
declare -a col=( )
read -a line < "$1"
numCols=${#line[@]} # save number of columns
index=0
while read -a line ; do
for (( colCount=0; colCount<${#line[@]}; colCount++ )); do
col[$index]=${line[$colCount]}
((index++))
done
done < "$1"
for (( width = 0; width < numCols; width++ )); do
for (( colCount = width; colCount < ${#col[@]}; colCount += numCols ) ); do
printf "%s\t" ${col[$colCount]}
done
printf "\n"
done
这给了我以下输出:
1 9 6 3 3 6
1 3 7 6 4 4
1 4 8 8 2 4
1 5 9 9 1 7
1 5 7 1 4 7
虽然我现在正在寻找:
1 3 3 6 6 9
1 3 4 4 6 7
1 2 4 4 8 8
1 1 5 7 9 9
1 1 4 5 7 7
为了尝试对数据进行排序,我尝试了以下方法:
sortCol=${col[$colCount]}
eval col[$colCount]='($(sort <<<"${'$sortCol'[*]}"))'
另外:(这是我从行读入后对行进行排序的方式)
sortCol=( $(printf '%s\t' "${col[$colCount]}" | sort -n) )
如果您能对此提供任何见解,我们将不胜感激!
答案 0 :(得分:1)
注意,正如评论中所提到的,纯粹的bash解决方案并不漂亮。有很多方法可以做到,但这可能是最直接的。以下内容要求将每行的所有值读入数组,并保存矩阵stride
,以便将其转换为将所有列值读入行矩阵并进行排序。所有排序的列都插入到新的行矩阵a2
中。转置该行矩阵会以列排序顺序返回原始矩阵。
注意这适用于文件中任何列列的矩阵。
#!/bin/bash
test -z "$1" && { ## validate number of input
printf "insufficient input. usage: %s <filename>\n" "${0//*\//}"
exit 1;
}
test -r "$1" || { ## validate file was readable
printf "error: file not readable '%s'. usage: %s <filename>\n" "$1" "${0//*\//}"
exit 1;
}
## function: my sort integer array - accepts array and returns sorted array
## Usage: array=( "$(msia ${array[@]})" )
msia() {
local a=( "$@" )
local sz=${#a[@]}
local _tmp
[[ $sz -lt 2 ]] && { echo "Warning: array not passed to fxn 'msia'"; return 1; }
for((i=0;i<$sz;i++)); do
for((j=$((sz-1));j>i;j--)); do
[[ ${a[$i]} -gt ${a[$j]} ]] && {
_tmp=${a[$i]}
a[$i]=${a[$j]}
a[$j]=$_tmp
}
done
done
echo ${a[@]}
unset _tmp
unset sz
return 0
}
declare -a a1 ## declare arrays and matrix variables
declare -a a2
declare -i cnt=0
declare -i stride=0
declare -i sz=0
while read line; do ## read all lines into array
a1+=( $line );
(( cnt == 0 )) && stride=${#a1[@]} ## calculate matrix stride
(( cnt++ ))
done < "$1"
sz=${#a1[@]} ## calculate matrix size
## print original array
printf "\noriginal array:\n\n"
for ((i = 0; i < sz; i += stride)); do
for ((j = 0; j < stride; j++)); do
printf " %s" ${a1[i+j]}
done
printf "\n"
done
## sort columns from stride array
for ((j = 0; j < stride; j++)); do
for ((i = 0; i < sz; i += stride)); do
arow+=( ${a1[i+j]} )
done
a2+=( $(msia ${arow[@]}) ) ## create sorted array
unset arow
done
## print the sorted array
printf "\nsorted array:\n\n"
for ((j = 0; j < cnt; j++)); do
for ((i = 0; i < sz; i += cnt)); do
printf " %s" ${a2[i+j]}
done
printf "\n"
done
exit 0
<强>输出强>
$ bash sort_cols2.sh dat/matrix.txt
original array:
1 1 1 1 1
9 3 4 5 5
6 7 8 9 7
3 6 8 9 1
3 4 2 1 4
6 4 4 7 7
sorted array:
1 1 1 1 1
3 3 2 1 1
3 4 4 5 4
6 4 4 7 5
6 6 8 9 7
9 7 8 9 7
答案 1 :(得分:0)
Awk脚本
awk '
{for(i=1;i<=NF;i++)a[i]=a[i]" "$i} #Add to column array
END{
for(i=1;i<=NF;i++){
split(a[i],b) #Split column
x=asort(b) #sort column
for(j=1;j<=x;j++){ #loop through sort
d[j]=d[j](d[j]~/./?" ":"")b[j] #Recreate lines
}
}
for(i=1;i<=NR;i++)print d[i] #Print lines
}' file
1 1 1 1 1
3 3 2 1 1
3 4 4 5 4
6 4 4 7 5
6 6 8 9 7
9 7 8 9 7
答案 2 :(得分:0)
这是我参加这个小练习的内容。应该处理任意数量的列。我认为它们是空格分开的:
#!/bin/bash
linenumber=0
while read line; do
i=0
# Create an array for each column.
for number in $line; do
[ $linenumber == 0 ] && eval "array$i=()"
eval "array$i+=($number)"
(( i++ ))
done
(( linenumber++ ))
done <$1
IFS=$'\n'
# Sort each column
for j in $(seq 0 $i ); do
thisarray=array$j
eval array$j='($(sort <<<"${'$thisarray'[*]}"))'
done
# Print each array's 0'th entry, then 1, then 2, etc...
for k in $(seq 0 ${#array0[@]}); do
for j in $(seq 0 $i ); do
eval 'printf ${array'$j'['$k']}" "'
done
echo ""
done
答案 3 :(得分:0)
不是bash
,但我认为这个python
代码值得一看,展示如何使用内置函数实现此任务。
来自interpreter
:
$ cat matrix.txt
1 1 1 1 1
9 3 4 5 5
6 7 8 9 7
3 6 8 9 1
3 4 2 1 4
6 4 4 7 7
$ python
Python 2.7.3 (default, Jun 19 2012, 17:11:17)
[GCC 4.4.3] on hp-ux11
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> f = open('./matrix.txt')
>>> for row in zip(*[sorted(list(a))
for a in zip(*[a.split() for a in f.readlines()])]):
... print ' '.join(row)
...
1 1 1 1 1
3 3 2 1 1
3 4 4 5 4
6 4 4 7 5
6 6 8 9 7
9 7 8 9 7