How do I loop over pairs of values without repetition in bash?

Time: 2019-07-06 07:09:03

Tags: bash loops

I am using a particular program that requires me to examine pairs of variables in a text file by specifying each pair by index.

For example:

gcta  --reml-bivar 1 2 --grm test  --pheno test.phen  --out test

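where 1 and 2 correspond to the values in the first two columns of the text file. If I have 50 columns and want to examine every pair without repetition (1&2, 2&3, 1&3 ... 50), what is the best way to automate this with a loop? Essentially, the script would run the same command but take pairs of indices, for example: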

gcta  --reml-bivar 1 3 --grm test  --pheno test.phen  --out test
gcta  --reml-bivar 1 4 --grm test  --pheno test.phen  --out test

...and so on. Thanks!

5 answers:

Answer 0: (score: 0)

  

1 and 2 correspond to the values in the first two columns of the text file.

     

each pair without repetition

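So let's walk through the process step by step:

  1. Repeat the first column of the file N times, where N is the number of lines in the file.
  2. Take the second column of the file and repeat each value (each line) N times.
  3. Paste the two repeated columns together, which gives every combination.
  4. Filter out the "repetitions": compare the result with the original file and drop the lines that also appear there.
  5. What remains is every pair without repetition.
  6. Then simply read the result line by line.

The script: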

# create an input file cause you didn't provide any
cat << EOF > in.txt
1 a
2 b
3 c
4 d
EOF

# get file length
inlen=$(<in.txt wc -l)

# join the columns
paste -d' ' <(
  # repeat the first column inlen times
  # https://askubuntu.com/questions/521465/how-can-i-repeat-the-content-of-a-file-n-times
  seq "$inlen" |
  xargs -I{} cut -d' ' -f1 in.txt
) <(
  # repeat each line inlen times
  # https://unix.stackexchange.com/questions/81904/repeat-each-line-multiple-times
  awk -v v="$inlen" '{for(i=0;i<v;i++)print $2}' in.txt
) |
# filter out repetitions - ie. filter original lines from the file
# (comm -13 keeps only the lines that are unique to the second input)
sort |
comm -13 <(sort in.txt) - |
# read the file line by line
while read -r one two; do
  echo "$one" "$two"
done

This outputs:

1 b
1 c
1 d
2 a
2 c
2 d
3 a
3 b
3 d
4 a
4 b
4 c

Answer 1: (score: 0)

If I understand you correctly and you don't need pairs like "1 1", "2 2", ... or both "1 2" and "2 1", try the following script:

#!/bin/bash

for i in $(seq 1 49); do
    for j in $(seq "$((i + 1))" 50); do
        # pass the two indices as separate arguments
        gcta --reml-bivar "$i" "$j" --grm test --pheno test.phen --out test
    done
done
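
The same nested loop can also be written without seq, using bash arithmetic loops. The following is only a sketch under some assumptions: the helper name pairs, the small value n=4 (the question has 50 columns), and the per-pair --out name are all illustrative and not part of the original answer. The echo keeps it a dry run, so gcta is never actually invoked.

```shell
#!/bin/bash

# Print every unordered pair "i j" with 1 <= i < j <= n, one pair per line.
pairs() {
    local n=$1 i j
    for ((i = 1; i < n; i++)); do
        for ((j = i + 1; j <= n; j++)); do
            printf '%s %s\n' "$i" "$j"
        done
    done
}

n=4   # assumed number of columns for the demo; the question has 50

# Dry run: echo prints each command instead of executing it.
# The per-pair --out name is an assumption so that runs do not overwrite each other.
pairs "$n" | while read -r i j; do
    echo gcta --reml-bivar "$i" "$j" --grm test --pheno test.phen --out "test_${i}_${j}"
done
```

Separating the pair generation from the command makes it easy to inspect the list of pairs first, and the echo can be dropped once the printed commands look right.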

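Answer 2: (score: 0)

Since you haven't shown us any sample input we are just guessing, but if your input is a list of numbers (extracted from a file or otherwise), here is one approach: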

$ cat combinations.awk
###################
# Calculate all combinations of a set of strings, see
# https://rosettacode.org/wiki/Combinations#AWK
###################

function get_combs(A,B, i,n,comb) {
    ## Default value for r is to choose 2 from pool of all elements in A.
    ## Can alternatively be set on the command line:-
    ##    awk -v r=<number of items being chosen> -f <scriptname>
    n = length(A)
    if (r=="") r = 2

    comb = ""
    for (i=1; i <= r; i++) { ## First combination of items:
        indices[i] = i
        comb = (i>1 ? comb OFS : "") A[indices[i]]
    }
    B[comb]

    ## While 1st item is less than its maximum permitted value...
    while (indices[1] < n - r + 1) {
        ## loop backwards through all items in the previous
        ## combination of items until an item is found that is
        ## less than its maximum permitted value:
        for (i = r; i >= 1; i--) {
            ## If the equivalently positioned item in the
            ## previous combination of items is less than its
            ## maximum permitted value...
            if (indices[i] < n - r + i) {
                ## increment the current item by 1:
                indices[i]++
                ## Save the current position-index for use
                ## outside this "for" loop:
                p = i
                break
            }
        }
        ## Put consecutive numbers in the remainder of the array,
        ## counting up from position-index p.
        for (i = p + 1; i <= r; i++) indices[i] = indices[i - 1] + 1

        ## Print the current combination of items:
        comb = ""
        for (i=1; i <= r; i++) {
            comb = (i>1 ? comb OFS : "") A[indices[i]]
        }
        B[comb]
    }
}

# Input should be a list of strings
{
    split($0,A)
    delete B
    get_combs(A,B)
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (comb in B) {
        print comb
    }
}

$ awk -f combinations.awk <<< '1 2 3 4'
1 2
1 3
1 4
2 3
2 4
3 4

$ while read -r a b; do
    echo gcta  --reml-bivar "$a" "$b" --grm test  --pheno test.phen  --out test
done < <(awk -f combinations.awk <<< '1 2 3 4')
gcta --reml-bivar 1 2 --grm test --pheno test.phen --out test
gcta --reml-bivar 1 3 --grm test --pheno test.phen --out test
gcta --reml-bivar 1 4 --grm test --pheno test.phen --out test
gcta --reml-bivar 2 3 --grm test --pheno test.phen --out test
gcta --reml-bivar 2 4 --grm test --pheno test.phen --out test
gcta --reml-bivar 3 4 --grm test --pheno test.phen --out test

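Once you have finished testing and are happy with the output, remove the echo.

If anyone reading this wants permutations rather than combinations: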

$ cat permutations.awk
###################
# Calculate all permutations of a set of strings, see
# https://en.wikipedia.org/wiki/Heap%27s_algorithm

function get_perm(A,            i, lgth, sep, str) {
    lgth = length(A)
    for (i=1; i<=lgth; i++) {
        str = str sep A[i]
        sep = " "
    }
    return str
}

function swap(A, x, y,  tmp) {
    tmp  = A[x]
    A[x] = A[y]
    A[y] = tmp
}

function generate(n, A, B,      i) {
    if (n == 1) {
        B[get_perm(A)]
    }
    else {
        for (i=1; i <= n; i++) {
            generate(n - 1, A, B)
            if ((n%2) == 0) {
                swap(A, 1, n)
            }
            else {
                swap(A, i, n)
            }
        }
    }
}

function get_perms(A,B) {
    generate(length(A), A, B)
}

###################

# Input should be a list of strings
{
    split($0,A)
    delete B
    get_perms(A,B)
    PROCINFO["sorted_in"] = "@ind_str_asc"
    for (perm in B) {
        print perm
    }
}

$ awk -f permutations.awk <<< '1 2 3 4'
1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 2 3
1 4 3 2
2 1 3 4
2 1 4 3
2 3 1 4
2 3 4 1
2 4 1 3
2 4 3 1
3 1 2 4
3 1 4 2
3 2 1 4
3 2 4 1
3 4 1 2
3 4 2 1
4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1

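Both of the scripts above use GNU awk for sorted_in to sort the output. If you don't have GNU awk you can still use the scripts as-is and, if you need the output sorted, pipe it to sort.

Answer 3: (score: 0)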
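
One detail when piping index pairs to sort: plain sort compares lines as text, so "1 10" would come before "1 2". A minimal sketch that sorts numerically on both fields with the standard -k option (the wrapper name sort_pairs is only illustrative):

```shell
# Sort "i j" pairs numerically by the first field, then by the second.
sort_pairs() {
    sort -k1,1n -k2,2n
}

printf '%s\n' '2 3' '1 10' '1 2' | sort_pairs
```

With 50 columns the indices run into two digits, which is where the numeric keys start to matter.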

    #!/bin/bash

    #set the length of the combination depending on the
    #user's choice

    eval rg+=({1..$2})

    #the code builds the script and runs it (eval)

    eval `
    #Character range depending on user selection
    for i in ${rg[@]} ; do
    echo "for c$i in {1..$1} ;do " 
    done ;


    #The script starts from code that generates every
    #possible combination, duplicates included, so this
    #is where the condition that filters out repeated
    #items is built (the script writes the condition
    #code itself)


    op1=$2
    op2=$(( $2 - 1 ))
    echo -n "if [ 1 == 1 ] "

    while [ $op1 -gt 1 ]  ; do
    echo -n  \&\& [ '$c'$op1 != '$c'$op2 ]' '
    op2=$(( op2 - 1 ))
    if [ $op2 == 0 ] ; then  
            op1=$(( op1 - 1 ))
            op2=$(( op1 - 1 ))
    fi
    done ;

    echo  ' ; then'
    echo -n "echo "

    for i in ${rg[@]} ; 
    do
    echo -n '$c'$i
    done ;

    echo \;
    echo fi\;

    for i in ${rg[@]} ; do
    echo 'done ;'
    done;`

    example:               range       length
    $ ./combs.bash '{1..2} {a..c} \$ \#' 4
    12ab$
    12ab#
    12acb
    12ac$
    12ac#
    12a$b
    12a$c
    12a$#
    12a#b
    12a#c
    12a#$
    ..........
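
For comparison, a similar kind of output (ordered selections of k distinct items from a list) can be produced without eval by using recursion. This is a different technique from the generated-code approach above, shown only as a sketch; the function name arrangements and its argument order are hypothetical.

```shell
#!/bin/bash

# Print every ordered selection of K distinct items from the given list.
# Usage: arrangements K PREFIX ITEM...
arrangements() {
    local k=$1 prefix=$2
    shift 2
    if ((k == 0)); then
        printf '%s\n' "$prefix"
        return
    fi
    local i
    for ((i = 1; i <= $#; i++)); do
        # Append item i to the prefix and recurse on the remaining items.
        arrangements "$((k - 1))" "$prefix${!i}" "${@:1:i-1}" "${@:i+1}"
    done
}

arrangements 2 '' a b c
```

Items are distinguished by position, so each one is used at most once per output line, and no generated code has to be passed to eval.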

Answer 4: (score: 0)

sscanf