对数组进行混洗,使得没有两个相同的元素相邻

时间:2016-10-20 10:32:11

标签: bash

我有一个包含字符串的数组。其中几个字符串可以是相同的,也没关系。它们可以按任何顺序开始,但很可能是按字母顺序排列。我有以下shuffle函数,它将洗牌所有元素。但是,我想添加一个条件,即数组中没有两个相同的字符串可以相邻。

例如,这很好:ook eek ook monkey ook但不是:ook ook eek ook monkey因为两个ook相邻。假设已经检查了输入,使得任何重复小于元素总数的一半,因此存在一组非相邻解。例如,ook ook ook eek将被拒绝。字符串可以包含空格和UTF-8字符,但不包含新行 - 字符串实际上是图像的文件名。

如何修改shuffle功能以实现此目标?

或者有更好的方法吗?

shuffle() {
    # This function shuffles the elements of an array in-place using the
    # Knuth-Fisher-Yates shuffle algorithm. 
    local i tmp size max rand

    # $RANDOM % (i+1) is biased because of the limited range of $RANDOM
    # Compensate by using a range which is a multiple of the array size.
    size=${#array[*]}
    max=$(( 32768 / size * size ))
    for ((i=size-1; i>0; i--)); do
        while (( (rand=$RANDOM) >= max )); do :; done
            rand=$(( rand % (i+1) ))
            tmp=${array[i]} array[i]=${array[rand]} array[rand]=$tmp
    done
}

1 个答案:

答案 0 :(得分:3)

Given the rejection pre-condition, it is possible to split the word list into several «equivalence classes» (ECs) — special word groups in each of which words are the same by whatever criterion. Rejection implies that there is no more than one EC that is partly in one half of the list and partly in the other.

We lay a part of this EC aside, so that (1) the remaining part is contained in no more than one of the remaining halves of the list, and (2) the halves are strictly equal in size. Then we shuffle the halves, each one separately. After that, we merge them, the first half occupying odd elements, while evens are for the second half. Then we randomly insert the remaining elements previously laid-aside. It is quite simple, since all of them belong to one EC and thus it is easy to mark places where they can be and where they cannot.

By construction, there can be no two adjacent elements belonging to one EC.

[EDIT:] Finally, the implementation of what`s above.

shuffle() {
    local sort="$(sort <<<"$1" | sed "s/^/+/g")"
    local size="$(grep -c ^ <<<"$sort")"
    local take cntr head tail

    if [ "$sort" == "${sort%%$'\n'*}" ]; then
        # single line: nothing to shuffle
        echo "${sort#+}"
        return
    fi
    if [ $[size & 1] == 1 ]; then
        # enforcing equal halves, beginning to extract the middle
        take="$(head -$[size / 2 + 1] <<<"$sort" | tail -1)"
    fi
    cntr="$take"
    size=$[size / 2]
    head="$(head -$size <<<"$sort")"
    tail="$(tail -$size <<<"$sort")"
    while [ "$(head -1 <<<"$tail")" == "$(tail -1 <<<"$head")" ]; do
        # continue extracting the middle, if still left
        if [ -n "$cntr" -a "$cntr" != "$(tail -1 <<<"$head")" ]; then
            break
        else
            cntr="$(tail -1 <<<"$head")"
        fi
        ((--size))
        head="$(head -$size <<<"$head")"
        tail="$(tail -$size <<<"$tail")"
        take="${take:+$take$'\n'}$cntr"$'\n'"$cntr"
    done
    sort=()
    for cntr in $(seq $size); do
        # transforming two line sets into a single interlaced array
        sort[$[cntr * 4 - 3]]="$(head -$cntr <<<"$head" | tail -1)"
        sort[$[cntr * 4 - 1]]="$(head -$cntr <<<"$tail" | tail -1)"
    done
    for cntr in $(seq $[size - 1]); do
        # shuffling each of the interlaced halves by Fisher-Yates
        head="${sort[$[cntr * 4 - 3]]}"
        tail=$[RANDOM % (size - cntr + 1) + cntr]
        sort[$[cntr * 4 - 3]]="${sort[$[tail * 4 - 3]]}"
        sort[$[tail * 4 - 3]]="$head"
        head="${sort[$[cntr * 4 - 1]]}"
        tail=$[RANDOM % (size - cntr + 1) + cntr]
        sort[$[cntr * 4 - 1]]="${sort[$[tail * 4 - 1]]}"
        sort[$[tail * 4 - 1]]="$head"
    done
    if [ -n "$take" ]; then
        # got a remainder; inserting
        tail=($(seq 0 $[size * 2]))
        for cntr in $(seq $[size * 2]); do
            # discarding potential places with remainder in proximity
            if [ "${sort[$[cntr * 2 - 1]]}" \
              == "${take%%$'\n'*}" ]; then
                tail[$[cntr - 1]]=""
                tail[$[cntr]]=""
            fi
        done
        tail=(${tail[@]})
        for cntr in $(seq 0 $[${#tail[@]} - 2]); do
            # shuffling the remaining places, also by Fisher-Yates
            head="${tail[$cntr]}"
            size=$[RANDOM % (${#tail[@]} - cntr) + cntr]
            tail[$cntr]="${tail[$size]}"
            tail[$size]="$head"
        done
        size="$(grep -c ^ <<<"$take")"
        while ((size--)); do
            # finally inserting remainders
            sort[$[${tail[$size]} * 2]]="${take%%$'\n'*}"
        done
    fi
    head=0
    size="${#sort[@]}"
    while ((size)); do
        # printing the overall result
        if [ -n "${sort[$head]}" ]; then
            echo "${sort[$head]#+}"
            ((size--))
        fi
        ((head++))
    done
}

# a very simple test from the original post
shuffle \
"ook
ook
eek
ook
monkey"