How to find the minimum number of operations to make a string balanced?

Time: 2019-02-02 09:48:30

Tags: algorithm dynamic-programming

From Codechef:

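A string is considered balanced if and only if all the characters occurring in it occur the same number of times.

You are given a string S; this string may only contain uppercase English letters. You may perform the following operation any number of times (including zero): choose one letter in S and replace it with another uppercase English letter. Note that even if the replaced letter occurs in S multiple times, only the chosen occurrence of this letter is replaced.

Find the minimum number of operations required to convert the given string to a balanced string.

Example:

For input: ABCB

Here, we can replace C with A to get ABAB, where each character of the string occurs 2 times.

So, the minimum number of operations = 1.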
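To make the definition concrete, here is a minimal Python sketch of a balance check. The helper name `is_balanced` is hypothetical (it is not part of the problem statement), and treating the empty string as balanced is an assumption.

```python
from collections import Counter


def is_balanced(s: str) -> bool:
    """Return True if every letter that occurs in s occurs equally often."""
    counts = Counter(s)
    # The set of counts has at most one element when all counts agree.
    # Assumption: the empty string (no counts at all) is treated as balanced.
    return len(set(counts.values())) <= 1


print(is_balanced("ABCB"))  # False: B occurs twice, A and C once
print(is_balanced("ABAB"))  # True: A and B each occur twice
```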

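How do I make the string good?

Can I apply dynamic programming to it?

5 answers:

Answer 0: (score: 12)

I don't think you really need dynamic programming here.

One approach in O(length(S)) time: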

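  • Iterate over S and build a frequency map (a mapping from the distinct letters A-Z to counts). For your ABCB example, that would be A->1 B->2 C->1 D->0 E->0 ... Z->0, which we can represent as the array [1, 2, 1, 0, 0, ..., 0].
    • We can do this because we don't care about the positions of the letters at all; there is no real difference between ABCB and ABBC, since each can be balanced by replacing its C with an A.
  • Sort the array. For your example, that gives [0, 0, ..., 0, 1, 1, 2].
    • We can do this because we don't care which letter is which; there is no real difference between ABCB and ABDB, since each can be balanced by replacing one of its singleton letters with the other one.
  • Assuming the string is nonempty (if it is empty, the answer is just 0), the final balanced string will contain at least 1 and at most 26 distinct letters. For each integer i between 1 and 26, determine how many operations are needed to produce a balanced string with i distinct letters:
    • First, check that length(S) is a multiple of i; if it is not, this i is impossible, so skip to the next integer.
    • Next, compute length(S) / i, the count of each distinct letter in the final balanced string.
    • To count how many operations are needed, we examine all the counts that need to be increased. (We could equivalently examine the counts that need to be decreased: the two totals have to match.)
    • We are only interested in the last i elements of the sorted array. Any element before that point represents a letter that will not occur in the balanced string: either its count is already zero and stays zero, or it is nonzero but will be decreased to zero. Either way, since we are only tracking increases, we can ignore it.
    • For each of the last i counts that is less than length(S) / i, add the difference. That sum is the number of operations. (Since the counts are sorted, you can exit this inner loop as soon as you reach a count that is greater than or equal to the target count.)
    • You can exit the outer loop after the first i that is greater than or equal to the number of distinct letters in the original S (not counting values of i that we had to skip because they do not evenly divide length(S)). For example, if length(S) = 100 and the original S has 19 distinct letters, then we only need to consider i as high as 20. (Hat tip to Eric Wang for suggesting something along these lines.)
  • Return the least of these up-to-26 sums. (You don't actually need to store all the sums; you can just keep track of the minimum.)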
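The steps above can be sketched in Python roughly as follows. This is only an illustration under the stated assumptions (input restricted to A-Z); the function name `min_operations` is made up for this sketch.

```python
def min_operations(s: str) -> int:
    """Minimum replacements to balance s (assumed to contain only A-Z)."""
    if not s:
        return 0

    # Frequency array: counts[k] is the count of the k-th uppercase letter.
    counts = [0] * 26
    for ch in s:
        counts[ord(ch) - ord("A")] += 1
    counts.sort()  # ascending, so the largest counts sit at the end

    distinct = sum(1 for c in counts if c > 0)
    best = len(s)  # safe upper bound on the answer

    for i in range(1, 27):
        if len(s) % i != 0:
            continue  # i distinct letters cannot split the length evenly
        target = len(s) // i

        # Sum the deficits among the i largest counts.
        ops = 0
        for c in counts[26 - i:]:
            if c >= target:
                break  # sorted ascending: no deficits remain
            ops += target - c
        best = min(best, ops)

        # Larger i can only cost more once every original letter is kept.
        if i >= distinct:
            break
    return best


print(min_operations("ABCB"))  # 1
```

Only increases are summed, since every replacement fills exactly one deficit and removes exactly one surplus occurrence.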

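Answer 1: (score: 1)

The following code implements the solution in Java, with unit tests.

The algorithm is almost identical to @ruakh's answer, if not exactly the same.


Code

BalanceString.java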

import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/**
 * Assume string only contains A-Z, the 26 uppercase letter,
 * <p>given a string, you can replace a char with another char from the 26 letter,
 * <p>figure out the minimum replacement required to make the string balance,
 * <p>which means each char in the string occurs the same time,
 *
 * @author eric
 * @date 2/2/19 8:54 PM
 */
public class BalanceString {
    private final char minChar;
    private final char maxChar;
    private final int distinctChars; // total distinct char count,

    public static final BalanceString EN_UPPER_INSTANCE = new BalanceString('A', 'Z');

    public BalanceString(char minChar, char maxChar) {
        this.minChar = minChar;
        this.maxChar = maxChar;
        this.distinctChars = maxChar - minChar + 1;

        if (distinctChars <= 0)
            throw new IllegalArgumentException("invalid range of chars: [" + minChar + ", " + maxChar + "]");
    }

    /**
     * Check minimal moves needed to make string balanced.
     *
     * @param str
     * @return count of moves
     */
    public int balanceCount(String str) {
        // System.out.printf("string to balance:\t%s\n", str);
        int len = str.length(); // get length,
        if (len <= 2) return 0; // corner cases,

        Map<Character, Integer> coMap = figureOccurs(str); // figure occurrences,
        Integer[] occurs = sortOccursReversely(coMap); // reversely order occurrences,

        int m = coMap.size(); // distinct char count,
        int maxN = (len < distinctChars ? len : distinctChars); // get max possible distinct char count, included,

        int smallestMoves = Integer.MAX_VALUE; // smallest moves, among all possible n,

        // check each possible n, and get its moves,
        for (int n = 1; n <= maxN; n++) {
            if (len % n == 0) {
                int moves = figureMoves(len, coMap, occurs, m, n);
                if (moves < smallestMoves) smallestMoves = moves;
            }
        }

        return smallestMoves;
    }

    /**
     * Figure occurs for each char.
     *
     * @param str
     * @return
     */
    protected Map<Character, Integer> figureOccurs(String str) {
        Map<Character, Integer> coMap = new HashMap<>();
        for (char c : str.toCharArray()) {
            if (c < minChar || c > maxChar)
                throw new IllegalArgumentException(c + " is not within range [" + minChar + ", " + maxChar + "]");

            if (!coMap.containsKey(c)) coMap.put(c, 1);
            else coMap.put(c, coMap.get(c) + 1);
        }

        return coMap;
    }

    /**
     * Get reverse sorted occurrences.
     *
     * @param coMap
     * @return
     */
    protected Integer[] sortOccursReversely(Map<Character, Integer> coMap) {
        Integer[] occurs = new Integer[coMap.size()];

        coMap.values().toArray(occurs);
        Arrays.sort(occurs, Collections.reverseOrder());

        return occurs;
    }

    /**
     * Figure moves needed to balance.
     *
     * @param len   length of string,
     * @param coMap
     * @param m     original distinct elements count,
     * @param n     new distinct elements count,
     * @return count of moves,
     */
    protected int figureMoves(int len, Map<Character, Integer> coMap, Integer[] occurs, int m, int n) {
        int avgOccur = len / n; // average occurrence,
        int moves = 0;

        if (n == m) { // distinct chars don't change,
            for (Integer occur : occurs) {
                if (occur <= avgOccur) break;
                moves += (occur - avgOccur);
            }
        } else if (n < m) { // distinct chars decrease,
            for (int i = 0; i < n; i++) moves += Math.abs(occurs[i] - avgOccur); // for elements kept,
            for (int i = n; i < m; i++) moves += occurs[i]; // for elements to replace,
            moves >>= 1;
        } else { // distinct chars increase,
            for (int i = 0; i < occurs.length; i++) moves += Math.abs(occurs[i] - avgOccur); // for existing elements,
            moves += ((n - m) * avgOccur); // for new elements,
            moves >>= 1;
        }

        return moves;
    }

    public char getMinChar() {
        return minChar;
    }

    public char getMaxChar() {
        return maxChar;
    }

    public int getDistinctChars() {
        return distinctChars;
    }
}

BalanceStringTest.java
(Unit test, via TestNG)

import org.testng.Assert;
import org.testng.annotations.Test;

/**
 * BalanceString test.
 *
 * @author eric
 * @date 2/2/19 9:36 PM
 */
public class BalanceStringTest {
    private BalanceString bs = BalanceString.EN_UPPER_INSTANCE;

    @Test
    public void test() {
        // n < m case,
        Assert.assertEquals(bs.balanceCount("AAAABBBC"), 1); // e.g 1A -> B,
        Assert.assertEquals(bs.balanceCount("AAAAABBC"), 2); // e.g 1A -> B, 1C -> B,
        Assert.assertEquals(bs.balanceCount("AAAAAABC"), 2); // e.g 1B -> A, 1C -> A,
        Assert.assertEquals(bs.balanceCount("AAAAAAAB"), 1); // e.g 1B -> A,

        // n > m case,
        Assert.assertEquals(bs.balanceCount("AAAABBBBCCCCDDDDEEEEAAAA"), 4); // add 1 new char, e.g change 4 A to 4 F,
        Assert.assertEquals(bs.balanceCount(genIncSeq(10)), 15); // A-J, 10 distinct chars, 55 in length; solved by add 1 new char, need 15 steps,

        // n == m case,
        Assert.assertEquals(bs.balanceCount(genIncSeq(3)), 1); // A-C, 3 distinct chars, 6 in length; symmetric, solved with same distinct chars, need 1 steps,
        Assert.assertEquals(bs.balanceCount(genIncSeq(11)), 15); // A-K, 11 distinct chars, 66 in length; symmetric, solved with same distinct chars, need 15 steps,

        // n < m, or n > m case,
        Assert.assertEquals(bs.balanceCount("ABAC"), 1); // e.g 1A -> B, or 1A -> D,
    }

    // corner case,
    @Test
    public void testCorner() {
        // m <= 2,
        Assert.assertEquals(bs.balanceCount(""), 0);
        Assert.assertEquals(bs.balanceCount("A"), 0);
        Assert.assertEquals(bs.balanceCount("AB"), 0);
        Assert.assertEquals(bs.balanceCount("AA"), 0);

        /*------ m == n == distinctChars ------*/
        String mndBalanced = genMndBalancedSeq(); // each possible char occurs exactly once, already balanced,
        Assert.assertEquals(mndBalanced.length(), bs.getDistinctChars());
        Assert.assertEquals(bs.balanceCount(mndBalanced), 0); // no need change,

        char lastChar = mndBalanced.charAt(mndBalanced.length() - 1);
        String mndOneDup = mndBalanced.replace(lastChar, (char) (lastChar - 1)); // (distinctChars -2) chars occur exactly once, one occurs twice, one is missing, thus it's one step away to balance,
        Assert.assertEquals(mndOneDup.length(), bs.getDistinctChars());
        Assert.assertEquals(bs.balanceCount(mndOneDup), 1); // just replace the duplicate char with missing char,
    }

    // invalid input,
    @Test(expectedExceptions = IllegalArgumentException.class)
    public void testInvalidInput() {
        Assert.assertEquals(bs.balanceCount("ABAc"), 1);
    }

    // invalid char range, for constructor,
    @Test(expectedExceptions = IllegalArgumentException.class)
    public void testInvalidRange() {
        new BalanceString('z', 'a');
    }

    /**
     * Generate a string, with first char occur once, second twice, third three times, and so on.
     * <p>e.g A, ABB, ABBCCC, ABBCCCDDDD,
     *
     * @param m distinct char count,
     * @return
     */
    private String genIncSeq(int m) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < m; i++) {
            for (int j = 0; j <= i; j++) sb.append((char) (bs.getMinChar() + i));
        }
        return sb.toString();
    }

    /**
     * Generate a string that contains each possible char exactly once.
     * <p>For [A-Z], it could be: "ABCDEFGHIJKLMNOPQRSTUVWXYZ".
     *
     * @return
     */
    private String genMndBalancedSeq() {
        StringBuilder sb = new StringBuilder();
        char minChar = bs.getMinChar();
        int distinctChars = bs.getDistinctChars();

        for (int i = 0; i < distinctChars; i++) {
            sb.append((char) (minChar + i));
        }
        return sb.toString();
    }
}

All test cases pass.
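One hedged way to cross-check expected values like the ones in the tests is an exhaustive search over short strings. The sketch below is not part of the answer's code: the name `brute_force_min_ops` is hypothetical, and the search only considers target strings over the small `alphabet` passed in, so with a reduced alphabet the result is an upper bound on the true 26-letter answer.

```python
from collections import Counter
from itertools import product


def brute_force_min_ops(s: str, alphabet: str) -> int:
    """Try every string of len(s) over alphabet; keep the closest balanced one."""
    best = len(s)
    for candidate in product(alphabet, repeat=len(s)):
        # Balanced means all occurring letters share a single count.
        if len(set(Counter(candidate).values())) <= 1:
            # Each differing position costs exactly one replacement.
            distance = sum(1 for a, b in zip(s, candidate) if a != b)
            best = min(best, distance)
    return best


print(brute_force_min_ops("ABCB", "ABCD"))  # 1
```

The running time grows as len(alphabet) ** len(s), so this is only practical for strings of a handful of characters.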


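Complexity

  • Time: O(len) + O(m * lg(m)) + O(m * factorCount)
    • Each sequential scan takes O(len), and there are several sequential loops.
    • Sorting the array takes O(m * lg(m)), which is at most O(distinctChars * lg(distinctChars)) and therefore constant; the sort is done only once.
    • Figuring out the moves for each n takes O(m).
    • The number of values of n whose moves have to be figured out depends on how many numbers in the range [1, maxN] divide len.
      This count is also small and bounded by a constant.
  • Space: O(len)
    • The input string needs O(len).
    • The counter hash map needs O(m).
    • The sorted occurrences array needs O(m).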

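Where:

  • len is the string length.
  • m is the number of distinct chars in the original string.
  • distinctChars is the number of distinct chars in the alphabet, e.g. 26.
  • maxN is the maximum possible number of distinct chars in the result (inclusive).
  • factorCount is the count of numbers in the range [1, maxN] that divide len.
  • minChar is the smallest char, e.g. A.
  • maxChar is the largest char, e.g. Z.

And:

  • len >= m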
  • m <= distinctChars
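To illustrate why factorCount stays small, here is a short Python sketch that lists the values of n the main loop actually evaluates. The helper name `candidate_counts` is hypothetical, and the default alphabet size of 26 is an assumption matching A-Z.

```python
def candidate_counts(length: int, alphabet_size: int = 26) -> list:
    """Values of n (distinct letters in the result) that divide the length."""
    max_n = min(length, alphabet_size)  # same bound as maxN above
    return [n for n in range(1, max_n + 1) if length % n == 0]


print(candidate_counts(55))   # [1, 5, 11]
print(candidate_counts(100))  # [1, 2, 4, 5, 10, 20, 25]
```

factorCount is the length of the returned list, which can never exceed the alphabet size.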

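Answer 2: (score: 1)

if __name__ == "__main__":
  for j in range(int(input())):  # number of test cases
    S = str(input())
    N = len(S)
    # A[k] is the number of occurrences of the k-th uppercase letter
    A = [0]*27
    for c in S:
      A[ord(c) - 65] = A[ord(c) - 65] + 1
    # Largest counts first, so A[:i] holds the i letters worth keeping
    A = sorted(A,reverse=True)
    minSwap = N + 1
    # Try every possible number i of distinct letters in the result
    for i in range(1,27):
      if N%i == 0:
        temp = N//i  # target count for each kept letter
        tempSwap = 0
        # Sum the deficits of the i most frequent letters
        for f in range(i):
          if temp > A[f]:
            tempSwap = tempSwap + temp - A[f]
        if tempSwap <= minSwap:
          minSwap = tempSwap
    # Defensive fallback; i = 1 always divides N, so this never triggers
    if minSwap == N+1:
      minSwap = 0
    print(minSwap)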

Answer 3: (score: 0)

#include <algorithm>
#include <climits>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Operations needed to reach a balanced string in which exactly `kept` letters
// each occur `requiredFreq` times. `freq` must be sorted in descending order.
// Only the deficits of the `kept` most frequent letters are summed: every
// replacement fills exactly one deficit, so that sum is the operation count.
int countOps(const std::vector<int> &freq, int kept, int requiredFreq){
    int ops = 0;
    for (int k = 0; k < kept; k++){
        if (freq[k] < requiredFreq){
            ops += requiredFreq - freq[k];
        }
    }
    return ops;
}

int balanceString(const std::string &s, long n){

    std::vector<int> map(26, 0);
    int count = INT_MAX;

    // Creating map with frequency (input is assumed to be lowercase a-z)
    for (char x : s){
        map[x - 'a']++;
    }

    std::sort(std::begin(map), std::end(map), std::greater<int>{});

    // Try every possible number of distinct letters that divides the size
    for (int i = 1; i <= 26; i++){
        if (n % i == 0){
            int requiredFreq = int(n / i);
            count = std::min(count, countOps(map, i, requiredFreq));
        }
    }

    return count;
}

int main(){
    std::string s = "aaaaccddefghiii";
    long n = s.size();

    if(n <= 1) return 0;

    int count = balanceString(s, n);

    std::cout << count << std::endl;

    return 0;
}

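Answer 4: (score: 0)

To solve this problem, I think it is also useful to find all the extra occurrences of the different elements in the string (that is, the sum, over all elements, of the occurrences beyond the first).

For example: in aabbc, the number of elements we have to remove to make the presence of each element equal (this is called a good string) is 2, counting one extra a and one extra b.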

x=input()
char=26
total=0
lis=[0]*char
#print(lis)

for i in range(len(x)):
    lis[ord(x[i])-ord('a')]+=1
#print(lis)

for i in range(26):
    if(lis[i]>1):
        total=total+(lis[i]-1)
print(total)