Array transmutation algorithm

Asked: 2013-01-04 19:43:27

Tags: arrays algorithm sorting diff

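I am looking for a solution to the following problem:

Given an array "a" and an array "b", find a set of operations that, when applied to "a", transforms "a" into "b".

So, for example, given that I have: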

a = [1,2,3]
b = [3,2,4]
c = transmute(a, b)

I would now expect c to contain something like:

[["remove", 0], ["add", 2, 4], ["swap", 0, 1]]

Applying these operations to "a", in the given order, should produce "b":

[1,2,3] => [2,3] => [2,3,4] => [3,2,4]
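
To make the semantics of the example concrete, here is a minimal Python sketch that applies such an operation list. The name `apply_ops` is made up for illustration, and the sketch assumes that "add" means "insert at this index", which matches the example above, where ["add", 2, 4] appends to a two-element array.

```python
def apply_ops(arr, ops):
    """Apply remove/add/swap operations, in order, to a copy of arr."""
    result = list(arr)
    for op in ops:
        kind = op[0]
        if kind == "remove":
            del result[op[1]]               # ["remove", index]
        elif kind == "add":
            result.insert(op[1], op[2])     # ["add", index, value]
        elif kind == "swap":
            i, j = op[1], op[2]             # ["swap", index, index]
            result[i], result[j] = result[j], result[i]
        else:
            raise ValueError(f"unknown operation: {kind!r}")
    return result


a = [1, 2, 3]
ops = [["remove", 0], ["add", 2, 4], ["swap", 0, 1]]
print(apply_ops(a, ops))  # [3, 2, 4]
```

A helper like this is also handy for checking any candidate `transmute` implementation: apply its output to "a" and compare the result with "b".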

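Here is a very naive implementation in Ruby: https://gist.github.com/4455256. It assumes that there are no duplicates in the arrays (which is not a good assumption). I think it is also O(n²), and something more performant would be nice.

Is there a known algorithm that does this? Is there any further reading I can do? Do you have any suggestions on how to improve this?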

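3 answers:

Answer 0 (score: 2)

You can solve this problem in stages. As I see it, there are three stages: removals, swaps, and additions. An O(N) solution should be possible.

Copy array A into an array C, and array B into an array D.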

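  1. Compare the two arrays C and D, and remove the elements of C that are not in D. These are the elements that need to be removed from array A. With a HashMap, this step can be done in O(N).
  2. Compare C and D again, and remove the elements of D that are not in C. These are the elements that will have to be added to array A.
  3. Now C and D contain the same elements. All that remains is to work out how to swap the elements of C so that it matches D.
  4. Once they match, add back the elements that were removed from D in step 2. They can be added in the correct order by comparing against the original array B, which D was copied from.

    Steps 1, 2 and 4 are fairly straightforward, so I will only explain how to arrive at the order of the swaps. Take an example: suppose C currently looks like 1,3,2 and D looks like 3,2,1. Compare the value at each index of D with the corresponding value in C. If they differ, draw a directed edge from the element in C to the element in D. At index 0, C has 1 and D has 3. They differ, so we draw a directed edge from 1 to 3: 1 -> 3. Similarly, index 1 gives an edge from 3 to 2, and index 2 gives an edge from 2 to 1.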

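    This gives us a directed graph. Strictly speaking it is not a DAG: every node has exactly one outgoing and one incoming edge, so the graph is a set of disjoint cycles (here 1 -> 3 -> 2 -> 1). You can try things like DFS and see for yourself; I am just going to state the result here. The number of swaps is (number of nodes - 1) for each cycle, and a DFS traversal of the graph gives the order in which the swaps happen.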
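
The swap stage can be sketched in Python as follows. This is one possible reading of the idea, not the answerer's own code: instead of building the graph explicitly, it walks the target left to right and swaps the wanted value into place, which resolves each cycle of the graph with (cycle length - 1) swaps. The name `swaps_to_match` is hypothetical, and the sketch assumes both arrays hold the same distinct, hashable elements.

```python
def swaps_to_match(c, d):
    """Return swaps that reorder c into d (same distinct elements in both)."""
    work = list(c)
    pos = {value: i for i, value in enumerate(work)}  # value -> current index
    swaps = []
    for i, wanted in enumerate(d):
        if work[i] == wanted:
            continue                      # already in place, no edge in the graph
        j = pos[wanted]                   # where the wanted value sits right now
        swaps.append(["swap", i, j])
        pos[work[i]] = j                  # the displaced value moves to j
        pos[wanted] = i
        work[i], work[j] = work[j], work[i]
    return swaps


print(swaps_to_match([1, 3, 2], [3, 2, 1]))  # [['swap', 0, 1], ['swap', 1, 2]]
```

For the example, the single cycle 1 -> 3 -> 2 -> 1 has three nodes and is resolved in two swaps. The dictionary lookup keeps the whole pass at expected O(N).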

    Note: if the arrays contain duplicate elements, a little more bookkeeping is needed, but the same solution will work.
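
As a rough illustration of that extra bookkeeping, the sketch below runs all the stages with duplicates allowed. It is an interpretation under stated assumptions rather than the answerer's implementation: `transmute` is the name used in the question, the operation format follows the question's example, elements are assumed hashable, and surplus copies are arbitrarily taken from the right-hand end. The swap count is not guaranteed to be minimal when duplicates are present.

```python
from collections import Counter, defaultdict


def transmute(a, b):
    """Return remove/swap/add operations that turn list a into list b."""
    ops = []
    work = list(a)

    # Stage 1: drop copies that a has and b lacks, right to left so that
    # every emitted index is still valid when the removal is applied.
    surplus = Counter(a) - Counter(b)
    for i in range(len(work) - 1, -1, -1):
        if surplus[work[i]] > 0:
            surplus[work[i]] -= 1
            del work[i]
            ops.append(["remove", i])

    # Stage 2: choose the positions of b that must be added later;
    # everything else in b is the target order for the swap stage.
    missing = Counter(b) - Counter(a)
    add_at = set()
    for j in range(len(b) - 1, -1, -1):
        if missing[b[j]] > 0:
            missing[b[j]] -= 1
            add_at.add(j)
    target = [v for j, v in enumerate(b) if j not in add_at]

    # Stage 3: swap work into target order, tracking for each value the
    # set of not-yet-finalized indices where it currently sits.
    where = defaultdict(set)
    for i, v in enumerate(work):
        where[v].add(i)
    for i, wanted in enumerate(target):
        if work[i] == wanted:
            where[wanted].discard(i)
            continue
        j = where[wanted].pop()           # any later index holding `wanted`
        where[work[i]].discard(i)
        where[work[i]].add(j)
        work[i], work[j] = work[j], work[i]
        ops.append(["swap", i, j])

    # Stage 4: insert the missing values in ascending index order.
    for j in sorted(add_at):
        ops.append(["add", j, b[j]])
    return ops


print(transmute([1, 2, 3], [3, 2, 4]))  # [['remove', 0], ['swap', 0, 1], ['add', 2, 4]]
```

Each stage touches every element a constant number of times, so the whole sketch runs in expected O(N) apart from the final sort of the insertion indices.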

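    If you get stuck on the graph-based swap stage, take a look at the question referenced by @handcraftsman, string transposition algorithm. It presents a similar approach to the same problem.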

Answer 1 (score: 1)

Here is one solution:

get the token-index pairs of all tokens in the source string
get the token-index pairs of all tokens in the target string

from both sets remove the values that are in the other set.

foreach token-index in the source set
   if the target set has the token at the same location
      remove it from both sets, this is a match created by a previous swap
   get the target token at the source index
   if the source set has the target token (at any index)
      create a swap command to swap the source token at source index with
         the target token at its index in the source
      remove the token-index from the source set
      remove the target token-index from the target set
      add a token-index for the target token at the new index to the source set
      loop without moving to the next token-index

create remove commands for any remaining token-indexes in the source set
create add commands for any remaining token-indexes in the target set

Here is a quick C# implementation:

// Requires: using System; using System.Collections.Generic; using System.Linq;
private static IEnumerable<ICommand> GetChangeCommands(string source, string target)
{
    var unmatchedSourceTokens = GetUnmatchedTokenIndexes(source, target);
    var unmatchedTargetTokens = GetUnmatchedTokenIndexes(target, source);

    var commands = new List<ICommand>();

    foreach (var tokenIndexList in unmatchedSourceTokens)
    {
        var sourceToken = tokenIndexList.Key;
        var sourceStringSourceTokenIndexes = unmatchedSourceTokens[sourceToken];

        foreach (var sourceLoopIndex in tokenIndexList.Value.ToList())
        {
            var sourceIndex = sourceLoopIndex;
            bool swapped;
            do
            {
                swapped = false;
                if (sourceIndex >= target.Length)
                {
                    continue;
                }
                var targetToken = target[sourceIndex];
                if (targetToken == sourceToken)
                {
                    sourceStringSourceTokenIndexes.Remove(sourceIndex);
                    unmatchedTargetTokens[targetToken].Remove(sourceIndex);
                    continue;
                }
                List<int> sourceStringTargetTokenIndexes;
                if (!unmatchedSourceTokens.TryGetValue(targetToken, out sourceStringTargetTokenIndexes) ||
                    !sourceStringTargetTokenIndexes.Any())
                {
                    continue;
                }
                var targetIndex = sourceStringTargetTokenIndexes.First();
                commands.Add(new SwapCommand(sourceIndex, targetIndex));
                sourceStringTargetTokenIndexes.RemoveAt(0);
                sourceStringSourceTokenIndexes.Remove(sourceIndex);
                sourceStringSourceTokenIndexes.Add(targetIndex);
                unmatchedTargetTokens[targetToken].Remove(sourceIndex);
                swapped = true;
                sourceIndex = targetIndex;
            } while (swapped);
        }
    }

    var removalCommands = unmatchedSourceTokens
        .SelectMany(x => x.Value)
        .Select(x => new RemoveCommand(x))
        .Cast<ICommand>()
        .OrderByDescending(x => x.Index)
        .ToList();

    commands.AddRange(removalCommands);

    var insertCommands = unmatchedTargetTokens
        .SelectMany(x => x.Value.Select(y => new InsertCommand(y, x.Key)))
        .Cast<ICommand>()
        .OrderBy(x => x.Index)
        .ToList();

    commands.AddRange(insertCommands);

    return commands;
}

private static IDictionary<char, List<int>> GetUnmatchedTokenIndexes(string source, string target)
{
    var targetTokenIndexes = target.Select((x, i) => new
                                                            {
                                                                Token = x,
                                                                Index = i
                                                            })
                                    .ToLookup(x => x.Token, x => x.Index)
                                    .ToDictionary(x => x.Key, x => x.ToList());

    var distinctSourceTokenIndexes = new Dictionary<char, List<int>>();
    foreach (var tokenInfo in source.Select((x, i) => new
                                                            {
                                                                Token = x,
                                                                Index = i
                                                            }))
    {
        List<int> indexes;
        if (!targetTokenIndexes.TryGetValue(tokenInfo.Token, out indexes) ||
            !indexes.Contains(tokenInfo.Index))
        {
            if (!distinctSourceTokenIndexes.TryGetValue(tokenInfo.Token, out indexes))
            {
                indexes = new List<int>();
                distinctSourceTokenIndexes.Add(tokenInfo.Token, indexes);
            }
            indexes.Add(tokenInfo.Index);
        }
    }
    return distinctSourceTokenIndexes;
}

internal class InsertCommand : ICommand
{
    private readonly char _token;

    public InsertCommand(int index, char token)
    {
        Index = index;
        _token = token;
    }

    public int Index { get; private set; }

    public string Change(string input)
    {
        var chars = input.ToList();
        chars.Insert(Index, _token);
        return new string(chars.ToArray());
    }

    public override string ToString()
    {
        return string.Format("[\"add\", {0}, '{1}']", Index, _token);
    }
}

internal class RemoveCommand : ICommand
{
    public RemoveCommand(int index)
    {
        Index = index;
    }

    public int Index { get; private set; }

    public string Change(string input)
    {
        var chars = input.ToList();
        chars.RemoveAt(Index);
        return new string(chars.ToArray());
    }

    public override string ToString()
    {
        return string.Format("[\"remove\", {0}]", Index);
    }
}

internal class SwapCommand : ICommand
{
    private readonly int _targetIndex;

    public SwapCommand(int sourceIndex, int targetIndex)
    {
        Index = sourceIndex;
        _targetIndex = targetIndex;
    }

    public int Index { get; private set; }

    public string Change(string input)
    {
        var chars = input.ToArray();
        var temp = chars[Index];
        chars[Index] = chars[_targetIndex];
        chars[_targetIndex] = temp;
        return new string(chars);
    }

    public override string ToString()
    {
        return string.Format("[\"swap\", {0}, {1}]", Index, _targetIndex);
    }
}

internal interface ICommand
{
    int Index { get; }
    string Change(string input);
}

Sample usage:

const string source = "123";
const string target = "324";
var commands = GetChangeCommands(source, target);
Execute(source, target, commands);

private static void Execute(string current, string target, IEnumerable<ICommand> commands)
{
    Console.WriteLine("converting".PadRight(19) + current + " to " + target);
    foreach (var command in commands)
    {
        Console.Write(command.ToString().PadRight(15));
        Console.Write(" => ");
        current = command.Change(current);
        Console.WriteLine(current);
    }
}

Sample output:

converting         123 to 324
["swap", 0, 2]  => 321
["remove", 2]   => 32
["add", 2, '4'] => 324

converting         hello to world
["swap", 1, 4]  => holle
["remove", 4]   => holl
["remove", 2]   => hol
["remove", 0]   => ol
["add", 0, 'w'] => wol
["add", 2, 'r'] => worl
["add", 4, 'd'] => world

converting         something to smith
["swap", 1, 2]  => smoething
["swap", 2, 6]  => smiethong
["swap", 3, 4]  => smitehong
["swap", 4, 5]  => smitheong
["remove", 8]   => smitheon
["remove", 7]   => smitheo
["remove", 6]   => smithe
["remove", 5]   => smith

converting         something to sightseeing
["swap", 1, 6]  => simethong
["swap", 6, 3]  => simotheng
["swap", 3, 5]  => simhtoeng
["swap", 2, 8]  => sightoenm
["remove", 8]   => sightoen
["remove", 7]   => sightoe
["remove", 5]   => sighte
["add", 5, 's'] => sightse
["add", 7, 'e'] => sightsee
["add", 8, 'i'] => sightseei
["add", 9, 'n'] => sightseein
["add", 10, 'g'] => sightseeing

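The samples above reveal some inefficiencies: the algorithm swaps tokens that are about to be removed, and it removes tokens only to add them back again.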

Answer 2 (score: 0)

This answer may be useful: string transposition algorithm

Also take a look at edit distance algorithms.
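
For readers who have not met edit distance before, here is a small Python sketch of the classic dynamic-programming (Levenshtein) formulation, extended to recover an operation list. The name `edit_script` is made up for illustration. Note two assumptions: edit distance has no swap operation, and the sketch introduces a "replace" operation (["replace", index, value]) that is not part of the question's format. It takes O(n·m) time and space.

```python
def edit_script(a, b):
    """Return (distance, ops); ops applied in order turn list a into list b."""
    n, m = len(a), len(b)
    # dp[i][j] = minimum number of edits turning a[:i] into b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # remove a[i-1]
                           dp[i][j - 1] + 1,          # add b[j-1]
                           dp[i - 1][j - 1] + cost)   # keep or replace

    # Walk back from the bottom-right corner. Ops are emitted right to left,
    # so every index is still valid when the ops are applied in this order.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and a[i - 1] == b[j - 1] and dp[i][j] == dp[i - 1][j - 1]:
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            ops.append(["replace", i - 1, b[j - 1]])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(["remove", i - 1])
            i -= 1
        else:
            ops.append(["add", i, b[j - 1]])
            j -= 1
    return dp[n][m], ops


distance, ops = edit_script([1, 2, 3], [3, 2, 4])
print(distance)  # 2
```

Unlike the swap-based approaches above, this minimizes the number of add, remove and replace steps while preserving the relative order of the elements that are kept.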