The Bug

http://rextester.com/DUJZN22339

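If the strings don't share a common suffix and the book titles all end in the same character, that character ends up being removed. In this case it is the s.

The book named Lord of the Flies
The book named Wuthering Heights
The book named Great Expectations

Filtered

Lord of the Flie
Wuthering Height
Great Expectation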
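To make the cause concrete: the code below treats whatever the reversed sentences have in common at the start as a suffix to strip, and for these three sentences that is the single trailing s. The following is only an illustrative sketch; SuffixDemo and CommonSuffixLength are hypothetical names that do not appear in the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SuffixDemo
{
    // Counts how many trailing characters are identical across all strings.
    // This is the amount a naive "common suffix" stripper would cut off.
    public static int CommonSuffixLength(IReadOnlyList<string> strings)
    {
        if (strings.Count == 0) return 0;

        string reference = strings[0];
        int shortest = strings.Min(s => s.Length);
        int length = 0;

        // Walk backwards from the end while every string agrees with the first one.
        while (length < shortest &&
               strings.All(s => s[s.Length - 1 - length] == reference[reference.Length - 1 - length]))
        {
            length++;
        }
        return length;
    }
}
```

For the three sentences above this returns 1, because "Flies", "Heights" and "Expectations" share only their final s. That one character is then removed from every title.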

C Sharp

Note: this is an example list; I am using this on strings other than book titles.

public static void Main(string[] args)
{
    var sentences = new List<string>() 
    { 
        "The book named Lord of the Flies",
        "The book named Wuthering Heights",
        "The book named Great Expectations"
    };

    var titles = ExtractDifferences(sentences);

    Console.WriteLine(string.Join("\n", titles));

}

static List<string> ExtractDifferences(List<string> sentences)
{
    var firstDiffIndex = GetFirstDifferenceIndex(sentences);
    var lastDiffIndex = GetFirstDifferenceIndex(sentences.Select(s => new string(s.Reverse().ToArray())).ToList());
    return sentences.Select(s => s.Substring(firstDiffIndex, s.Length - lastDiffIndex - firstDiffIndex)).ToList();
}


static int GetFirstDifferenceIndex(IList<string> strings)
{
    int firstDifferenceIndex = int.MaxValue;

    for (int i = 0; i < strings.Count; i++)
    {
        var current = strings[i];
        var prev = strings[i == 0 ? strings.Count - 1 : i - 1];

        var firstDiffIndex = current
            .Select((c, j) => new { CurrentChar = c, Index = j })
            .FirstOrDefault(ci => ci.CurrentChar != prev[ci.Index])
            .Index;

        if (firstDiffIndex < firstDifferenceIndex)
        {
            firstDifferenceIndex = firstDiffIndex;
        }
    }
    return firstDifferenceIndex;
}

Answer 1

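You can handle the problem of cutting off parts of words by backtracking to the nearest word boundary. Here I simply assume that a word boundary is a space, but you may want to extend that if needed.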
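If a space alone is too narrow a definition of a word boundary, the backtracking step could be extended roughly as follows. This is a sketch under assumptions: WordBoundary, IsSeparator and BacktrackToWordStart are hypothetical names, and treating every character that is not a letter, digit or apostrophe as a separator is just one possible choice.

```csharp
using System;

public static class WordBoundary
{
    // Assumption: anything that is not a letter, a digit or an apostrophe
    // separates words (spaces, quotes, commas, and so on).
    public static bool IsSeparator(char c)
    {
        return !(char.IsLetterOrDigit(c) || c == '\'');
    }

    // Moves index back to the start of the word it falls in.
    // Expects 0 <= index <= text.Length.
    public static int BacktrackToWordStart(string text, int index)
    {
        while (index > 0 && !IsSeparator(text[index - 1]))
        {
            index--;
        }
        return index;
    }
}
```

With a quoted title such as The book named "Lord of the Flies", backtracking from the r in Lord stops at the L (index 16), whereas a space-only rule would stop one character earlier and keep the opening quotation mark.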

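When dealing with book titles that share common words, the first thing that comes to mind is to assume that the titles are capitalized. So, apart from the first letter of the sentence, you can also stop at the first character that is an uppercase letter.

In addition, you can improve the current algorithm by not comparing the first string with the last one. Comparing only the 1st with the 2nd, then the 2nd with the 3rd, and so on up to the second-to-last with the last is sufficient. And if the start of the difference is determined to be zero, the method can return immediately.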

static int GetFirstDifferenceIndex(IList<string> strings)
{
    int firstDifferenceIndex = int.MaxValue;

    for (int i = 1; i < strings.Count; i++)
    {
        var current = strings[i];
        var prev = strings[i - 1];

        // Index of first character that is different or that is a capital letter
        // other than the first character of the sentence.
        var firstDiffIndex = current
            .Select((c, j) => new { CurrentChar = c, Index = j })
            .FirstOrDefault(ci => ci.CurrentChar != prev[ci.Index]
                            || (ci.Index != 0 && char.IsUpper(ci.CurrentChar)))
            .Index;

        // back track to the beginning or until the previous char is a space
        while(firstDiffIndex > 0 && current[firstDiffIndex-1] != ' ')
        {
            firstDiffIndex--;
        }

        if(firstDiffIndex == 0) return 0;

        if (firstDiffIndex < firstDifferenceIndex)
        {
            firstDifferenceIndex = firstDiffIndex;
        }
    }
    return firstDifferenceIndex;
}

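This will take the sentences

The book named Lord of the Rings

The book named Lord of the Flies

and output

Lord of the Rings

Lord of the Flies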

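Because of the backtracking, once you reverse the sentences it also works for book titles with a common ending:

The book named The Old Man and The Sea is a classic

The book named Alone on a Wide, Wide Sea is a classic

will result in

The Old Man and The Sea

Alone on a Wide, Wide Sea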

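But of course this relies on the first and last words of the book title starting with a capital letter, and on the first character of the prefix being its only capital letter (with no word in the suffix starting with a capital letter either). To handle the cases where that fails, you would have to start analyzing parts of speech, which would lead to a very complex algorithm.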
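For reference, the approach in this answer can be combined with the question's ExtractDifferences into one self-contained sketch. The TitleExtractor class name is hypothetical. The sketch also departs from the code above in two labeled ways: the comparison is a plain loop that stops at the end of the shorter string, and fewer than two input strings are treated as having nothing to strip. Both are assumptions, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TitleExtractor
{
    public static List<string> ExtractDifferences(IList<string> sentences)
    {
        int start = GetFirstDifferenceIndex(sentences);

        // Reversing each sentence turns the common suffix into a common prefix.
        var reversed = sentences.Select(s => new string(s.Reverse().ToArray())).ToList();
        int end = GetFirstDifferenceIndex(reversed);

        // Math.Max guards against a prefix and suffix that overlap in a short string.
        return sentences
            .Select(s => s.Substring(start, Math.Max(0, s.Length - end - start)))
            .ToList();
    }

    static int GetFirstDifferenceIndex(IList<string> strings)
    {
        int result = int.MaxValue;

        // Comparing each string with its predecessor is sufficient.
        for (int i = 1; i < strings.Count; i++)
        {
            string current = strings[i];
            string prev = strings[i - 1];

            // Assumption: never read past the end of the shorter string.
            int limit = Math.Min(current.Length, prev.Length);

            // Advance to the first differing character, or to the first
            // capital letter other than the first character of the sentence.
            int index = 0;
            while (index < limit
                   && current[index] == prev[index]
                   && (index == 0 || !char.IsUpper(current[index])))
            {
                index++;
            }

            // Backtrack to the beginning or until the previous char is a space.
            while (index > 0 && current[index - 1] != ' ')
            {
                index--;
            }

            if (index == 0) return 0;
            result = Math.Min(result, index);
        }

        // Assumption: with fewer than two strings there is nothing to strip.
        return result == int.MaxValue ? 0 : result;
    }
}
```

Calling TitleExtractor.ExtractDifferences on the two "is a classic" sentences above yields The Old Man and The Sea and Alone on a Wide, Wide Sea. On the three sentences from the question it keeps the trailing s, because the reversed comparison backtracks to index 0.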

Answer 2

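Assumption: the book title becomes part of the output only when both the prefix (The book named) and the suffix (is a classic) are present. For example:

The book named Lord of the Flies is a classic. -> will pass - Book name: Lord of the Flies
The book Lord of the Flies is a classic. -> will not pass
The book named Lord of the Flies is classic. -> will not pass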

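If the above is correct, why not use a regular expression, which is built for exactly this kind of pattern matching?

[The regular expression from the original answer was not preserved.]

This regular expression will find the book title for you (remember to use the IgnoreCase regex option).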
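A pattern along these lines might look like the sketch below. It is not the answer's original regex. The prefix and suffix are taken from the assumption stated above, the optional trailing period is an added assumption, and TitleRegex and TryExtractTitle are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TitleRegex
{
    // Assumed fixed prefix and suffix, taken from the examples in this answer.
    // The lazy group captures everything between them as the title.
    private static readonly Regex Pattern = new Regex(
        @"^The book named\s+(?<title>.+?)\s+is a classic\.?$",
        RegexOptions.IgnoreCase);

    // Returns true and the captured title when both prefix and suffix are present.
    public static bool TryExtractTitle(string sentence, out string title)
    {
        Match match = Pattern.Match(sentence);
        title = match.Success ? match.Groups["title"].Value : string.Empty;
        return match.Success;
    }
}
```

With this pattern, The book named Lord of the Flies is a classic. yields Lord of the Flies, while the two sentences that lack part of the prefix or suffix do not match.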
