Text comparison algorithm or program?

Time: 2017-07-18 12:01:57

Tags: java algorithm

I have two paragraphs made up of sentences. I want to compare them and show the differences in the UI.

Below are the use cases I can think of. Any help with an algorithm or code would be much appreciated.

Case 1: Words deleted from str2

String str1 = "Hello I am new How are you";
String str2 = "How are you Hello";

output :
str1 = "<del>Hello I am new</del> How are you";
str2 = "How are you <add>Hello</add>"

Case 2: Words added to str2

String str1 = "Hello How are you what about you";
String str2 = "How are you I am fine what about you";

output :
str1 = "<del>Hello</del> How are you what about you";
str2 = "How are you <add>I am fine</add> what about you"

Case 3: Words do not match

    String str1 = "Hello How are you";
    String str2 = "Hello How rea you";

    output :
    str1 = "Hello How <missmatch>are</missmatch> you";
    str2 = "Hello How <missmatch>rea</missmatch> you"

1 Answer:

Answer 0: (score: 1)

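You could, for example, take a look at https://github.com/wumpz/java-diff-utils and its examples at https://github.com/wumpz/java-diff-utils/wiki/Examples. It is easy to modify them to emit your own specific tags instead of the markup characters, e.g.: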

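// Package names as used by java-diff-utils 2.x and later.
import com.github.difflib.text.DiffRow;
import com.github.difflib.text.DiffRowGenerator;
import java.util.Arrays;
import java.util.List;

// Diff word by word and merge the old and new text into a single tagged line.
DiffRowGenerator generator = DiffRowGenerator.create()
                .showInlineDiffs(true)
                .mergeOriginalRevised(true)
                .inlineDiffByWord(true)
                // f is true for the opening tag and false for the closing tag
                .newTag(f -> f?"<span style=\"background-color:#ffc6c6\">":"</span>")
                .oldTag(f -> f?"<span style=\"background-color:#c4ffc3\">":"</span>")
                .columnWidth(10000000)
                .build();

// `lines` is assumed to be a List<String> holding the two texts to compare.
List<DiffRow> rows = generator.generateDiffRows(
                Arrays.asList(lines.get(0)),
                Arrays.asList(lines.get(1)));

// With mergeOriginalRevised(true), the old line carries the merged, tagged result.
System.out.println(rows.get(0).getOldLine());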
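
If you would rather not pull in a library, the three cases in the question can also be approximated with a plain word-level longest-common-subsequence (LCS) diff. The following is only a minimal sketch using the JDK alone, not what java-diff-utils does internally. The class name `WordDiff` and the helpers `diff` and `wrap` are hypothetical names chosen for illustration. It also assumes a rule the question does not state explicitly: when words were removed and added at the same position, both runs are tagged `<missmatch>` (spelled as in the question); otherwise `<del>` or `<add>` is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WordDiff {

    /** Returns {taggedOld, taggedNew} for a word-by-word comparison. */
    static String[] diff(String oldText, String newText) {
        String[] a = oldText.trim().split("\\s+");
        String[] b = newText.trim().split("\\s+");

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[a.length + 1][b.length + 1];
        for (int i = a.length - 1; i >= 0; i--) {
            for (int j = b.length - 1; j >= 0; j--) {
                lcs[i][j] = a[i].equals(b[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        List<String> outA = new ArrayList<>();
        List<String> outB = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.length || j < b.length) {
            // Common word: copy it to both outputs unchanged.
            if (i < a.length && j < b.length && a[i].equals(b[j])) {
                outA.add(a[i++]);
                outB.add(b[j++]);
                continue;
            }
            // Otherwise collect the unmatched words on both sides up to the next common word.
            int startA = i, startB = j;
            while (i < a.length || j < b.length) {
                if (i < a.length && j < b.length && a[i].equals(b[j])) break;
                if (j == b.length || (i < a.length && lcs[i + 1][j] >= lcs[i][j + 1])) i++;
                else j++;
            }
            // Assumption: a change on both sides at the same position counts as a mismatch.
            boolean both = i > startA && j > startB;
            if (i > startA) outA.add(wrap(a, startA, i, both ? "missmatch" : "del"));
            if (j > startB) outB.add(wrap(b, startB, j, both ? "missmatch" : "add"));
        }
        return new String[] { String.join(" ", outA), String.join(" ", outB) };
    }

    /** Wraps words[from..to) in a single <tag>...</tag> pair. */
    private static String wrap(String[] words, int from, int to, String tag) {
        return "<" + tag + ">" + String.join(" ", Arrays.copyOfRange(words, from, to)) + "</" + tag + ">";
    }

    public static void main(String[] args) {
        String[] result = diff("Hello How are you what about you",
                               "How are you I am fine what about you");
        System.out.println(result[0]); // <del>Hello</del> How are you what about you
        System.out.println(result[1]); // How are you <add>I am fine</add> what about you
    }
}
```

The table needs O(n*m) time and memory for n and m words, which is fine for paragraphs but not for very large texts. Words are compared exactly, so differences in case or punctuation count as changes.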