如何计算两个字符串之间的差异?

时间:2012-05-14 08:58:44

标签: string pascal delphi

我想在Delphi中创建一个函数来计算两个字符串的不同级别。如果两个字符串相等(忽略大小写),则它应返回0,但如果它们不相等,则应返回不同字符的数量。此功能对于检查拼写非常有用。

function GetDiffStringLevel(S1,S2:string):Integer;
begin
  if SameText(S1,S2) then Exit(0);
  // i want get different chars count
end

示例代码:

Diff:=GetDiffStringLevel('Hello','Hello');// Diff:=0;
Diff:=GetDiffStringLevel('Hello','2Hello');// Diff:=1;
Diff:=GetDiffStringLevel('Hello','H2ello');// Diff:=1;
Diff:=GetDiffStringLevel('Hello','Hello W');// Diff:=2;
Diff:=GetDiffStringLevel('Hello','World');// Diff:=6; or 5

2 个答案:

答案 0 :(得分:13)

快速紧凑的实施。

使用正常字符串执行smasher的速度大约是其速度的3倍。 如果其中一个字符串为空,则速度超过100倍。

Smasher的功能虽然不区分大小写,但也很有用。

function LevenshteinDistance(const s, t: string): integer;inline;
var
  d   : array of array of integer;
  n, m, i, j : integer;
begin
  n := length(s);
  m := length(t);
  if n = 0 then Exit(m);
  if m = 0 then Exit(n);

  SetLength(d, n + 1, m + 1);
  for i := 0 to n do d[i, 0] := i;
  for j := 0 to m do d[0, j] := j;

  for i := 1 to n do
    for j := 1 to m do
      d[i, j] := Min(Min(d[i-1, j]+1, d[i,j-1]+1), d[i-1,j-1]+Integer(s[i] <> t[j]));

  Result := d[n, m];
end;

注意:inline指令将我的计算机上的执行时间减少到不到70%,但仅限于win32目标平台。如果编译为64位(Delphi XE2),内联实际上会使它稍微慢一些。

答案 1 :(得分:8)

您想要的是Levenshtein distance(将一个字符串转换为另一个字符串的最小编辑次数,其中编辑是字符插入,字符删除或字符替换)。维基百科站点具有伪代码实现。

德尔福实施:

function LevenshteinDistance(String1 : String; String2 : String) : Integer;

var
  Length1, Length2      : Integer;
  WorkMatrix            : array of array of Integer;
  I, J                  : Integer;
  Cost                  : Integer;
  Val1, Val2, Val3      : Integer;

begin
String1 := TCharacter.ToUpper (String1);
String2 := TCharacter.ToUpper (String2);
Length1 := Length (String1);
Length2 := Length (String2);
SetLength (WorkMatrix, Length1+1, Length2+1);
for I := 0 to Length1 do
  WorkMatrix [I, 0] := I;
for J := 0 to Length2 do
  WorkMatrix [0, J] := J;
for I := 1 to Length1 do
  for J := 1 to Length2 do
    begin
    if (String1 [I] = String2 [J]) then
      Cost := 0
    else
      Cost := 1;
    Val1 := WorkMatrix [I-1, J] + 1;
    Val2 := WorkMatrix [I, J-1] + 1;
    Val3 := WorkMatrix[I-1, J-1] + Cost;
    if (Val1 < Val2) then
      if (Val1 < Val3) then
        WorkMatrix [I, J] := Val1
      else
        WorkMatrix [I, J] := Val3
    else
      if (Val2 < Val3) then
        WorkMatrix [I, J] := Val2
      else
        WorkMatrix [I, J] := Val3;
    end;
Result := WorkMatrix [Length1, Length2];
end;