Question

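We are trying to write a UDF in Delphi (10 Seattle) for our Firebird 2.5 database that should remove some characters from the input string. All string fields in the database use the character set UTF8 and the collation UNICODE_CI_AI.

The function should remove characters such as space, . ; : / \ and a few others from the string. It works fine for strings that contain only characters with an ASCII value <= 127. As soon as the string contains a character with a value above 127, the UDF fails. We tried using PChar instead of PAnsiChar parameters, but without success. For now we check whether a character has a value above 127 and, if so, remove that character from the string as well.

What we actually want is a UDF that returns the original string without the punctuation characters.

This is our code so far: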

    unit UDFs;

    interface

    uses ib_util;

    function UDF_RemovePunctuations(InputString: PAnsiChar): PAnsiChar; cdecl;

    implementation

    uses SysUtils, AnsiStrings, Classes;

    //FireBird declaration:
    //DECLARE EXTERNAL FUNCTION UDF_REMOVEPUNCTUATIONS
    //  CSTRING(500)
    //RETURNS CSTRING(500) FREE_IT
    //ENTRY_POINT 'UDF_RemovePunctuations' MODULE_NAME 'FB_UDF.dll';
    function UDF_RemovePunctuations(InputString: PAnsiChar): PAnsiChar;
    const
      PunctuationChars = [' ', ',', '.', ';', '/', '\', '''', '"','(', ')'];
    var
      I: Integer;
      S, NewS: String;
    begin
      S := UTF8ToUnicodeString(InputString);

      For I := 1 to Length(S) do
      begin
        If Not CharInSet(S[I], PunctuationChars)
        then begin
          If S[I] <= #127
          then NewS := NewS + S[I];
        end;
      end;

      Result := ib_util_malloc(Length(NewS) + 1);
      NewS := NewS + #0;
      AnsiStrings.StrPCopy(Result, NewS);
    end;

    end.

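When we remove the check for values <= #127, we can see that NewS contains all the characters it should (without the punctuation characters, of course), but we think something goes wrong in StrPCopy.

Any help would be greatly appreciated!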

Answer 1

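Thanks to LU RD I got it working.

The answer was to declare my string variable as Utf8String instead of String, and not to convert the input string to Unicode. The bytes then stay UTF-8 from input to output, and because every byte of a multi-byte UTF-8 sequence is above 127, deleting ASCII punctuation bytes cannot break a non-ASCII character.

I have adjusted my code: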

    //FireBird declaration:
    //DECLARE EXTERNAL FUNCTION UDF_REMOVEPUNCTUATIONS
    //  CSTRING(500)
    //RETURNS CSTRING(500) FREE_IT
    //ENTRY_POINT 'UDF_RemovePunctuations' MODULE_NAME 'CarfacPlus_UDF.dll';
    function UDF_RemovePunctuations(InputString: PAnsiChar): PAnsiChar;
    const
      PunctuationChars = [' ', ',', '.', ';', '/', '\', '''', '"','(', ')', '-',
                          '+', ':', '<', '>', '=', '[', ']', '{', '}'];
    var
      I: Integer;
      S: Utf8String;
    begin
      S := InputString;

      For I := Length(S) downto 1 do
        If CharInSet(S[I], PunctuationChars)
        then Delete(S, I, 1);

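      // Copy the UTF-8 bytes and the terminating #0 as-is; S is not cast to
      // AnsiString, so no code page conversion can occur on the way out.
      Result := ib_util_malloc(Length(S) + 1);
      Move(PAnsiChar(S)^, Result^, Length(S) + 1);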
    end;
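
To check the byte-level behaviour outside Firebird, the same idea can be tried in a small console program. This is only a sketch under assumptions: `StripAsciiPunctuation` and `RawBytes` are hypothetical names that are not part of the UDF above, the punctuation set is shortened, and the input is built from raw bytes so that the source file encoding does not matter.

```pascal
program StripDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper, not part of the UDF above: removes ASCII punctuation
// bytes from a UTF-8 encoded string. Every byte of a multi-byte UTF-8
// sequence is >= $80, so it can never match one of these ASCII characters.
function StripAsciiPunctuation(const Input: UTF8String): UTF8String;
const
  PunctuationChars = [' ', ',', '.', ';', ':', '/', '\'];
var
  I, Count: Integer;
begin
  SetLength(Result, Length(Input));
  Count := 0;
  for I := 1 to Length(Input) do
    if not (Input[I] in PunctuationChars) then
    begin
      Inc(Count);
      Result[Count] := Input[I];
    end;
  SetLength(Result, Count);
end;

const
  // 'a.b', then U+00E9 encoded as the two bytes $C3 $A9, then ';c'
  RawBytes: array[0..6] of Byte = ($61, $2E, $62, $C3, $A9, $3B, $63);
var
  S, Cleaned: UTF8String;
  I: Integer;
begin
  // Build the input from raw bytes, without any code page conversion.
  SetString(S, PAnsiChar(@RawBytes[0]), Length(RawBytes));
  Cleaned := StripAsciiPunctuation(S);

  // Print hex bytes so the output does not depend on the console code page.
  for I := 1 to Length(Cleaned) do
    Write(IntToHex(Ord(Cleaned[I]), 2), ' ');
  Writeln;
end.
```

The '.' and ';' bytes are dropped, while the two bytes $C3 $A9 of the non-ASCII character stay together and untouched. Building the result in one pass also avoids calling Delete repeatedly, which may matter for longer strings.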

Delphi Firebird UDF with UTF8 strings
