Why is the UTF-8 BOM lost when changing file attributes?

Time: 2012-10-15 19:16:43

Tags: delphi utf-8 byte-order-mark file-attributes

I have a test Delphi application that uses TFileStream to write a UTF-8 BOM to a text file, followed by a line of dummy text.

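Everything works as expected, and with the hex viewer plugin for Notepad++ I can see the BOM in the output text file. However, if I change the text file's attributes (either programmatically in Delphi or through Windows Explorer), the BOM is gone when I reopen the file.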

Sample code that writes the BOM and dummy data to the file:

procedure TForm1.Button1Click(Sender: TObject);
const
  cFilename = 'myfile.txt';
var
  fs : TFileStream;
  gBOM : TBytes;
  gStr : RawByteString;
begin
  fs := TFileStream.Create(cFilename, fmCreate, fmShareDenyWrite);
  try
    gBOM := TEncoding.UTF8.GetPreamble;
    fs.WriteBuffer(PAnsiChar(gBOM)^, Length(gBOM));

    // Dummy data
    gStr := UTF8Encode('Dummy string') + AnsiChar(#13) + AnsiChar(#10);
    fs.WriteBuffer(PAnsiChar(gStr)^, Length(gStr));

    // If you read the file now, the BOM will be present. However,
    // the following line appears to remove it.
    FileSetAttr(cFilename, faReadOnly);

  finally
    FreeAndNil(fs);
  end;
end;

1 Answer:

Answer 0 (score: 4)

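Setting file attributes does not affect the existing contents of a file. The only way the BOM can disappear is if the file's contents are copied to a new file that omits the BOM. Setting attributes does not do that.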
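
One way to confirm this without a hex viewer is to read the first bytes of the file back and compare them with the UTF-8 preamble. The following is a minimal sketch; the helper name FileHasUtf8Bom is hypothetical and not part of the RTL, and it assumes SysUtils and Classes are in the uses clause.

```delphi
// Assumes: uses SysUtils, Classes;
// FileHasUtf8Bom is a hypothetical helper, not an RTL routine.
// Returns True if the file starts with the UTF-8 preamble (EF BB BF).
function FileHasUtf8Bom(const AFileName: string): Boolean;
var
  fs: TFileStream;
  Preamble, Head: TBytes;
begin
  Preamble := TEncoding.UTF8.GetPreamble;
  SetLength(Head, Length(Preamble));
  // Open read-only and without locking so other handles are not disturbed
  fs := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyNone);
  try
    // Read may return fewer bytes than requested if the file is too short
    Result := (fs.Read(Head[0], Length(Head)) = Length(Head)) and
      CompareMem(@Head[0], @Preamble[0], Length(Preamble));
  finally
    fs.Free;
  end;
end;
```

Calling it on the same full path before and after FileSetAttr() should give the same result both times. If the results differ, something other than the attribute change is rewriting the file.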

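Keep in mind that you are using a relative file path, so you may have multiple copies of the file on your machine and be looking at the wrong one. Always use full paths.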
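
To see which file a relative name really points at, the name can be expanded against the current directory. This is only a diagnostic sketch; ShowResolvedPath is a hypothetical name, and it assumes SysUtils and Dialogs are in the uses clause.

```delphi
// Assumes: uses SysUtils, Dialogs;
// ShowResolvedPath is a hypothetical diagnostic helper.
procedure ShowResolvedPath(const AFileName: string);
begin
  // ExpandFileName resolves the name against the process's current directory,
  // which is not necessarily the folder that contains the executable.
  ShowMessage(Format('"%s" resolves to "%s"',
    [AFileName, ExpandFileName(AFileName)]));
end;
```

For example, ShowResolvedPath('myfile.txt') inside the button handler shows the path the stream is actually created at, which can then be compared with the file opened in Notepad++.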

A simpler way to write the BOM and text to a file using TEncoding is to use the TStreamWriter class.

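You should call FileSetAttr() after the file has been closed to ensure it actually takes effect, and you need to call FileGetAttr() before calling FileSetAttr() so that the existing attributes are preserved correctly.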

Try this instead:

procedure TForm1.Button1Click(Sender: TObject); 
const 
  cFilename = 'c:\path to\myfile.txt'; 
var 
  sw : TStreamWriter;
  Attrs: Integer; 
begin 
  sw := TStreamWriter.Create(cFilename, False, TEncoding.UTF8); 
  try 
    sw.WriteLine('Dummy string');
  finally 
    sw.Free; 
  end; 
  Attrs := FileGetAttr(cFilename);
  if Attrs <> -1 then 
    FileSetAttr(cFilename, Attrs or faReadOnly); 
end; 

Alternatively:

// GetFileInformationByHandle() is declared in Windows.pas, but SetFileInformationByHandle() is not!

type
  _FILE_INFO_BY_HANDLE_CLASS = (
    FileBasicInfo,
    FileStandardInfo,
    FileNameInfo,
    FileRenameInfo,
    FileDispositionInfo,
    FileAllocationInfo,
    FileEndOfFileInfo,
    FileStreamInfo,
    FileCompressionInfo,
    FileAttributeTagInfo,
    FileIdBothDirectoryInfo
  );
  FILE_INFO_BY_HANDLE_CLASS = _FILE_INFO_BY_HANDLE_CLASS;

  _FILE_BASIC_INFO = record
    CreationTime: LARGE_INTEGER;
    LastAccessTime: LARGE_INTEGER;
    LastWriteTime: LARGE_INTEGER;
    ChangeTime: LARGE_INTEGER;
    FileAttributes: DWORD;
  end;
  FILE_BASIC_INFO = _FILE_BASIC_INFO;

function SetFileInformationByHandle(hFile: THandle; FileInformationClass: FILE_INFO_BY_HANDLE_CLASS; lpFileInformation: Pointer; dwBufferSize: DWORD): BOOL; stdcall; external 'kernel32' delayed;

procedure TForm1.Button1Click(Sender: TObject); 
const 
  cFilename = 'c:\path to\myfile.txt'; 
var 
  sw : TStreamWriter;
  fi: TByHandleFileInformation;
  bi: FILE_BASIC_INFO;
  Attrs: Integer;
  AttrsSet: Boolean;
begin 
  AttrsSet := False;

  sw := TStreamWriter.Create(cFilename, False, TEncoding.UTF8); 
  try 
    sw.WriteLine('Dummy string');

    if CheckWin32Version(6, 0) then
    begin
      if GetFileInformationByHandle(TFileStream(sw.BaseStream).Handle, fi) then
      begin
        bi.CreationTime.LowPart := fi.ftCreationTime.dwLowDateTime;
        bi.CreationTime.HighPart := fi.ftCreationTime.dwHighDateTime;

        bi.LastAccessTime.LowPart := fi.ftLastAccessTime.dwLowDateTime;
        bi.LastAccessTime.HighPart := fi.ftLastAccessTime.dwHighDateTime;

        bi.LastWriteTime.LowPart := fi.ftLastWriteTime.dwLowDateTime;
        bi.LastWriteTime.HighPart := fi.ftLastWriteTime.dwHighDateTime;

        bi.ChangeTime := bi.LastWriteTime;

        bi.FileAttributes := fi.dwFileAttributes or FILE_ATTRIBUTE_READONLY;
        AttrsSet := SetFileInformationByHandle(TFileStream(sw.BaseStream).Handle, FileBasicInfo, @bi, SizeOf(bi));
      end;
    end;
  finally 
    sw.Free; 
  end; 

  if not AttrsSet then
  begin
    Attrs := FileGetAttr(cFilename);
    if Attrs <> -1 then 
      FileSetAttr(cFilename, Attrs or faReadOnly); 
  end;
end;