Handling special characters (umlauts) in Lua file paths

Time: 2016-04-19 11:28:46

Tags: lua windows-server-2008-r2 lua-5.1

I have a small Lua function that checks whether a file exists:

function file_exists( filePath )
    local handler = io.open( filePath )
    if handler then
        io.close( handler )
        return true
    end
    return false
end

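However, this always returns false when the file path contains special characters such as German umlauts (äöü). Is there a way to solve this?

Thanks a lot!

2 answers:

Answer 0 (score: 1)

Lua and its tiny standard library are platform-neutral and do not know about the Windows functions needed to read full Unicode file names. You can use the winapi module to get some Windows-specific functionality for this task. Note that this approach requires short (8.3) name generation to be enabled on the target drive.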

local winapi = require( "winapi" )

local handler = io.open( winapi.short_path(filePath) )
if handler then
    -- etc
end

It can also be installed easily through LuaRocks: luarocks install winapi
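For completeness, here is a rough sketch of how the original file_exists could be combined with the short-path idea. It is only an illustration: loading winapi through pcall and falling back to the unmodified path are my own choices, and I am assuming that winapi.short_path may fail or return nil when no short name is available.

```lua
-- winapi is treated as optional: pcall keeps the script usable where the
-- module is not installed (for example on non-Windows machines).
local has_winapi, winapi = pcall(require, "winapi")

local function file_exists(filePath)
    local path = filePath
    if has_winapi and winapi.short_path then
        -- Assumption: short_path may raise or return nil when no 8.3 name
        -- exists, so the call is protected and the result is checked.
        local called, short = pcall(winapi.short_path, filePath)
        if called and short then
            path = short
        end
    end
    local handler = io.open(path)
    if handler then
        handler:close()
        return true
    end
    return false
end
```

Without winapi this behaves exactly like the function in the question, so it only helps on systems where the module is present and short names are enabled.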

Answer 1 (score: 1)

utf8_to_cp1252 = (
   function(cp1252_description)
      local unicode_to_1252 = {}
      for code, unicode in cp1252_description:gmatch'\n0x(%x%x)%s+0x(%x+)' do
         unicode_to_1252[tonumber(unicode, 16)] = tonumber(code, 16)
      end
      local undefined = ('?'):byte()
      return
         function (utf8str)
            local pos, result = 1, {}
            while pos <= #utf8str do
               local code, size = utf8str:byte(pos, pos), 1
               if code >= 0xC0 and code < 0xFE then
                  local mask = 64
                  code = code - 128
                  repeat
                     local next_byte = utf8str:byte(pos+size, pos+size) or 0
                     if next_byte >= 0x80 and next_byte < 0xC0 then
                        code, size = (code - mask - 2) * 64 + next_byte, size+1
                     else
                        code, size = utf8str:byte(pos, pos), 1
                     end
                     mask = mask * 32
                  until code < mask
               end
               pos = pos + size
               table.insert(result, 
                  string.char(unicode_to_1252[code] or undefined))
            end
            return table.concat(result)
         end
   end
)[[
download 
http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1252.TXT
and insert the whole text here:

#
#    Name:     cp1252 to Unicode table
#    Unicode version: 2.0
#    Table version: 2.01
..................................
0xFD    0x00FD  #LATIN SMALL LETTER Y WITH ACUTE
0xFE    0x00FE  #LATIN SMALL LETTER THORN
0xFF    0x00FF  #LATIN SMALL LETTER Y WITH DIAERESIS
]]

Usage:

cp1252_filename = utf8_to_cp1252(your_utf8_filename)

Now you can use cp1252_filename to call io.open(), os.rename(), os.execute(), and other functions from the standard Lua library.
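To see what the conversion does for umlauts specifically, here is a much smaller sketch that needs no mapping table. It relies on the fact that for U+00A0 through U+00FF the CP1252 byte value equals the Unicode code point. The name utf8_latin1_to_cp1252 is hypothetical, the function does not handle the rest of CP1252 (for example the euro sign), and it assumes the system ANSI code page is 1252.

```lua
-- Minimal sketch: convert UTF-8 to CP1252 for ASCII and U+00A0..U+00FF only.
-- Every byte that is not part of a supported sequence becomes '?'.
local function utf8_latin1_to_cp1252(s)
    local result, pos = {}, 1
    while pos <= #s do
        local b = s:byte(pos)
        local nxt = s:byte(pos + 1)
        if b < 128 then
            -- plain ASCII is identical in both encodings
            result[#result + 1] = string.char(b)
            pos = pos + 1
        elseif (b == 194 or b == 195) and nxt and nxt >= 128 and nxt < 192 then
            -- two-byte sequence (lead byte 0xC2 or 0xC3): code point U+0080..U+00FF
            local code = (b - 192) * 64 + (nxt - 128)
            result[#result + 1] = code >= 160 and string.char(code) or "?"
            pos = pos + 2
        else
            -- anything else is outside the range this sketch supports
            result[#result + 1] = "?"
            pos = pos + 1
        end
    end
    return table.concat(result)
end

-- "Müller.txt" written with decimal escapes so the source also loads in Lua 5.1
local cp1252_name = utf8_latin1_to_cp1252("M\195\188ller.txt")
```

The resulting string can then be passed to io.open() in the same way as cp1252_filename above. For anything beyond German umlauts and other Latin-1 letters, the table-driven version is the safer choice.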