How to support 3-byte UTF-8 characters in ANT

Time: 2012-06-18 19:38:04

Tags: ant utf-8

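I am trying to support UTF-8 characters in an ANT script.

As long as the strings are made up of 2-byte UTF-8 characters, for example:

  • Lògìñ
  • ÙsèrÌÐ

everything works fine.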

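The problem starts when I use a Unicode Han character such as U+6211 (我). According to this website: http://www.fileformat.info/info/unicode/char/6211/index.htm it has the UTF-8 encoding 0xE6 0x88 0x91.

In UltraEdit I can see that my input properties file contains the bytes "E6 88 91" in a row, so I am confident that my input is encoded correctly. When I open the same file in Notepad++, all characters display correctly.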
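The byte values quoted above can be checked independently of Ant. This is only a small sketch; the class name Utf8Bytes and the helper toHex are made up for illustration. It encodes a string as UTF-8 and prints the bytes in hex, using Java Unicode escapes so the result does not depend on how the source file itself is saved.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Bytes {
    // Returns the UTF-8 encoding of s as space-separated uppercase hex bytes.
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex("\u6211")); // Han character, prints E6 88 91
        System.out.println(toHex("\u00F2")); // o with grave accent, prints C3 B2
    }
}
```

The Han character needs three bytes while the accented Latin letters need two, which is the difference between the strings that work and the ones that do not.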

Here is my build script:

<?xml version="1.0" encoding="UTF-8" ?>

<project
    name="utf8test"
    default="all"
    basedir=".">

    <target name="all">
        <loadproperties  encoding="UTF-8" srcfile="./apps.properties.all.txt"  />

        <echo>No encoding ${common.app.name}</echo>
        <echo encoding="UTF-8">UTF-8 ${common.app.name}</echo>
        <echo encoding="UnicodeLittle">UnicodeLittle ${common.app.name}</echo>
        <echo encoding="UnicodeLittleUnmarked">UnicodeLittleUnmarked ${common.app.name}</echo>
        <echo>${common.app.ServerName}</echo>
        <echo>${bb.vendor}</echo>

        <echo>No encoding ${common.app.UserIdText}</echo>
        <echo encoding="UTF-8">UTF-8 ${common.app.UserIdText}</echo>
        <echo encoding="UnicodeLittle">UnicodeLittle ${common.app.UserIdText}</echo>
        <echo encoding="UnicodeLittleUnmarked">UnicodeLittleUnmarked ${common.app.UserIdText}</echo>

        <echoproperties />
      </target>
    </project>

Here is my properties file:

common.app=VrvPsLTst
common.app.name=我們
common.app.description=Pseudo Loc Test App for Build Script testing
common.app.ServerName=http://Vèrìvò.com
bb.vendor=Vèrìvò
common.app.PasswordText=Pàsswòrð
bb.override.list=MP_COPYRIGHTTEXT, "Çòpÿrìght 2012 Vèrívó Bùîlð TéàM"
common.app.LoginButtonText=Lògìñ
common.app.UserIdText=Ùsèr ÌÐ
bb.SMSSuccess=Mèssàgéß Sùççêssfúllÿ Sëñt
common.app.LoginScreenMessage=WèlçòMé Mêssàgë
common.app.LoginProgressMessage=Àùthèñtìçàtíòñ îñ prógréss...
ios.RegistrationText=Règìstràtíòñ Téxt
ios.RegistrationURL=http://www.josscrowcroft.com/2011/code/utf-8-multibyte-characters-in-url-parameters-%E2%9C%93/

Here is the output:

Buildfile: C:\Temp\utf8\build.xml

all:
     [echo] No encoding ??
     [echo] UTF-8 ??
     [echo] ÿþU n i c o d e L i t t l e   ? ? 
     [echo] U n i c o d e L i t t l e U n m a r k e d   ? ? 
     [echo] http://Vèrìvò.com
     [echo] Vèrìvò
     [echo] No encoding Ùsèr ÌÐ
     [echo] UTF-8 Ùsèr Ì�
     [echo] ÿþU n i c o d e L i t t l e   Ù s è r   Ì Ð 
     [echo] U n i c o d e L i t t l e U n m a r k e d   Ù s è r   Ì Ð 
[echoproperties] #Ant properties
[echoproperties] #Mon Jun 18 15:25:13 EDT 2012
[echoproperties] ant.core.lib=C\:\\ant\\lib\\ant.jar
[echoproperties] ant.file=C\:\\Temp\\utf8\\build.xml
[echoproperties] ant.file.type=file
[echoproperties] ant.file.type.utf8test=file
[echoproperties] ant.file.utf8test=C\:\\Temp\\utf8\\build.xml
[echoproperties] ant.home=c\:\\ant\\bin\\..
[echoproperties] ant.java.version=1.6
[echoproperties] ant.library.dir=C\:\\ant\\lib
[echoproperties] ant.project.default-target=all
[echoproperties] ant.project.invoked-targets=all
[echoproperties] ant.project.name=utf8test
[echoproperties] ant.version=Apache Ant version 1.8.1 compiled on April 30 2010
[echoproperties] awt.toolkit=sun.awt.windows.WToolkit
[echoproperties] basedir=C\:\\Temp\\utf8
[echoproperties] bb.SMSSuccess=M\u00E8ss\u00E0g\u00E9\u00DF S\u00F9\u00E7\u00E7\u00EAssf\u00FAll\u00FF S\u00EB\u00F1t
[echoproperties] bb.override.list=MP_COPYRIGHTTEXT, "\u00C7\u00F2p\u00FFr\u00ECght 2012 V\u00E8r\u00EDv\u00F3 B\u00F9\u00EEl\u00F0 T\u00E9\u00E0?"
[echoproperties] bb.vendor=V\u00E8r\u00ECv\u00F2
[echoproperties] common.app=VrvPsLTst
[echoproperties] common.app.LoginButtonText=L\u00F2g\u00EC\u00F1
[echoproperties] common.app.LoginProgressMessage=\u00C0\u00F9th\u00E8\u00F1t\u00EC\u00E7\u00E0t\u00ED\u00F2\u00F1 \u00EE\u00F1 pr\u00F3gr\u00E9ss...
[echoproperties] common.app.LoginScreenMessage=W\u00E8l\u00E7\u00F2?\u00E9 M\u00EAss\u00E0g\u00EB
[echoproperties] common.app.PasswordText=P\u00E0ssw\u00F2r\u00F0
[echoproperties] common.app.ServerName=http\://V\u00E8r\u00ECv\u00F2.com
[echoproperties] common.app.UserIdText=\u00D9s\u00E8r \u00CC\u00D0
[echoproperties] common.app.description=Pseudo Loc Test App for Build Script testing
[echoproperties] common.app.name=??
[echoproperties] file.encoding=Cp1252
[echoproperties] file.encoding.pkg=sun.io
[echoproperties] file.separator=\\
[echoproperties] ios.RegistrationText=R\u00E8g\u00ECstr\u00E0t\u00ED\u00F2\u00F1 T\u00E9xt
[echoproperties] ios.RegistrationURL=http\://www.josscrowcroft.com/2011/code/utf-8-multibyte-characters-in-url-parameters-%E2%9C%93/
[echoproperties] java.awt.graphicsenv=sun.awt.Win32GraphicsEnvironment
[echoproperties] java.awt.printerjob=sun.awt.windows.WPrinterJob
[echoproperties] java.class.path=c\:\\ant\\bin\\..\\lib\\ant-launcher.jar;C\:\\Temp\\utf8\\.\\;C\:\\Program Files (x86)\\Java\\jre7\\lib\\ext\\QTJava.zip;C\:\\ant\\lib\\ant-antlr.jar;C\:\\ant\\lib\\ant-apache-bcel.jar;C\:\\ant\\lib\\ant-apache-bsf.jar;C\:\\ant\\lib\\ant-apache-log4j.jar;C\:\\ant\\lib\\ant-apache-oro.jar;C\:\\ant\\lib\\ant-apache-regexp.jar;C\:\\ant\\lib\\ant-apache-resolver.jar;C\:\\ant\\lib\\ant-apache-xalan2.jar;C\:\\ant\\lib\\ant-commons-logging.jar;C\:\\ant\\lib\\ant-commons-net.jar;C\:\\ant\\lib\\ant-contrib-1.0b3.jar;C\:\\ant\\lib\\ant-jai.jar;C\:\\ant\\lib\\ant-javamail.jar;C\:\\ant\\lib\\ant-jdepend.jar;C\:\\ant\\lib\\ant-jmf.jar;C\:\\ant\\lib\\ant-jsch.jar;C\:\\ant\\lib\\ant-junit.jar;C\:\\ant\\lib\\ant-launcher.jar;C\:\\ant\\lib\\ant-netrexx.jar;C\:\\ant\\lib\\ant-nodeps.jar;C\:\\ant\\lib\\ant-starteam.jar;C\:\\ant\\lib\\ant-stylebook.jar;C\:\\ant\\lib\\ant-swing.jar;C\:\\ant\\lib\\ant-testutil.jar;C\:\\ant\\lib\\ant-trax.jar;C\:\\ant\\lib\\ant-weblogic.jar;C\:\\ant\\lib\\ant.jar;C\:\\ant\\lib\\bb-ant-tools.jar;C\:\\ant\\lib\\xercesImpl.jar;C\:\\ant\\lib\\xml-apis.jar;C\:\\Program Files\\Java\\jre7\\lib\\tools.jar
[echoproperties] java.class.version=51.0
[echoproperties] java.endorsed.dirs=C\:\\Program Files\\Java\\jre7\\lib\\endorsed
[echoproperties] java.ext.dirs=C\:\\Program Files\\Java\\jre7\\lib\\ext;C\:\\Windows\\Sun\\Java\\lib\\ext
[echoproperties] java.home=C\:\\Program Files\\Java\\jre7
[echoproperties] java.io.tmpdir=C\:\\Users\\efelton\\AppData\\Local\\Temp\\
[echoproperties] java.library.path=C\:\\Windows\\SYSTEM32;C\:\\Windows\\Sun\\Java\\bin;C\:\\Windows\\system32;C\:\\Windows;C\:\\Windows\\SYSTEM32;C\:\\Windows;C\:\\Windows\\SYSTEM32\\WBEM;C\:\\Windows\\SYSTEM32\\WINDOWSPOWERSHELL\\V1.0\\;C\:\\PROGRAM FILES\\INTEL\\WIFI\\BIN\\;C\:\\PROGRAM FILES\\COMMON FILES\\INTEL\\WIRELESSCOMMON\\;C\:\\PROGRAM FILES (X86)\\MICROSOFT SQL SERVER\\100\\TOOLS\\BINN\\;C\:\\PROGRAM FILES\\MICROSOFT SQL SERVER\\100\\TOOLS\\BINN\\;C\:\\PROGRAM FILES\\MICROSOFT SQL SERVER\\100\\DTS\\BINN\\;C\:\\PROGRAM FILES (X86)\\MICROSOFT SQL SERVER\\100\\TOOLS\\BINN\\VSSHELL\\COMMON7\\IDE\\;C\:\\PROGRAM FILES (X86)\\MICROSOFT SQL SERVER\\100\\DTS\\BINN\\;C\:\\Program Files\\ThinkPad\\Bluetooth Software\\;C\:\\Program Files\\ThinkPad\\Bluetooth Software\\syswow64;C\:\\Program Files (x86)\\QuickTime\\QTSystem\\;C\:\\Program Files (x86)\\AccuRev\\bin;C\:\\Program Files\\Java\\jdk1.7.0_04\\bin;C\:\\Program Files (x86)\\IDM Computer Solutions\\UltraEdit\\;.
[echoproperties] java.runtime.name=Java(TM) SE Runtime Environment
[echoproperties] java.runtime.version=1.7.0_04-b22
[echoproperties] java.specification.name=Java Platform API Specification
[echoproperties] java.specification.vendor=Oracle Corporation
[echoproperties] java.specification.version=1.7
[echoproperties] java.vendor=Oracle Corporation
[echoproperties] java.vendor.url=http\://java.oracle.com/
[echoproperties] java.vendor.url.bug=http\://bugreport.sun.com/bugreport/
[echoproperties] java.version=1.7.0_04
[echoproperties] java.vm.info=mixed mode
[echoproperties] java.vm.name=Java HotSpot(TM) 64-Bit Server VM
[echoproperties] java.vm.specification.name=Java Virtual Machine Specification
[echoproperties] java.vm.specification.vendor=Oracle Corporation
[echoproperties] java.vm.specification.version=1.7
[echoproperties] java.vm.vendor=Oracle Corporation
[echoproperties] java.vm.version=23.0-b21
[echoproperties] line.separator=\r\n
[echoproperties] os.arch=amd64
[echoproperties] os.name=Windows 7
[echoproperties] os.version=6.1
[echoproperties] path.separator=;
[echoproperties] sun.arch.data.model=64
[echoproperties] sun.boot.class.path=C\:\\Program Files\\Java\\jre7\\lib\\resources.jar;C\:\\Program Files\\Java\\jre7\\lib\\rt.jar;C\:\\Program Files\\Java\\jre7\\lib\\sunrsasign.jar;C\:\\Program Files\\Java\\jre7\\lib\\jsse.jar;C\:\\Program Files\\Java\\jre7\\lib\\jce.jar;C\:\\Program Files\\Java\\jre7\\lib\\charsets.jar;C\:\\Program Files\\Java\\jre7\\lib\\jfr.jar;C\:\\Program Files\\Java\\jre7\\classes
[echoproperties] sun.boot.library.path=C\:\\Program Files\\Java\\jre7\\bin
[echoproperties] sun.cpu.endian=little
[echoproperties] sun.cpu.isalist=amd64
[echoproperties] sun.desktop=windows
[echoproperties] sun.io.unicode.encoding=UnicodeLittle
[echoproperties] sun.java.command=org.apache.tools.ant.launch.Launcher -cp .;C\:\\Program Files (x86)\\Java\\jre7\\lib\\ext\\QTJava.zip
[echoproperties] sun.java.launcher=SUN_STANDARD
[echoproperties] sun.jnu.encoding=Cp1252
[echoproperties] sun.management.compiler=HotSpot 64-Bit Tiered Compilers
[echoproperties] sun.os.patch.level=Service Pack 1
[echoproperties] user.country=US
[echoproperties] user.dir=C\:\\Temp\\utf8
[echoproperties] user.home=C\:\\Users\\efelton
[echoproperties] user.language=en
[echoproperties] user.name=efelton
[echoproperties] user.script=
[echoproperties] user.timezone=
[echoproperties] user.variant=

BUILD SUCCESSFUL
Total time: 1 second

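Thanks for your help.

Edit/Update 6/19/2012

I am developing in a Windows environment.

I installed the TTF from: http://freedesktop.org/wiki/Software/CJKUnifonts/Download

I updated UltraEdit to use the TTF, and I can now see the Chinese characters.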

<?xml version="1.0" encoding="UTF-8" ?>

<project name="utf8test" default="all" basedir="."> 

   <target name="all">        

      <echo>我們</echo>
      <echo encoding="ISO-8859-1">ISO-8859-1 我們</echo> 
      <echo encoding="UTF-8">UTF-8 我們</echo> 


      <echo file="echo_output.txt" append="true" >我們 ${line.separator}</echo>
      <echo file="echo_output.txt" append="true"  encoding="ISO-8859-1">ISO-8859-1 我們 ${line.separator}</echo> 
      <echo file="echo_output.txt" append="true"  encoding="UTF-8">UTF-8 我們 ${line.separator}</echo> 
      <echo file="echo_output.txt" append="true"  encoding="UnicodeLittle">UnicodeLittle 我們 ${line.separator}</echo> 
      <echo file="echo_output.txt" append="true"  encoding="UnicodeLittleUnmarked">UnicodeLittleUnmarked 我們 ${line.separator}</echo> 

   </target> 
</project> 

The output captured when running in UltraEdit is:

    Buildfile: E:\temp\utf8\build.xml

    all:
         [echo] ??
         [echo] ISO-8859-1 ??
         [echo] UTF-8 ??

    BUILD SUCCESSFUL
    Total time: 1 second

The echo_output.txt file shows the following:

    ?? 
    ISO-8859-1 ?? 
    UTF-8 ?? 
    ÿþU n i c o d e L i t t l e   ? ?   

     U n i c o d e L i t t l e U n m a r k e d   ? ?   

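So there seems to be something fundamentally wrong with how my ANT environment is set up, since I cannot simply echo the characters to the screen or to a file.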
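One thing that may contribute to the ?? on the console: the echoproperties dump above reports file.encoding=Cp1252, and Cp1252 has no Han characters. The sketch below (the class name Cp1252Demo and the method roundTrip are hypothetical) shows what Java does when a string is encoded with a charset that cannot represent it. It does not explain why the file written with encoding="UTF-8" also shows ??, so treat it as a partial picture rather than a diagnosis.

```java
import java.nio.charset.Charset;

final class Cp1252Demo {
    // Encodes s with the named charset and decodes it again, which is the
    // round trip text goes through when it is written with that encoding.
    static String roundTrip(String s, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        // getBytes replaces unmappable characters with the charset's
        // replacement byte, which is '?' for windows-1252.
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // The two Han characters from common.app.name, written as escapes.
        System.out.println(roundTrip("\u6211\u5011", "windows-1252")); // prints ??

        // The accented Latin text from common.app.UserIdText survives intact.
        String latin = "\u00D9s\u00E8r \u00CC\u00D0";
        System.out.println(latin.equals(roundTrip(latin, "windows-1252"))); // prints true
    }
}
```

This matches the pattern in the output: the accented Latin values come through, while each Han character becomes a single question mark.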

2 answers:

Answer 0: (score: 0)

The java.util.Properties class uses ISO 8859-1 encoding. The following works fine when tested with Ant 1.8.2.

build.xml

<?xml version="1.0" encoding="UTF-8" ?>
<project name="utf8test" default="all" basedir=".">

<target name="all">
  <loadproperties encoding="ISO-8859-1" srcfile="./apps.properties.all.txt"  />

  <echo>No encoding ${common.app.name}</echo>
  <echo encoding="ISO-8859-1">ISO-8859-1 ${common.app.name}</echo>
</target>
</project>

Output

all:
     [echo] No encoding æå
     [echo] ISO-8859-1 我們

BUILD SUCCESSFUL
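The ISO 8859-1 behaviour this answer relies on can be reproduced with plain java.util.Properties, outside Ant. The following is a sketch under that assumption only; it says nothing about how Ant's loadproperties is implemented, and the class and method names are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEncodingDemo {
    // Loads from a byte stream: Properties decodes the bytes as ISO 8859-1.
    static String loadAsStream(byte[] data, String key) {
        Properties p = new Properties();
        try {
            p.load(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return p.getProperty(key);
    }

    // Loads through a Reader that decodes the same bytes as UTF-8.
    static String loadAsUtf8Reader(byte[] data, String key) {
        Properties p = new Properties();
        try {
            p.load(new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return p.getProperty(key);
    }

    // Undoes the mis-decoding: back to the raw bytes, then decode as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = "common.app.name=\u6211\u5011\n".getBytes(StandardCharsets.UTF_8);
        String garbled = loadAsStream(data, "common.app.name");
        System.out.println(garbled.length());                                   // prints 6
        System.out.println("\u6211\u5011".equals(repair(garbled)));             // prints true
        System.out.println("\u6211\u5011".equals(loadAsUtf8Reader(data, "common.app.name"))); // prints true
    }
}
```

Read as ISO 8859-1, the six UTF-8 bytes become six separate characters, which is the garbled "No encoding" line in the output above. Writing those characters back out as ISO 8859-1 restores the original bytes, which is presumably why the second echo displays correctly on a terminal that interprets its input as UTF-8.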

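Answer 1: (score: 0)

I solved my problem on both Windows and MacOS by first passing all input (the properties files) through the encoder below. Once the input has been escaped this way, ANT reads the values correctly and writes them back out correctly.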

 // Requires java.io.* and org.apache.tools.ant.BuildException.
 // input and dest are the source and destination files, defined elsewhere in the task.
 StringBuilder buffer = new StringBuilder();
 try
 {
     // Decode the input as UTF-8 regardless of the platform default encoding.
     Reader in = new BufferedReader(
             new InputStreamReader(new FileInputStream(input), "UTF-8"));
     try
     {
         int ch;
         while ((ch = in.read()) > -1)
         {
             if (ch > 127)
             {
                 // Escape every non-ASCII char as backslash-u plus four hex digits.
                 String hex = Integer.toHexString(ch);
                 buffer.append("\\u");
                 for (int i = hex.length(); i < 4; i++)
                 {
                     buffer.append('0');
                 }
                 buffer.append(hex);
             }
             else if (ch != 0)
             {
                 // Plain ASCII is copied through; NUL characters are dropped.
                 buffer.append((char) ch);
             }
         }
     }
     finally
     {
         in.close();
     }
 }
 catch (IOException e)
 {
     throw new BuildException(e.getMessage());
 }

 try
 {
     // The buffer now holds only ASCII, so windows-1252 writes it unchanged.
     Writer out = new OutputStreamWriter(new FileOutputStream(dest), "windows-1252");
     try
     {
         out.write(buffer.toString());
     }
     finally
     {
         out.close();
     }
 }
 catch (IOException e)
 {
     throw new BuildException(e.getMessage());
 }
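To see why escaping the input helps, here is a compact, self-contained version of the same idea together with a round trip through java.util.Properties. It is a sketch, not the code from this answer: the class UnicodeEscapeDemo and its methods are invented for illustration, and it works on in-memory strings instead of files. The escaping is similar in spirit to what the JDK's native2ascii tool produces.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class UnicodeEscapeDemo {
    // Replaces every char above 127 with a backslash-u escape of four hex digits.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 127) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Loads properties text the way a file would be read (ISO 8859-1 bytes)
    // and returns the value stored under key.
    static String loadValue(String propertiesText, String key) {
        Properties p = new Properties();
        try {
            p.load(new ByteArrayInputStream(propertiesText.getBytes(StandardCharsets.ISO_8859_1)));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return p.getProperty(key);
    }

    public static void main(String[] args) {
        String escaped = escape("common.app.name=\u6211\u5011");
        System.out.println(escaped); // pure ASCII: the value is now two escapes
        // Properties turns the escapes back into the original characters.
        System.out.println("\u6211\u5011".equals(loadValue(escaped, "common.app.name"))); // prints true
    }
}
```

Because the escaped file contains only ASCII, it reads the same under ISO 8859-1, Cp1252 and UTF-8, so the platform default encoding no longer matters when the properties are loaded.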