I have a text: "Csuklási roham gyötörheti a svédeket, annyit emlegetik mostanság ismét a svéd modellt Magyarországon."
There are no line breaks at all in the original text.
When I send this text by email (using Gmail), it gets encoded as follows:
Content-Type: text/plain; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable
Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket, annyit emlegetik mostans=E1g =
ism=E9t a
sv=E9d modellt Magyarorsz=E1gon.
And in HTML:
Content-Type: text/html; charset=ISO-8859-2
Content-Transfer-Encoding: quoted-printable
<span class=3D"Apple-style-span" style=3D"font-family: Helvetica, Verdana, =
sans-serif; font-size: 15px; ">Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket=
, annyit emlegetik mostans=E1g ism=E9t a sv=E9d modellt Magyarorsz=E1gon.
...
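When I try to parse the email body as text/plain, I can't get rid of the "=" sign between the two words in "mostans=E1g = ism=E9t". Notice that the same character is missing at that position in the HTML-encoded message. I have no idea what that special character could be, but I need to eliminate it to get the original text back.
I tried replacing '\n', but that is not it; if I hit Enter in the text, I can correctly replace that with any character I want. I also tried '\r' and '\t'.
So the question is: what am I missing? Where does this special character come from? Is it because of the charset and/or the transfer encoding? If so, what do I have to do to solve the problem and get the original text back?
Any help is welcome.
Cheers, Balázs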
Answer 0 (score: 3)
You need to use MimeUtility (from JavaMail) to decode the quoted-printable body. Here is an example.
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringWriter;
import java.io.Writer;
import javax.mail.MessagingException;
import javax.mail.internet.MimeUtility;

public class Mime {
    public static void main(String[] args) throws MessagingException,
            IOException {
        InputStream stringStream = new FileInputStream("mime");
        // Undo the transfer encoding; this also drops the "=" soft line breaks.
        InputStream output = MimeUtility.decode(stringStream,
                "quoted-printable");
        System.out.println(convertStreamToString(output));
    }

    public static String convertStreamToString(InputStream is)
            throws IOException {
        /*
         * To convert the InputStream to String we use the Reader.read(char[]
         * buffer) method. We iterate until the Reader returns -1, which means
         * there's no more data to read. We use the StringWriter class to
         * produce the string.
         */
        if (is == null) {
            return "";
        }
        Writer writer = new StringWriter();
        char[] buffer = new char[1024];
        try {
            // The message declares charset=ISO-8859-2, so decode with that.
            Reader reader = new BufferedReader(new InputStreamReader(is,
                    "ISO-8859-2"));
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } finally {
            is.close();
        }
        return writer.toString();
    }
}
The file 'mime' contains the encoded text:
Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket, annyit emlegetik mostans=E1g =
ism=E9t a
sv=E9d modellt Magyarorsz=E1gon.
Update
Using the Guava library (an older release that still provides InputSupplier):
InputSupplier<InputStream> supplier = new InputSupplier<InputStream>() {
    @Override
    public InputStream getInput() throws IOException {
        InputStream inStream = new FileInputStream("mime");
        try {
            return MimeUtility.decode(inStream, "quoted-printable");
        } catch (MessagingException e) {
            // Propagate the failure instead of returning a null stream.
            throw new IOException(e);
        }
    }
};
// The message declares charset=ISO-8859-2, so read the decoded bytes with it.
InputSupplier<InputStreamReader> result = CharStreams
        .newReaderSupplier(supplier, Charset.forName("ISO-8859-2"));
String ans = CharStreams.toString(result);
System.out.println(ans);
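Answer 1 (score: 2)
The transfer encoding "quoted-printable" forbids encoded lines longer than 76 characters. If the text to be encoded contains longer lines, "soft line breaks" must be inserted; each one is marked by a single "=" as the last character of the encoded line. The "=" means that the line break following it was inserted only to satisfy the 76-character limit, so both the "=" and that line break must be removed when the transfer encoding is decoded.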
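To make the soft line break rule concrete, here is a minimal hand-rolled sketch in plain Java with no JavaMail dependency. The class name QuotedPrintableSketch and its decode method are hypothetical, made up for illustration; the sketch assumes the encoded body is pure ASCII and does not attempt to cover every corner of the specification. In real code a library decoder such as MimeUtility is probably the safer choice.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only.
class QuotedPrintableSketch {

    /** Undoes quoted-printable encoding, then reads the bytes in the given charset. */
    public static String decode(String encoded, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            if (c != '=') {
                out.write(c);                       // literal ASCII character
                i++;
            } else if (encoded.startsWith("=\r\n", i)) {
                i += 3;                             // soft line break (CRLF): drop it
            } else if (encoded.startsWith("=\n", i)) {
                i += 2;                             // soft line break (bare LF): drop it
            } else if (i + 2 < encoded.length()
                    && Character.digit(encoded.charAt(i + 1), 16) >= 0
                    && Character.digit(encoded.charAt(i + 2), 16) >= 0) {
                int hi = Character.digit(encoded.charAt(i + 1), 16);
                int lo = Character.digit(encoded.charAt(i + 2), 16);
                out.write((hi << 4) | lo);          // "=XX" is one raw byte
                i += 3;
            } else {
                out.write('=');                     // malformed escape: keep as-is
                i++;
            }
        }
        return new String(out.toByteArray(), Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        String body =
            "Csukl=E1si roham gy=F6t=F6rheti a sv=E9deket, annyit emlegetik mostans=E1g =\r\n"
            + "ism=E9t a\r\n"
            + "sv=E9d modellt Magyarorsz=E1gon.";
        System.out.println(decode(body, "ISO-8859-2"));
    }
}
```

The trailing "=" after "mostans=E1g" disappears together with its line break, while line breaks that are not preceded by "=" are real ones and stay in the decoded text.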