I have an IMAP email stored in a message object, which looks like this:
message = #<Mail::Message:70152447148720, Multipart: false, Headers: <Return-Path: <apache@mail.gameseek.co.uk>>, <Received: by 10.86.68.12 with SMTP id q12cs352558fga; Mon, 9 Mar 2009 04:23:05 -0700 (PDT)>, <Received: by 10.210.137.14 with SMTP id k14mr2429643ebd.46.1236597783700; Mon, 09 Mar 2009 04:23:03 -0700 (PDT)>, <Received: from exproxy-2.exserver.dk (exproxy-2.exserver.dk [195.69.129.163]) by mx.google.com with ESMTP id 27si3500694ewy.75.2009.03.09.04.23.03; Mon, 09 Mar 2009 04:23:03 -0700 (PDT)>, <Received: by exproxy-2.exserver.dk (Postfix, from userid 65534) id DF2F6106EF3; Mon, 9 Mar 2009 12:13:26 +0100 (CET)>, <Received: from exsmtp01.exserver.dk (exsmtp01.exserver.dk [195.69.129.177]) by exproxy-2.exserver.dk (Postfix) with ESMTP id C2CEE106ED0 for <support_email.com@exfwd01.scannet.dk>; Mon, 9 Mar 2009 12:13:26 +0100 (CET)>, <Received: from exsmtp02.exserver.dk ([10.10.10.32]) by exsmtp01.exserver.dk with Microsoft SMTPSVC(6.0.3790.1830); Mon, 9 Mar 2009 12:22:19 +0100>, <Received: from front08.exserver.dk ([195.69.129.93]) by exsmtp02.exserver.dk with Microsoft SMTPSVC(6.0.3790.1830); Mon, 9 Mar 2009 12:22:19 +0100>, <Received: from localhost (front08.exserver.dk [127.0.0.1]) by front08.exserver.dk (Postfix) with ESMTP id F1B2BC4028 for <support@email.com>; Mon, 9 Mar 2009 12:46:22 +0100 (CET)>, <Received: from front08.exserver.dk ([127.0.0.1]) by localhost (front08.exserver.dk [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id mrYGo4G2pt13 for <support@email.com>; Mon, 9 Mar 2009 12:46:16 +0100 (CET)>, <Received: from mail.gameseek.co.uk (78.109.164.42.srvlist.ukfast.net [78.109.164.42]) by front08.exserver.dk (Postfix) with ESMTP id 99022C4021 for <support@email.com>; Mon, 9 Mar 2009 12:46:16 +0100 (CET)>, <Received: by mail.gameseek.co.uk (Postfix, from userid 48) id D321218DD2D; Mon, 9 Mar 2009 11:22:55 +0000 (GMT)>, <Date: Mon, 09 Mar 2009 11:22:55 +0000>, <From: myorder@gameseek.co.uk>, <Reply-To: myorder@gameseek.co.uk>, <To: 
support@email.com>, <Message-ID: <20090309112255.D321218DD2D@mail.gameseek.co.uk>>, <Subject: Gameseek Order Refunded: Gh68y1235386413>, <Delivered-To: my@email.com>, <Received-SPF: neutral (google.com: 195.69.129.163 is neither permitted nor denied by best guess record for domain of apache@mail.gameseek.co.uk) client-ip=195.69.129.163;>, <Authentication-Results: mx.google.com; spf=neutral (google.com: 195.69.129.163 is neither permitted nor denied by best guess record for domain of apache@mail.gameseek.co.uk) smtp.mail=apache@mail.gameseek.co.uk>, <X-Exserver-To: support_email.com@exfwd01.scannet.dk>, <X-Virus-Scanned: amavisd-new at exserver.dk>, <X-OriginalArrivalTime: 09 Mar 2009 11:22:19.0838 (UTC) FILETIME=[4F6005E0:01C9A0A9]>, <X-ScanNet-Forward: TTL=5>>
I now want to give its body the correct encoding:
unless message.multipart?
  charset = message.charset # => "UTF-8"
  if charset != nil
    body = message.body.decoded.force_encoding(charset).encode("UTF-8") # => "\n\nHello you,\n\nYour order or part of it has been refunded by Gameseek. The refund will be present on the same payment method you used when purchasing. If no other items are due to be posted to you the postage charge will also be refunded.\n\nPlease allow upto four working days for this refund to process.\n\nIf you have not contacted us about this order then it is most likely you are being refunded for an item we cannot currently get hold of.\n\nWe do apologise if this is the case, we would rather refund customers rather than having them wait weeks and weeks for an item.\n\nIf you have contacted us about this order then you will know why you are being refunded.\nMay we apologise if we have not met your requirements on this occassion.\n\nYour Order: Product | Category | Quantity | Cost\n---------------------------------------------------\nDragon Ball Z - Supersonic Warriors 2 | NintendoDS | 1 | \xA326.97\n\n\nFor all order enquires please contact myorder@gameseek.co.uk\n\nThank you for using Gameseek.\n"
  end
end
body = body.split(/Sent from my iPhone/)[0]
The last line raises the following error:
invalid byte sequence in UTF-8
Any idea how to solve this?
Answer 0 (score: 0)
The text contains the byte \xA3, which is not a valid UTF-8 sequence. In Latin-1 (ISO-8859-1) that byte is the pound sign:
"\xA3".force_encoding('ISO-8859-1').encode('UTF-8')
#=> "£"
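To see why the code in the question does not catch this earlier, here is a minimal sketch. The sample string is a stand-in for message.body.decoded and is an assumption, not the real mail body. It illustrates that force_encoding only relabels the bytes without checking them, and that encoding from UTF-8 to UTF-8 leaves the bytes untouched, so the invalid byte survives until a later string operation trips over it.

```ruby
# Hypothetical stand-in for message.body.decoded: raw bytes, pound sign in Latin-1.
raw = "\xA326.97".b

# Relabel the bytes as UTF-8, as the question's code does with the declared charset.
as_utf8 = raw.dup.force_encoding("UTF-8")
utf8_ok = as_utf8.valid_encoding?                        # the label is wrong for these bytes

# Encoding UTF-8 to UTF-8 does not validate or change anything.
same_bytes = as_utf8.encode("UTF-8").bytes == raw.bytes

# Relabelled as Latin-1 instead, the bytes transcode cleanly to UTF-8.
as_latin1 = raw.dup.force_encoding("ISO-8859-1").encode("UTF-8")
latin1_ok = as_latin1.valid_encoding?
```

Checking valid_encoding? right after force_encoding is a cheap way to detect a wrong charset label before any regular expression or split is applied.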
A quick fix is to clean the invalid byte sequences out of body with String#scrub, but with an empty replacement this simply deletes them:
"\xA326.97".scrub('')
#=> "26.97"
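If silently dropping the byte is not acceptable, scrub can leave a visible marker instead. The sketch below only uses the replacement-string argument of String#scrub; the sample literal is the same illustrative price string as above.

```ruby
broken = "\xA326.97"            # UTF-8 string literal containing one invalid byte

removed = broken.scrub("")      # invalid byte deleted
marked  = broken.scrub("?")     # invalid byte replaced with a chosen marker
default = broken.scrub          # no argument: U+FFFD replacement character is used
```

Either marker makes the data loss visible in the output, but the original pound sign is still gone.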
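However, to fix the "real" problem you should look further up the pipeline. The charset supplied with the message appears to be wrong: the body is evidently encoded in Latin-1, even though charset claims otherwise. The problem may well lie with the sender.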
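When the sender cannot be fixed, one possible workaround is to trust the declared charset only if the bytes are actually valid in it, and otherwise fall back to Latin-1. The following is a rough sketch under that assumption; decode_to_utf8 and its fallback parameter are hypothetical names, not part of the Mail gem.

```ruby
# Hypothetical helper: decode raw bytes to UTF-8, distrusting a wrong charset label.
def decode_to_utf8(raw, declared_charset, fallback: "ISO-8859-1")
  # Try the declared charset first (or the fallback if none was declared).
  text = raw.dup.force_encoding(declared_charset || fallback)
  # If the bytes are not valid in that charset, relabel them with the fallback.
  text = raw.dup.force_encoding(fallback) unless text.valid_encoding?
  # Transcode to UTF-8, replacing anything that still cannot be converted.
  text.encode("UTF-8", invalid: :replace, undef: :replace)
end

# Sample bytes standing in for message.body.decoded, labelled "UTF-8" by the sender.
body  = decode_to_utf8("| 1 | \xA326.97\n".b, "UTF-8")
parts = body.split(/Sent from my iPhone/)
```

In the question's code this would presumably be called as decode_to_utf8(message.body.decoded, message.charset). Note that every byte sequence is valid Latin-1, so the fallback never fails, but it will produce wrong characters if the real encoding is something else.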