Python SUDS unicode decode error returned from web service

Time: 2012-07-05 17:15:32

Tags: python suds

I am trying to use a web service created by our developers that lets us upload files into the system, within certain restrictions.

Using SUDS, I get the following information:

Suds ( https://fedorahosted.org/suds/ ) version: 0.4 GA  build: R699-20100913

Service ( ConnectToEFS ) tns="http://tempuri.org/"
   Prefixes (3)
      ns0 = "http://schemas.microsoft.com/2003/10/Serialization/"
      ns1 = "http://schemas.microsoft.com/Message"
      ns2 = "http://tempuri.org/"
   Ports (1):
      (BasicHttpBinding_IConnectToEFS)
         Methods (2):
            CreateContentFolder(xs:string FileCode, xs:string FolderName, xs:string ContentType, xs:string MetaDataXML, )
            UploadFile(ns1:StreamBody FileByteStream, )
         Types (4):
            ns1:StreamBody
            ns0:char
            ns0:duration
            ns0:guid

My method for calling UploadFile is as follows:

def webserviceUploadFile(self, targetLocation, fileName, fileSource):
    fileSource = './test_files/' + fileSource
    ntlm = WindowsHttpAuthenticated(username=uname, password=upass)
    client = Client(webservice_url, transport=ntlm)
    client.set_options(soapheaders={'TargetLocation':targetLocation, 'FileName': fileName})
    body = client.factory.create('AIRDocument')
    body_file = open(fileSource, 'rb')
    body_data = body_file.read()
    body.FileByteStream = body_data
    return client.service.UploadFile(body)

Running it gives the following result:

Traceback (most recent call last):
  File "test_cases.py", line 639, in test_upload_file_invalid_extension
    result_string = self.HM.webserviceUploadFile('9999', 'AD-1234-5424__44.exe',
 'test_data.pdf')
  File "test_cases.py", line 81, in webserviceUploadFile
    return client.service.UploadFile(body)
  File "build\bdist.win32\egg\suds\client.py", line 542, in __call__
    return client.invoke(args, kwargs)
  File "build\bdist.win32\egg\suds\client.py", line 595, in invoke
    soapenv = binding.get_message(self.method, args, kwargs)
  File "build\bdist.win32\egg\suds\bindings\binding.py", line 120, in get_message
    content = self.bodycontent(method, args, kwargs)
  File "build\bdist.win32\egg\suds\bindings\document.py", line 63, in bodycontent
    p = self.mkparam(method, pd, value)
  File "build\bdist.win32\egg\suds\bindings\document.py", line 105, in mkparam
    return Binding.mkparam(self, method, pdef, object)
  File "build\bdist.win32\egg\suds\bindings\binding.py", line 287, in mkparam
    return marshaller.process(content)
  File "build\bdist.win32\egg\suds\mx\core.py", line 62, in process
    self.append(document, content)
  File "build\bdist.win32\egg\suds\mx\core.py", line 75, in append
    self.appender.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 102, in append
    appender.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 243, in append
    Appender.append(self, child, cont)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 182, in append
    self.marshaller.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\core.py", line 75, in append
    self.appender.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 102, in append
    appender.append(parent, content)
  File "build\bdist.win32\egg\suds\mx\appender.py", line 198, in append
    child.setText(tostr(content.value))
  File "build\bdist.win32\egg\suds\sax\element.py", line 251, in setText
    self.text = Text(value)
  File "build\bdist.win32\egg\suds\sax\text.py", line 43, in __new__
    result = super(Text, cls).__new__(cls, *args, **kwargs)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 10: ordinal
 not in range(128)

After a lot of research and talking with the web service developer, I changed body_data = body_file.read() to body_data = body_file.read().decode("utf-8"), which got me this error:

Traceback (most recent call last):
  File "test_cases.py", line 639, in test_upload_file_invalid_extension
    result_string = self.HM.webserviceUploadFile('9999', 'AD-1234-5424__44.exe', 'test_data.pdf')
  File "test_cases.py", line 79, in webserviceUploadFile
    body_data = body_file.read().decode("utf-8")
  File "C:\python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe2 in position 10: invalid
continuation byte

That was not much help.

After more research into the problem, I tried adding errors='ignore' to the UTF-8 decode, and this was the result:

<TransactionDescription>Error in INTL-CONF_France_PROJ_MA_126807.docx: An exception has been thrown when reading the stream.. Inner Exception: System.Xml.XmlException: The byte 0x03 is not valid at this location.  Line 1, position 318.
    at System.Xml.XmlExceptionHelper.ThrowXmlException(XmlDictionaryReader reader, String res, String arg1, String arg2, String arg3)
    at System.Xml.XmlUTF8TextReader.Read()
    at System.ServiceModel.Dispatcher.StreamFormatter.MessageBodyStream.Exhaust(XmlDictionaryReader reader)
    at System.ServiceModel.Dispatcher.StreamFormatter.MessageBodyStream.Read(Byte[] buffer, Int32 offset, Int32 count). Source: System.ServiceModel</TransactionDescription>

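This left me even more stumped. Based on the stack trace returned by the web service, it looks like it wants UTF-8, but I can't seem to get the data to the web service without Python or SUDS raising an error, or without ignoring problems in the encoding. The system I'm working on only accepts Microsoft Office file types (doc, xls, and so on), PDFs, and TXT files, so using a format whose encoding I have more control over is not an option. I also tried detecting the encoding used by a sample PDF and a sample DOCX, but the suggested encodings (Latin-1, ISO8859-x, and several windows XXXX) were all accepted by Python and SUDS, but not by the web service.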
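The three decoding attempts described above can be reproduced with a minimal sketch. The byte string here is made up for illustration (it is not the real file content); it simply contains the 0xe2 and 0x03 bytes that appear in the errors. One plausible reading, offered as an assumption rather than a confirmed diagnosis, is that PDF and Office files are binary data, so no text encoding represents them faithfully inside an XML text node.

```python
# Made-up binary payload containing the bytes from the errors (0xe2, 0x03).
sample = b"PK\x03\x04\xe2(\x00"

# 1. Strict UTF-8 decoding fails: 0xe2 starts a multi-byte sequence,
#    but the next byte is not a valid continuation byte.
try:
    sample.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# 2. errors="ignore" silently drops the offending byte, so the payload
#    no longer matches the file on disk.
ignored = sample.decode("utf-8", errors="ignore")

# 3. Latin-1 maps every byte to a code point, so it never fails, but
#    control characters such as U+0003 remain in the text. XML 1.0 does
#    not allow that character, which may be why the service rejects it.
latin = sample.decode("latin-1")

print(strict_ok)                          # whether strict decoding worked
print(len(sample), len(ignored))          # byte count vs. surviving characters
print(latin.encode("latin-1") == sample)  # Latin-1 round trip
print("\x03" in latin)                    # control character still present
```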

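Also note that in the examples shown, the test referenced is most often the one for an invalid extension. This error occurs even for the tests that should upload successfully, which is the only time the last stack trace actually shows up.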

1 Answer:

Answer 0: (score: 2)

You can use base64.b64encode(body_file.read()), which returns the file contents as a Base64 string. Your request variable must therefore be a string.
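A minimal sketch of this suggestion follows. It assumes the service will accept Base64 text in FileByteStream and decode it on its side, which the WSDL output in the question does not confirm. The helper names to_base64_text and read_file_as_base64 are hypothetical, and the sample bytes are made up.

```python
import base64


def to_base64_text(raw_bytes):
    """Encode raw bytes as ASCII-only Base64 text (hypothetical helper)."""
    return base64.b64encode(raw_bytes).decode("ascii")


def read_file_as_base64(path):
    """Read a file in binary mode and return its contents as Base64 text."""
    with open(path, "rb") as body_file:
        return to_base64_text(body_file.read())


# Made-up bytes containing the values from the errors above (0xe2, 0x03).
sample = b"PK\x03\x04\xe2(\x00"
encoded = to_base64_text(sample)

# Decoding on the receiving side restores the original bytes exactly.
restored = base64.b64decode(encoded)

print(encoded)
print(restored == sample)
```

In the question's method, this would mean assigning read_file_as_base64(fileSource) to body.FileByteStream instead of the raw result of body_file.read(). Because Base64 output contains only ASCII characters, the implicit ASCII decode inside SUDS should no longer fail.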