Question

From some external module, I receive an array with unicode symbols, and I can't convert it to a string.

   print(data)
   print(type(data))
   print(type(data[0]))
   print(len(data[0]))

Output:

   array('B', [99, 100, 99, 100, 99, 100, 99, 100, 99])
   <type 'unicode'>
   <type 'unicode'>
   1

So I just need to get the string 'cdcdcdcdc', but type(data[0]) is unicode, even though 'B' (int) is shown. All my tries ended in errors, or I got the same array (not a string) when printing.

upd: I tried

   rets=''
   for i in xrange(len(data)):
       rets += chr(int(ord( data[i] )))
   print( rets )

Answer 1

The output shows that data is actually this unicode string:

data = u"array('B', [99, 100, 99, 100, 99, 100, 99, 100, 99])"

(weird, but it is the only possibility)
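A quick way to check this is to compare a real array with a string that merely looks like one. The sketch below uses made-up sample values (`real` and `fake` are illustrative names, not from the question) and should run on both Python 2 and Python 3:

```python
from array import array

real = array('B', [99, 100])        # an actual array of unsigned bytes
fake = u"array('B', [99, 100])"     # a string that only looks like one

# print() shows the same text for both, which is what hides the problem
print(real)
print(fake)

# Indexing tells them apart: an int for the array, a 1-character string otherwise
print(type(real[0]))
print(type(fake[0]))
print(len(fake[0]))
```

Printing repr(data) instead of data would also expose the surrounding quotes immediately.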

ast.literal_eval cannot handle arrays, so you have to use the evil eval to turn it into a real array:

from array import array
arr = eval(str(data))

If the data may come from an external source, never do that: evaluating uncontrolled data can allow arbitrary code execution. CAVEAT EMPTOR

But once that is done, arr is a proper array of unsigned chars. You can easily turn it into a string:

''.join([chr(i) for i in arr])

Anyway, the above is only a workaround. The real solution is to find out how such a weird string gets generated and fix it there.
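If the string cannot be fixed at its source and may be untrusted, one possible way to avoid eval is to pull out the bracketed list and hand only that part to ast.literal_eval, which accepts plain lists. This is a minimal sketch, not part of the original answer: parse_array_repr is a hypothetical helper, and it assumes the text always has the exact form array('X', [...]).

```python
import ast
import re
from array import array


def parse_array_repr(text):
    """Rebuild an array from text like "array('B', [99, 100])" without eval."""
    match = re.match(r"^array\('(\w)',\s*(\[.*\])\)$", text.strip())
    if match is None:
        raise ValueError("not an array repr: %r" % (text,))
    typecode, items = match.group(1), match.group(2)
    # literal_eval only accepts Python literals, so it cannot run arbitrary code
    values = ast.literal_eval(items)
    # str() keeps the typecode a native string on Python 2 as well
    return array(str(typecode), values)


data = u"array('B', [99, 100, 99, 100, 99, 100, 99, 100, 99])"
arr = parse_array_repr(data)
result = ''.join(chr(i) for i in arr)
print(result)
```

Anything that does not match the expected shape raises ValueError instead of being executed.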

Answer 2

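You can use eval to convert the string into a Python object, in this case a list (unicode can be handled like a string in many cases, so eval works on unicode too). If data[1] is already a list, just remove eval from the print statement.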

Then use map with chr to convert each int into a character. The result can then be made into a string with "".join:

 print ("".join(map(chr,eval(data[1]))))

Output:

 "cdcdcdcdc"

Edit: try these statements one by one to see where it goes wrong

print (eval(data[1])) #check if eval works
print (type(eval(data[1]))) #check if the type that eval returns is a list
print (type(map(chr,eval(data[1]))[0])) #check if the items in the list are strings
print (map(chr,eval(data[1]))) #check if you get a list of strings

Convert a unicode byte array to a string
